
International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 12 Issue: 10 | Oct 2025

p-ISSN: 2395-0072

www.irjet.net

A Vision-Based Framework For Static ISL Gesture Recognition in Human-Machine Interaction

Miss. Aparna Patil1, Dr. Kishor Pandyaji2, Prof. Manoj Chavan3

1Student, 2Professor, 3Professor

Department of Electronics Engineering, Dr. V.P.S.S.M's Padmabhooshan Vasantraodada Patil Institute of Technology, Budhgaon, Sangli, Maharashtra 416304, India.

Abstract - Bridging the communication divide for individuals with hearing and speech impairments requires innovative technological solutions. This research introduces a cost-effective, vision-based framework designed to interpret static gestures from Indian Sign Language (ISL) and convert them into textual and auditory outputs. The proposed system circumvents the need for high-end hardware by utilizing a conventional webcam for visual input. The core processing involves a multi-stage image analysis strategy: hand region isolation through skin color segmentation in the YCrCb color space, image binarization via Otsu's method, and feature characterization using centroid-radial distance metrics and the Convex Hull technique. A Learning Vector Quantization (LVQ) neural network is employed for classification, selected for its proficiency in managing complex pattern recognition tasks with overlapping class distributions. The model was trained on a custom dataset encompassing all 26 ISL alphabets and achieved reliable real-time performance. The system delivers a dual-mode output, displaying recognized text and generating synthetic speech, thereby facilitating bidirectional communication. This work underscores the viability of a software-driven approach to develop accessible assistive technologies that enhance social participation for the deaf and mute community.

Key Words: Indian Sign Language (ISL), Static Gesture Recognition, Convex Hull, Learning Vector Quantization (LVQ), Image Analysis, Human-Machine Interaction (HMI), Assistive Technology.

1. INTRODUCTION

Non-verbal communication through hand gestures offers a rich and intuitive channel for human interaction, serving as a critical medium for individuals with hearing and speech disabilities. The automation of gesture interpretation holds immense potential for developing assistive technologies that can significantly improve quality of life and social integration. In India, where millions are affected by hearing loss, such systems can act as a vital conduit to the broader society, reducing dependency on human interpreters and fostering independence.

Research in automated sign language recognition has historically followed two paths: device-based and vision-based approaches. Device-based methods, such as instrumented gloves, offer high precision but are often costly, cumbersome, and socially stigmatizing. Conversely, vision-based techniques provide a non-contact, user-friendly, and economically feasible alternative by using cameras to capture and analyze gestures. This paper aligns with the vision-based paradigm, presenting a dedicated system for recognizing static ISL gestures.

The proposed framework is built on a multi-stage image processing workflow. It begins with image capture via a standard webcam, followed by robust hand segmentation in the YCrCb color space to mitigate lighting variations. Key shape features are then extracted using the Convex Hull algorithm and radial distance calculations, forming a descriptive feature vector. This vector is classified by an LVQ neural network, chosen for its effectiveness in scenarios with complex decision boundaries. The final output is presented as both on-screen text and computer-generated speech.

The principal contribution of this work is the development and implementation of a practical, software-centric tool that accurately deciphers ISL alphabets. By providing a seamless translation from gesture to speech and text, this system aims to empower users, thereby promoting inclusivity and autonomy. Subsequent sections discuss prior research, the detailed methodology, experimental outcomes, and concluding remarks.

2. LITERATURE SURVEY

The domain of sign language recognition has evolved significantly, with a clear shift from hardware-dependent to vision-based solutions. Early systems leveraged data gloves [4, 5], which, despite their accuracy, were impractical for daily use due to cost and obtrusiveness. The advent of vision-based methods marked a pivotal turn, focusing on using cameras for a more natural user experience.

A critical step in vision-based recognition is the reliable segmentation of the hand from the background. Researchers have experimented with various color models like RGB, HSV, and YCbCr to create robust skin color models for this purpose [6, 7]. For converting the segmented region into a binary image, Otsu's global thresholding technique [8] has been widely adopted for its efficiency, and it is integrated into our pipeline.

Feature extraction is paramount for distinguishing between different gestures. Techniques such as Principal

Impact Factor value: 8.315

ISO 9001:2008 Certified Journal

Page 777