International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 10 Issue: 04 | Apr 2023
p-ISSN: 2395-0072
www.irjet.net
Indian Sign Language Recognition using Vision Transformer based Convolutional Neural Network
Sunil G. Deshmukh1, Shekhar M. Jagade2
1Department of Electronics and Computer Engineering, Maharashtra Institute of Technology, Aurangabad 431010, Maharashtra, India
2Department of Electronics and Telecommunication, Sri N B Navale Sinhgad College of Engineering, Solapur-413255, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Hand gesture recognition (HGR) is a popular issue in the areas of learning algorithms and visual recognition. Certain Human-Computer Interaction technologies also need HGR. Traditional machine learning techniques and intricate convolutional neural networks (CNN) have been applied to HGR so far. Although these approaches work adequately well on HGR, we employed a more modern model, the vision transformer, in this research. The Vision Transformer (ViT) was created to enhance CNN's performance. ViT has a strong similarity to CNN, but its classification work utilizes distinct layers. The ViT was applied to gesture datasets via learning algorithms. In trials, a test dataset is evaluated under a cross-validation test strategy, and classification accuracy is employed as the performance metric. According to the test findings, the presented framework attains an accuracy of 99.88% on the image database used, which is considerably higher than the state-of-the-art. The ablation study also supports the claim that the convolutional encoding increases accuracy on HGR.
Key Words: Convolutional neural network, Vision transformer (ViT), Transfer learning, Training of images, Accuracy, Hand gesture, Human-computer interaction (HCI).
1. INTRODUCTION
Direct contact is becoming the most popular way for users and machines to communicate. People connect with one another naturally and intuitively through contactless techniques, including sound as well as body actions. The versatility and efficacy of these contactless communication techniques have motivated several researchers to adopt them to further HCI. Gestures are a significant part of human language and an essential non-contact communication technique. Wearable data gloves were frequently used in the past to capture the positions and angles of each of the user's joints as they moved. The complexity and price of a worn sensor have limited the extensive application of such a technology. Gesture recognition refers to a computer's ability to understand gestures and execute specified instructions in response to such motions. The main goal of HGR is to establish a system that can identify, analyze, and communicate information based on specific motions [1].
Techniques for recognizing gestures on the basis of contactless visual inspection are now prominent, because of their accessibility and price. Hand gestures are an expressive communication technique employed in the healthcare, entertainment, and educational sectors of the economy, as well as to assist people with special needs and the elderly. For the purpose of identifying hand gestures, hand tracking, which combines a number of computer vision operations such as hand segmentation, detection, and tracking, is crucial. Hand gestures are used in sign language to convey information or emotions to those who have hearing loss. The major problem is that the typical person may easily misinterpret the message. AI and computer vision developments may be utilized to identify and comprehend sign language. With the assistance of modern technology, the typical person may learn to recognize sign language. This article introduces a deep learning-based technique for hand gesture identification. In this regard, operating a system remotely necessitates the use of gestures. The devices record human motions and recognize them as the ones that are used to control them. The movements employ a variety of modes, including static and dynamic modes. The static gestures are held in place, while the dynamic gestures shift to various areas when the machine is being controlled. So, rather than using static gestures, it is essential to identify or recognize dynamic motions. The camera that is connected to the apparatus initially captures people's movements [2].
When the background of any motions that were identified is removed, the gesture's foreground is gathered. To locate and eliminate the noise in the foreground gesture, filtering techniques are applied. These noise-removed gestures are compared to pre-stored and taught movements in order to verify the meaning of the gestures. The automotive and consumer electronics sectors utilize a gesture-based machine operating system that doesn't require any human input. Static and dynamic gestures, as well as online and offline actions, can all be classified as human gestures. The machine's icons can be changed using offline gestures, but
© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 1154