International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 13 Issue: 02 | Feb 2026
p-ISSN: 2395-0072
www.irjet.net
Lyra: A Multimodal Intelligent Desktop Interaction System
Siya Vaity1, Prachi Das2, Shrot Maurya3, Manthan Bid4
1 Artificial Intelligence and Machine Learning, Viva Institute Of Technology
2 Artificial Intelligence and Machine Learning, Viva Institute Of Technology
3 Artificial Intelligence and Machine Learning, Viva Institute Of Technology
4 Artificial Intelligence and Machine Learning, Viva Institute Of Technology
---------------------------------------------------------------------***--------------------------------------------------------------------
Abstract - Human-computer interaction is moving toward more natural, touchless interfaces. This paper describes Lyra, an intelligent desktop assistant that combines voice recognition with real-time hand gesture control so that a computer can be operated without physical contact with input devices. Users can open applications, browse the web, handle files, type, manage windows, and perform mouse actions through voice commands and gestures detected by a webcam. Lyra combines speech-to-speech processing, command interpretation, computer vision, and system automation into an effective and convenient interaction model. The proposed system aims to improve productivity, improve accessibility for users with physical disabilities, and support hygienic touchless computing. Real-world experimental use showed smooth cursor control and command recognition with low system response time.

Key Words: Voice Assistant, Gesture Recognition, Human-Computer Interaction, Computer Vision, Automation, AI Assistant

1. INTRODUCTION
Traditional computer interfaces rely on keyboards and pointing devices. These tools are effective, but they constrain natural communication and may not suit accessibility-oriented or touchless environments. Recent developments in Artificial Intelligence (AI), speech recognition, and computer vision have enabled more intuitive interaction mechanisms. This paper presents Lyra, a multimodal AI assistant that integrates voice commands with hand gestures to control a desktop system. The aim is to offer a seamless, intelligent, and natural interface that reduces physical dependence on hardware input devices.

The key contributions of this work are:
• Desktop automation through voice.
• A real-time hand gesture system for mouse and window control.
• Integration of the speech and vision modules into a single assistant.
• Safety and stability mechanisms to avoid unintended system actions.

1.1 Voice Assistant Module
This module handles speech recognition and system automation. Components:
• Speech recognition engine.
• Command matching and interpretation unit.
• Web and application launcher.
• File and folder search module.
• Automated typing controller.
• Filter that prevents interference with critical system functions.

1.2 Gesture Control Module
Hand tracking and gesture recognition. This module is a camera-based application that tracks hands and recognizes gestures. Components:
• Webcam capture system.
• Hand landmark identification algorithm.
• Gesture classification logic.
• WebSocket communication layer.
• System action executor written in Python.

Both modules communicate with the operating system to run commands on the fly.

2. Voice Assistant Functionalities
2.1 Wake and Sleep Mechanism
Lyra starts on preprogrammed wake words and goes to sleep after a period of inactivity, which minimizes resource consumption.

2.2 Website and Application Control
Users can open installed applications and websites through natural voice commands.

© 2026, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal | Page 669
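The command matching unit and the critical-system filter listed for the Voice Assistant Module (Section 1.1) could be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: the intent names, the command prefixes, and the blocked words are all hypothetical, since the paper does not list them. The filter compares whole words so that a harmless phrase such as "type information" is not rejected merely because it contains the letters of a blocked word.

```python
from dataclasses import dataclass

# Hypothetical blocklist: whole words the safety filter refuses to act on.
BLOCKED_WORDS = {"shutdown", "format", "reboot"}

# Hypothetical spoken prefixes mapped to intent names.
INTENT_PREFIXES = {
    "open": "open_target",
    "search for": "search_files",
    "type": "type_text",
}


@dataclass(frozen=True)
class Command:
    intent: str
    argument: str


def interpret(transcript: str) -> Command:
    """Map a recognized transcript to an intent, applying the safety filter first."""
    # Normalize case and collapse repeated whitespace.
    text = " ".join(transcript.lower().split())

    # Safety filter: reject transcripts containing a blocked whole word.
    if BLOCKED_WORDS & set(text.split()):
        return Command("blocked", text)

    # Command matching: the first prefix that matches decides the intent.
    for prefix, intent in INTENT_PREFIXES.items():
        if text == prefix or text.startswith(prefix + " "):
            return Command(intent, text[len(prefix):].strip())

    return Command("unknown", text)
```

In such a design the filter runs before any matching, so a blocked transcript never reaches the launcher, file search, or typing controller.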
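The gesture classification logic of the Gesture Control Module (Section 1.2) is not specified in the paper. One common rule-based approach, shown here purely as an assumption, decides which fingers are raised by comparing fingertip and middle-joint heights and then looks up the resulting pattern in a table. The sketch assumes a 21-point hand landmark layout with the index conventions used by MediaPipe Hands (fingertips at 8, 12, 16, 20 and PIP joints at 6, 10, 14, 18) and image coordinates in which y grows downward. The gesture-to-action table and the `synthetic_hand` helper are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # normalized (x, y); y grows downward (image coordinates)

# (fingertip, PIP joint) landmark indices, assuming a MediaPipe-style 21-point layout.
FINGER_JOINTS: Dict[str, Tuple[int, int]] = {
    "index": (8, 6),
    "middle": (12, 10),
    "ring": (16, 14),
    "little": (20, 18),
}

# Hypothetical mapping from raised-finger patterns to system actions.
GESTURE_ACTIONS: Dict[Tuple[str, ...], str] = {
    ("index",): "move_cursor",
    ("index", "middle"): "left_click",
    ("index", "middle", "ring", "little"): "pause_tracking",
}


def extended_fingers(landmarks: Sequence[Point]) -> List[str]:
    """Return the fingers whose tip lies above (smaller y than) their PIP joint."""
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    return [
        name
        for name, (tip, pip) in FINGER_JOINTS.items()
        if landmarks[tip][1] < landmarks[pip][1]
    ]


def classify(landmarks: Sequence[Point]) -> str:
    """Look up the raised-finger pattern; unknown patterns trigger no action."""
    return GESTURE_ACTIONS.get(tuple(extended_fingers(landmarks)), "no_action")


def synthetic_hand(raised: Sequence[str]) -> List[Point]:
    """Build a fake hand for experimentation: only the named fingers are raised."""
    points: List[Point] = [(0.5, 0.5)] * 21
    for name in raised:
        tip, _ = FINGER_JOINTS[name]
        points[tip] = (0.5, 0.2)  # move the fingertip above its PIP joint
    return points
```

Returning "no_action" for unrecognized patterns is one simple way to support the stated goal of avoiding unintended system actions.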
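The wake and sleep mechanism of Section 2.1 can be viewed as a small two-state controller. The sketch below is a minimal illustration under stated assumptions: the wake word "lyra", the 30-second idle timeout, and the class name `WakeSleepController` are hypothetical, as the paper does not give the actual wake words or timeout. The clock is injectable so that the behaviour can be checked without waiting in real time.

```python
import time
from typing import Callable, Optional, Sequence


class WakeSleepController:
    """Ignore transcripts while asleep; wake on a wake word; sleep after idling."""

    def __init__(
        self,
        wake_words: Sequence[str] = ("lyra",),   # hypothetical wake word
        idle_timeout: float = 30.0,              # hypothetical timeout in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.wake_words = tuple(word.lower() for word in wake_words)
        self.idle_timeout = idle_timeout
        self.clock = clock
        self.awake = False
        self.last_activity = clock()

    def handle(self, transcript: str) -> Optional[str]:
        """Return the normalized command if awake, otherwise None."""
        now = self.clock()

        # Fall asleep if no command arrived within the idle timeout.
        if self.awake and now - self.last_activity > self.idle_timeout:
            self.awake = False

        text = " ".join(transcript.lower().split())

        if not self.awake:
            # While asleep, only a wake word has any effect.
            if any(word in text.split() for word in self.wake_words):
                self.awake = True
                self.last_activity = now
            return None

        # Awake: record the activity and pass the command on.
        self.last_activity = now
        return text
```

While the controller is asleep, only wake-word detection needs to run, which is how a mechanism of this kind reduces resource use during idle periods.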