International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 12 Issue: 10 | Oct 2025
p-ISSN: 2395-0072
www.irjet.net
Real-Time Voice-Based Emotion Recognition with Bhagavad Gita Mentoring
Shubham B. Khaire1, Amol M. Chinchole2, Jayesh M. Borse3, Prafulla P. Chaudhari4
1Mr. Shubham Bhausaheb Khaire, Student, MCA, MET Institute of Management, Nashik, Maharashtra, India
2Mr. Amol Madhavrao Chinchole, Student, MCA, MET Institute of Management, Nashik, Maharashtra, India
3Mr. Jayesh Mohanlal Borse, Student, MCA, MET Institute of Management, Nashik, Maharashtra, India
4Mr. Prafulla Prakash Chaudhari, Assistant Professor, Department, MCA, MET Institute of Management, Nashik, Maharashtra
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - With the rapid evolution of artificial intelligence (AI) and its integration into mobile technologies, human-computer interaction has undergone a transformative shift. Speech Emotion Recognition (SER) enables automatic identification of a speaker's emotional state from voice signals, offering significant potential for mental health interventions, personalized feedback, and enhanced user experiences. Accurately detecting emotions such as happiness, sadness, and anger is critical for addressing psychological well-being in real-time applications. While deep learning models have demonstrated high accuracy, traditional machine learning methods, such as Random Forest, provide robust, interpretable, and computationally efficient alternatives, particularly for resource-constrained mobile devices. This paper proposes a novel real-time, mobile-optimized SER system that uses Mel-Frequency Cepstral Coefficients (MFCCs) for feature extraction and a Random Forest Classifier to recognize happiness, sadness, or anger. Upon emotion detection, the system delivers motivational verses from the Bhagavad Gita through a Text-to-Speech (TTS) engine, creating a culturally resonant mentoring experience. Trained on a combined dataset of CREMA-D, RAVDESS, and EmoDB, the model achieves 78–85% accuracy under experimental conditions. By integrating spiritual counseling with AI-driven emotion recognition, this system pioneers a culturally relevant approach to digital mental wellness, offering scalable, accessible, and empathetic support for emotional health in diverse cultural contexts. Furthermore, incorporating ancient wisdom such as the Bhagavad Gita not only enriches the feedback mechanism but also promotes long-term emotional resilience by encouraging users to reflect on timeless philosophical principles amid modern stressors.

Key Words: 1. Speech Emotion Recognition, 2. Random Forest Classifier, 3. Mel-Frequency Cepstral Coefficients, 4. Bhagavad Gita, 6. Real-Time Systems, 7. Mobile Applications

NOMENCLATURE
List of Abbreviations
AI         Artificial Intelligence
SER        Speech Emotion Recognition
MFCC       Mel-Frequency Cepstral Coefficients
TTS        Text-to-Speech
CNN        Convolutional Neural Network
RNN        Recurrent Neural Network
LSTM       Long Short-Term Memory
SVM        Support Vector Machine
GMM        Gaussian Mixture Model
HMM        Hidden Markov Model
CREMA-D    Crowd-sourced Emotional Multimodal Actors Dataset
RAVDESS    Ryerson Audio-Visual Database of Emotional Speech and Song
EmoDB      Berlin Emotional Speech Database
API        Application Programming Interface
SPA        Single-Page Application
CSS        Cascading Style Sheets
MP3        MPEG-1 Audio Layer 3

1. INTRODUCTION
The rapid adoption of smartphones and advancements in voice-based technologies have revolutionized human-computer interaction, enabling seamless and intuitive communication. Real-time detection of a speaker's emotional state through voice signals has emerged as a critical research area, with applications in mental health monitoring, timely emotional support, and personalized user experiences [1]. Speech Emotion Recognition (SER) is increasingly vital in domains such as virtual assistants, customer support, educational tools, and healthcare systems, where understanding emotional cues can enhance responsiveness and user satisfaction [2]. Real-time emotion detection addresses key aspects of psychological well-being, such as reducing stress, managing anger, and uplifting mood, thereby supporting mental health in an increasingly digital world [3].

The societal need for accessible mental health solutions has grown exponentially, particularly in culturally diverse regions where traditional counseling may not be widely available, affordable, or socially accepted due to stigma [14]. In such contexts, integrating spiritual or philosophical guidance, such as verses from the Bhagavad Gita, a sacred Hindu text offering profound insights on duty, self-control,

© 2025, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal | Page 193