International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 12 | Dec 2022 www.irjet.net p-ISSN: 2395-0072
Satvik Gurjar1, Chetna Patil2, Ritesh Suryawanshi3, Madhura Adadande4, Ashwin Khore5, Noshir Tarapore6
1,2,3,4,5 LY B. Tech Computer Engineering, Science & Technology, Vishwakarma University, Pune, India – 411048
6 Assistant Professor, Dept. of Computer Engineering, Vishwakarma University, Pune, India – 411048
Abstract – Mental health problems are one of the major healthcare concerns of the 21st century. One of the main reasons behind this problem is a lack of awareness among the general public. Our aim with this paper is to help people realize that they might be suffering from a mental health problem such as depression, anxiety, PTSD or insomnia by making them aware of their symptoms using machine learning. To apply the machine learning algorithms, data was collected from individuals of varied ages, professions, sexes and lifestyles through survey forms consisting of questions that psychologists often use to understand a patient's problem in detail.
We believe that implementing such a system could help prevent a potential "mental health epidemic" and give people easy access to diagnosis.
According to World Health Organization data, India has 0.75 psychologists and psychiatrists per 100,000 people, whereas Argentina, the world leader in this respect, has 106 psychologists per 100,000 people. To overcome this potential epidemic of mental illness, the government has to take strong and necessary steps towards healthcare, including providing a sufficient budget for mental health.
To diagnose a patient's problem, the doctor may ask the patient to fill out a questionnaire. The nature of these questions could be situational and objective. In this paper we try to predict four problems: depression, anxiety, PTSD and insomnia.
Key Words: MENTAL HEALTH PREDICTION, MACHINE LEARNING ALGORITHMS, DEPRESSION, ANXIETY, PTSD, INSOMNIA.
Mental health problems are not new to mankind. References to mental illness can be seen throughout history, as early as the 5th century BC. But in the modern world the problem is more common. According to government statistical data, out of the whole population of India, 130 million people could be suffering from some kind of mental illness. The main reasons behind such a large number are our crumbling healthcare system and the lack of adequate government support for this issue. In India the topic of mental health is still considered taboo, which is why only 8 to 10 percent of people are able to get some kind of treatment for their problems; the rest go unnoticed, which could be a possible reason for the high suicide rates. Doctors have found that almost 35 percent of the people who seek medical help could be suffering from depression, post-traumatic stress disorder (PTSD), anxiety, insomnia, bipolar disorder, etc. Another big factor that contributes to the problem is the lack of affordability.
A large share of India's population lives below the poverty line; these people do not have access to proper shelter, food, water, medication, etc. For them, proper treatment of mental illness is still a distant dream. Even for the top 10 percent of the population, treatment is costly.
1) Depression - a disorder that directly affects a person's emotions, making it difficult for them to function in daily life. When a person goes through prolonged sadness and hopelessness, it can be diagnosed as depression.
2) Anxiety - a feeling of nervousness along with a sense of excessive worry about a future scenario. In some serious cases it can also cause rapid heart rate and shortness of breath.
3) PTSD - post-traumatic stress disorder (PTSD) is a psychological disorder characterized by failure to recover after experiencing or witnessing a terrifying event.
4) Insomnia - a common sleep disorder that disrupts a person's ability to fall asleep or stay asleep, or causes them to wake up early and not be able to get back to sleep.
There have been many studies in which mental health problems such as depression and anxiety were predicted using machine learning algorithms such as decision tree, support vector machine, random forest and convolutional neural network (CNN), with data collected and classified from blog posts. Techniques such as bag-of-words and topic modeling were used to convert text into meaningful vectors. In some cases Python was also used for the modeling experiments, with the best result among all the classifiers [2] produced by the CNN, with an accuracy of 78 percent. In one study 470 seamen were questioned about their occupation,
socio-economic background and health condition along with sixteen other parameters such as age, weight, family earnings, marital status, etc. Different machine learning algorithms such as logistic regression, naïve Bayes, random forest, CatBoost and SVM were applied for classification [7]. CatBoost showed the highest accuracy and precision, 82.6 percent and 84.1 percent respectively. Sau et al. (2017) manually collected data from the Medical College and Hospital of Kolkata, West Bengal, on 630 elderly individuals, 520 of whom were in special care. After applying different classification methods (Bayesian network, logistic, multilayer perceptron, naïve Bayes, random forest, random tree, J48, sequential minimal optimization, random sub-space and K star), they observed that random forest produced the best accuracy, 91% and 89% on the two datasets of 110 and 520 people, respectively. The WEKA tool was used for feature selection and classification in [1]. Changes in heart rate, changes in blood pressure and the acoustics of speech [8], [3] are some of the symptoms of depression and a weak emotional state. Diagnosis of PTSD through speech has been carried out in recent times. A typical speech-based PTSD diagnostic system consists of three components: data acquisition, feature extraction and classification [6]. In the data acquisition stage a patient is asked questions and the patient's speech is recorded.
The feature extraction component then processes the speech data and extracts features for the classification component to predict whether or not the subject being interviewed has any level of PTSD. Though other modalities such as EEG, fMRI and MRI were also studied for PTSD diagnosis [5], [4], the data collection process for these modalities is expensive and cannot meet the growing need. Speech is non-invasive and the interview can be conducted remotely via telephone or recording media so that the privacy of the patient is strictly protected, making the speech-based method an ideal tool for diagnosis and treatment monitoring. In January 2019 research was published on predicting insomnia with ML algorithms, in which fourteen parameters were considered. Multiple classification algorithms such as decision tree (DT) and random forest were applied; among all the models, SVM had the best accuracy, 91.634 percent, with an F-measure of 92.13. The authors further applied the models to a dataset of 100 patients, where SVM achieved an accuracy of 92%. They identified mobility problems and vision problems as primary factors [9].
The objective of this research paper is to help people understand their problems and to give doctors an overview of their patient's psyche. This is only possible when we use the models with the highest accuracy.
The system goes through multiple stages before the final value can be predicted accurately. These stages are data collection, data preprocessing, data encoding, and training and testing of the algorithm. Once the desired accuracy is obtained, we can integrate the system with an application for real-world use.
For a machine learning algorithm to work as well as possible, its key parameters need to be set appropriately. Each task requires a different model based on the type of data and the work being dealt with. Hence, it is crucial to adjust the model's parameters to increase its utility and accuracy. In our work we tuned all the models and chose the best-performing parameter values for each.
Once the right parameters are selected, we move on to applying the machine learning algorithms to our collected datasets for depression, anxiety, PTSD and insomnia. The collected dataset is usually split into two subsets, namely training and testing. This is done so that the model is evaluated on data it has not seen during training, which exposes overfitting. Typically the dataset is split in the ratio of 80:20,
i.e., 80 percent of it goes to training the model and the remaining 20 percent is used to test the accuracy of the model. Through research we selected the following machine learning algorithms in order to find the one that gives the highest accuracy.
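The 80:20 split described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the code used in the study: the function name `split_train_test`, the fixed seed and the toy rows are assumptions made for the example.

```python
import random


def split_train_test(rows, train_fraction=0.8, seed=42):
    """Shuffle rows reproducibly and split them into train and test subsets."""
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle gives a repeatable split
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data: 50 encoded survey responses, each a (features, label) pair.
rows = [([i % 4, (i * 3) % 4], i % 2) for i in range(50)]
train_rows, test_rows = split_train_test(rows)
print(len(train_rows), len(test_rows))
```

Shuffling before cutting matters when the responses were collected in batches, because otherwise the test subset could contain only the most recent respondents.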
A) Random forest (RF): A supervised learning algorithm. It builds multiple decision trees and combines their outputs to obtain more precise predictions, which makes it a popular machine learning algorithm.
B) Decision tree (DT): A supervised learning algorithm in which the data is repeatedly split according to a parameter. The tree consists of two kinds of elements: decision nodes and leaves. A decision node is where the data is split, and the leaves are the resulting outcomes.
C) Logistic regression (LR): A supervised learning algorithm used for solving classification problems. The logistic regression model works with binary target variables such as 0 and 1 or yes and no. It uses the sigmoid (logistic) function to map its output to a probability between 0 and 1.
D) Support vector machine (SVM): A prominent algorithm used for both regression and classification. The goal of the SVM algorithm is to find the best line or decision boundary that separates n-dimensional space into classes, so that new data points can easily be put in the correct category in the future. This best decision boundary is called a hyperplane. SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed Support Vector Machine.
E) K-nearest neighbor (KNN): Also known as a lazy or non-parametric algorithm. The algorithm is based on feature similarity: a prediction is made from the nearest data points. As it stores all of the training data, it can be computationally expensive when working on a large dataset.
F) Naive Bayes (NB): A classifier based on conditional probability models. Naive Bayes classifiers are a family of classification algorithms based on Bayes' theorem that share a common principle: every feature is assumed to be independent of the others given the class. In our study, we applied Gaussian Naïve Bayes.
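As an illustration of how the six algorithms listed above might be compared on one dataset, the following sketch trains each of them with scikit-learn and reports test accuracy. It is not the code used in the study: the synthetic data from `make_classification` stands in for an encoded questionnaire dataset, and all hyperparameter values shown are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for an encoded questionnaire dataset (assumption).
X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# One instance per algorithm; the settings are illustrative, not tuned.
models = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(kernel="rbf"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "NB": GaussianNB(),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)  # train on the 80 percent subset
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

best = max(accuracies, key=accuracies.get)  # algorithm with the highest accuracy
for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
print("best:", best)
```

The ranking obtained on synthetic data says nothing about the ranking on real survey data; the sketch only shows the mechanics of the comparison.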
The initial step is data collection. We tried to collect data from different places. No standard dataset was available that matched our requirements, so we had to collect all the data ourselves. We made a survey form for each disease and distributed it both online and offline for people to fill in. The nature of our questions was objective and situational. We also included people who are currently suffering from some kind of mental illness and are seeing doctors and taking medication for it. Once the data collection is done, each user's responses are converted to numeric values from 0 to 3, and in some cases 0 to 4. Once we had collected enough data, it was moved to preprocessing and split into two subsets, i.e., training and test datasets.
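The conversion of survey answers to numeric values from 0 to 3 could look like the sketch below. The paper does not list the exact wording of the answer options, so the `ANSWER_SCALE` labels and the helper `encode_response` are hypothetical and serve only to show the idea.

```python
# Hypothetical answer scale; the actual option wording is not given in the paper.
ANSWER_SCALE = {
    "Not at all": 0,
    "Several days": 1,
    "More than half the days": 2,
    "Nearly every day": 3,
}


def encode_response(answers, scale=ANSWER_SCALE):
    """Convert one respondent's text answers into a list of integer scores."""
    try:
        return [scale[answer] for answer in answers]
    except KeyError as exc:
        # Fail loudly on an unexpected option instead of silently guessing a score.
        raise ValueError(f"Unknown answer option: {exc.args[0]!r}") from exc


response = ["Not at all", "Nearly every day", "Several days"]
encoded = encode_response(response)
print(encoded)  # [0, 3, 1]
```

A questionnaire that uses a 0 to 4 scale would only need a different dictionary passed as `scale`.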
It is important to fill in the missing values in the dataset, or otherwise modify it, to increase the quality of the dataset. Once the preprocessing of the data is completed, it moves on to feature extraction and then to the prediction of mental illness.
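One simple way to fill in missing values is sketched below. The paper does not say which strategy was used; replacing each missing answer with the most frequent answer to the same question (mode imputation) is an assumption, as is the helper name `impute_with_mode`. The sketch also assumes every question has at least one observed answer.

```python
from collections import Counter


def impute_with_mode(rows):
    """Replace None entries with the most frequent value of their column."""
    n_cols = len(rows[0])
    modes = []
    for col in range(n_cols):
        observed = [row[col] for row in rows if row[col] is not None]
        # most_common(1) returns [(value, count)] for the most frequent value
        modes.append(Counter(observed).most_common(1)[0][0])
    return [[modes[c] if v is None else v for c, v in enumerate(row)]
            for row in rows]


# Hypothetical encoded answers (0 to 3) with two missing entries.
rows = [
    [0, 2, 3],
    [1, None, 3],
    [1, 2, None],
    [1, 2, 0],
]
filled = impute_with_mode(rows)
print(filled)
```

The mode is a reasonable default for ordinal answer codes because it always yields a value that is a valid answer option, which a mean would not guarantee.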
Fig -5.1: Dataset Overview
To put our work to real-world use, we deployed it as a web application. In the application, users can take a test for whichever of the four diseases they want; based on the inputs received, our model predicts the severity of the problem they are facing.
Fig -6.3: Result Page
Fig -6.1: Data Flow Diagram
To achieve high accuracy with the model, the data needs to be properly cleaned and preprocessed. To do this we used Python libraries such as NumPy, pandas and matplotlib. To get the best result, we passed each of our datasets through multiple ML algorithms such as logistic regression, SVM, random forest, k-nearest neighbors, etc. For example, for anxiety we ran the above-mentioned algorithms and achieved accuracies of 97.27%, 94%, 81%, 80%, etc., respectively. The same was done for the other three diseases, which had different levels of accuracy. For our system we chose the algorithm that gave the highest accuracy. We also tried fine-tuning the hyperparameters to check whether the accuracy could be increased further.
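Hyperparameter fine-tuning of the kind mentioned above is often done with a cross-validated grid search. The sketch below uses scikit-learn's `GridSearchCV` on a k-nearest neighbors classifier; the synthetic data, the choice of classifier and the parameter grid are all assumptions made for illustration, not the settings used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for one encoded questionnaire dataset (assumption).
X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Candidate values to try; the grid itself is illustrative.
param_grid = {
    "n_neighbors": [3, 5, 7, 9],
    "weights": ["uniform", "distance"],
}

# 5-fold cross-validation on the training subset only, scored by accuracy.
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5,
                      scoring="accuracy")
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("cross-validated accuracy:", round(search.best_score_, 3))
# The refitted best model is scored once on the untouched test subset.
print("held-out accuracy:", round(search.score(X_test, y_test), 3))
```

Keeping the test subset out of the search means the final accuracy figure is not inflated by the tuning itself.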
Fig -7.1: Model with Highest Accuracy
Fig -6.2: Test Page
Fig -7.2: Confusion Matrix
We believe we were able to achieve good accuracy for each of the four diseases. Furthermore, in the future we can add more diseases and combine multiple methods along with the questionnaire to make this process more robust.
[1] Sau, A., Bhakta, I. (2017) "Predicting anxiety and depression in elderly patients using machine learning technology." Healthcare Technology Letters 4(6): 238-43.
[2] Tyshchenko, Y. (2018) "Depression and anxiety detection from blog posts data." Nature Precis. Sci., Inst. Comput. Sci., Univ. Tartu, Tartu, Estonia.
[3] R. A. Calvo and S. D'Mello. Affect detection: An interdisciplinary review of models, methods, and their applications. IEEE Trans. Affective Comput., 1(1):18-37, 2010.
[4] Q. Zhang, Q. Wu, H. Zu, L. He, H. Huang, J. Zhang and W. Zhang. Multimodal MRI-Based Classification of Trauma Survivors with and without Post-Traumatic Stress Disorder. Frontiers in Neuroscience, 2016.
[5] X. Zhuang, V. Rozgic, M. Crystal and B. P. Marx. Improving Speech-Based PTSD Detection via Multi-View Learning. IEEE Spoken Language Technology Workshop, 260-265, 2014.
[6] B. Knoth, D. Vergyri, E. Shriberg, V. Mitra, M. Mclaren, A. Kathol, C. Richey and M. Graciarena. Systems for speech-based assessment of a patient's state-of-mind. WO2016028495A1, 2015.
[7] Sau, A., Bhakta, I. (2018) "Screening of anxiety and depression among the seafarers using machine learning technology." Informatics in Medicine Unlocked: 100149.
[8] S. R. Krothapalli and S. G. Koolagudi. Characterization and recognition of emotions from speech using excitation source information. Int. J. Speech Technol., 16(2):181-201, 2012.
[9] R. Ahuja, V. Vivek, M. Chandna, S. Virmani and A. Banga, "Comparative Study of Various Machine Learning Algorithms for Prediction of Insomnia", 2019.
[10] Y. Kaneita et al., "Insomnia Among Japanese Adolescents: A Nationwide Representative Survey", Sleep, vol. 29, no. 12, pp. 1543-1550, 2006.
[11] P. Singh, "Insomnia: A sleep disorder: Its causes, symptoms and treatments", International Journal of Medical and Health Research, vol. 2, no. 10, pp. 37-41, 2016.
[12] Sarah Graham, Colin Depp, Ellen E Lee, Camille Nebeker, Xin Tu, Ho-Cheol Kim, and Dilip V Jeste. Artificial intelligence for mental health and mental illnesses: an overview. Current Psychiatry Reports, 21(11):1–18, 2019.
2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal