International Research Journal of Engineering and Technology (IRJET) e ISSN: 2395 0056
Smt. Akshitha Katkeri¹, Nagashree M S², Shilpa R³, Srilakshmi N⁴, Srilalitha C S⁵
¹Assistant Professor, VTU, Department of CSE, BNM Institute of Technology, Bangalore, Karnataka, India
²,³,⁴,⁵VTU, Department of CSE, BNM Institute of Technology, Bangalore, Karnataka, India
Abstract - Heart disease involves a build-up of fatty plaques in the arteries and of calcium in the major artery outside the heart. Many techniques based on various algorithms have been used to address this problem. Manual consultation is difficult and time consuming in severe cases. This study proposes an easy method of user-system interaction based on a hybrid approach that combines several algorithms: Logistic Regression, Gaussian NB, Linear SVC, K Neighbours, Decision Tree and Random Forest. In this hybrid approach, the best-performing algorithm is used in the final evaluation. Results: For heart disease detection, the Linear SVC model achieved the best results, with accuracy 90.78%, precision 96.87%, sensitivity 83.78%, F1 score 89.85% and ROC 90.60%. Conclusion: The proposed system illustrates the use of an interactive system to predict heart disease using multi-feature classification and a hybrid approach, with promising results compared to previous studies and methods.
Key Words: Gaussian NB, Linear SVC, Random Forest, K Neighbours, Decision Tree, Arteries.
Heart disease, also known as cardiovascular disease (CVD), remains the number one cause of death globally. There are various CVDs, such as angina, heart failure, coronary heart disease, congenital heart disease and so on. Nearly 17.9 million people in their early 70s are losing their lives because of CVD. The main risk factors of heart disease nowadays are unhealthy diets, intake of alcohol and tobacco, smoking, lack of physical activity and work-related stress. These risk factors lead to raised blood pressure, raised blood lipids, overweight and so on. The other main cause of CVD is the build-up of calcium in the major artery outside the heart, which is a predictor of future heart attack or stroke. The more extensive the calcium in the walls of the blood vessels, the greater the risk of future CVD.
There are several classifiers used to detect heart disease, such as Logistic Regression, Gaussian NB, Linear SVC, Decision Tree, K Neighbours and Random Forest. Logistic Regression is a supervised machine learning algorithm that is used to model the probability of a certain class or event. It is used when the data is linearly separable and the outcome is binary in nature.
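As a minimal sketch of how logistic regression models the probability of a class, the snippet below applies the sigmoid function to a weighted sum of features. The weights, bias and feature values are illustrative assumptions, not parameters fitted in this study.

```python
import math


def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class under a logistic regression model."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


# Hypothetical weights for two standardized features (illustration only).
weights = [0.8, -0.5]
bias = 0.1

p = predict_proba([1.2, 0.4], weights, bias)
label = 1 if p >= 0.5 else 0  # binary outcome: 1 = disease, 0 = no disease
print(f"P(disease) = {p:.3f}, predicted class = {label}")
```

The 0.5 cut-off is the usual default for a binary outcome; a different threshold could be chosen if sensitivity matters more than precision.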
Gaussian NB is a generative model. It assumes that the features of each class follow a Gaussian distribution. It is used specifically when the features have continuous values.
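The Gaussian assumption can be illustrated with a small sketch that estimates a mean and standard deviation per class and feature, then picks the class with the highest log-posterior. The two-feature toy records are made up for illustration and are not taken from the dataset used in this study.

```python
import math
from statistics import mean, pstdev


def fit_gaussian_nb(rows_by_class):
    """Estimate a log-prior plus per-feature mean and std for every class."""
    total = sum(len(rows) for rows in rows_by_class.values())
    model = {}
    for label, rows in rows_by_class.items():
        columns = list(zip(*rows))
        model[label] = {
            "log_prior": math.log(len(rows) / total),
            "means": [mean(col) for col in columns],
            # A small floor avoids division by zero for constant features.
            "stds": [max(pstdev(col), 1e-9) for col in columns],
        }
    return model


def log_gaussian(x, mu, sigma):
    """Log of the Gaussian density N(x; mu, sigma^2)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def predict(model, features):
    """Return the class with the highest log-posterior score."""
    def score(label):
        params = model[label]
        return params["log_prior"] + sum(
            log_gaussian(x, mu, sigma)
            for x, mu, sigma in zip(features, params["means"], params["stds"])
        )
    return max(model, key=score)


# Toy records: (resting blood pressure, maximum heart rate), illustration only.
toy_data = {
    0: [[130, 150], [125, 160], [135, 155]],
    1: [[150, 120], [160, 110], [155, 125]],
}
model = fit_gaussian_nb(toy_data)
print(predict(model, [128, 158]), predict(model, [158, 115]))
```

Working with log-densities rather than raw densities avoids numerical underflow when many features are multiplied together.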
Linear SVC fits the data provided and returns the best-fit hyperplane that categorizes the data. Once the hyperplane is obtained, features can be fed to the classifier to check what the predicted class is.
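Once a hyperplane is available, classifying a new record reduces to checking on which side of the hyperplane it falls. The sketch below shows only that prediction step; the weight vector and bias are hypothetical values, not the hyperplane learned in this study, and the training procedure itself is omitted.

```python
def decision_value(features, weights, bias):
    """Signed score w.x + b; its sign tells which side of the hyperplane x is on."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def predict(features, weights, bias):
    """Class 1 on the non-negative side of the hyperplane, class 0 otherwise."""
    return 1 if decision_value(features, weights, bias) >= 0 else 0


# Hypothetical hyperplane for two standardized features (illustration only).
weights = [1.5, -2.0]
bias = -0.25

for sample in ([1.0, 0.2], [0.1, 0.6]):
    print(sample, "->", predict(sample, weights, bias))
```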
Decision Tree uses various algorithms to decide how to split a node into two or more sub-nodes. As the number of sub-nodes increases, their purity also increases. The data is split continuously according to the specified parameters.
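One common way to score a candidate split is Gini impurity, where 0 means a node is perfectly pure. The sketch below assumes Gini as the criterion (the text does not say which one was used) and searches for the best threshold on a single feature; the ages and labels are made up for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 0.0 when every label in the node is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted impurity) of the best 'value <= threshold' split."""
    best = (None, float("inf"))
    for threshold in sorted(set(values)):
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        if not left or not right:
            continue  # a split must send records to both sub-nodes
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy feature (age) and binary labels, illustration only.
ages = [35, 42, 50, 58, 63, 67]
labels = [0, 0, 0, 1, 1, 1]

print("impurity before split:", gini(labels))
print("best split:", best_threshold(ages, labels))
```

Here the parent node is maximally mixed for two classes, and the chosen split produces two pure sub-nodes, which is the increase in purity the paragraph describes.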
The main goal of this study is to develop a hybrid model of all the algorithms that best suit the prediction and to make the model more accurate, so that people know about their health condition well in advance, can receive proper treatment and can be cured without any serious issues, thereby reducing the global death rate due to heart disease.
The proposed methodology aims to predict whether the patient is suffering from heart disease or not. This automation helps doctors to analyze the critical condition of patients and hence helps to improve treatment. Patients can take precautions, which helps to save many lives. In this project, we use various algorithms, i.e. we implement a hybrid technique with a multiclass dataset. The multiclass dataset represents various levels, i.e. 0, 1, 2, 3, 4. The model makes use of several machine learning techniques and algorithms in an effort to offer a more precise answer to the problem. Numerous ML techniques are used here on the dataset, for instance the K Nearest Neighbors method, Random Forest, Logistic Regression, Gaussian NB, Regression Tree, etc. A hybrid model is created employing all these techniques for increased accuracy. Additionally, the model works with practically all patient record types. Pre-processing of the dataset involves reducing noise and outliers. The dataset is then split into train and test data: 75% of the data is used as train data and the remaining 25% as test data. The methods shown in the figures below are used to generate the ML models.
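The 75%/25% split can be sketched as a shuffled partition of the records. The function name, the fixed seed and the use of integer placeholders instead of patient records are assumptions made for illustration; 303 is the number of records reported for the dataset.

```python
import random


def train_test_split(records, test_fraction=0.25, seed=42):
    """Shuffle a copy of the records and cut it into (train, test) parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


records = list(range(303))  # placeholders standing in for 303 patient records
train, test = train_test_split(records)
print(len(train), "train records,", len(test), "test records")
```

Shuffling before the cut keeps the two parts from inheriting any ordering present in the source file.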
Volume: 09 Issue: 07 | July 2022 www.irjet.net p ISSN: 2395 0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page2425
Fig 1: Flow Chart
The figure below shows the data flow of the proposed model:
Fig 2: Proposed System
In this project, the dataset is taken from the UCI Repository of Machine Learning Databases. It contains a total of 303 records with 14 medical features. The original target values 1, 2, 3 and 4 were merged into a single value that indicates the presence of heart disease. All features have values in the dataset; they are described in the table below.

| Sr. no | Attribute | Attribute representation | Description |
|---|---|---|---|
| 1 | Age | Age | Age of patient in years |
| 2 | Sex | Sex | 0 = female; 1 = male |
| 3 | Chest pain | Cp | 1 = typical angina; 2 = atypical angina; 3 = non-anginal pain; 4 = asymptomatic |
| 4 | Resting blood pressure | Trestbps | Blood pressure |
| 5 | Serum cholesterol | Chol | Minimum cholesterol: 126; maximum cholesterol: 564 |
| 6 | Fasting blood sugar | Fbs | 0 = false; 1 = true |
| 7 | Resting electrocardiograph | Restecg | 0 = normal; 1 = ST abnormality; 2 = left ventricular hypertrophy |
| 8 | Max heart rate | Thalach | Maximum heart rate achieved |
| 9 | Exercise-induced angina | Exang | 0 = no; 1 = yes |
| 10 | ST depression | Oldpeak | ST depression induced by exercise relative to rest |
| 11 | Slope | Slope | Slope of the peak exercise ST segment: 1 = upsloping; 2 = flat; 3 = downsloping |
| 12 | No. of vessels | Ca | Major vessels (0-3) colored by fluoroscopy |
| 13 | Thalassemia | Thal | 3 = normal; 6 = fixed defect; 7 = reversible defect |

Table 1: Detailed heart disease dataset attributes with descriptions
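The transformation of the multiclass target (levels 0 to 4) into a presence/absence label can be sketched as below. The function name is hypothetical, and the mapping of level 0 to "no disease" is the usual reading of this dataset rather than something stated explicitly in the text.

```python
def binarize_target(level):
    """Map the original target levels to 0 (no disease) or 1 (disease present)."""
    if level not in (0, 1, 2, 3, 4):
        raise ValueError(f"unexpected target level: {level!r}")
    return 0 if level == 0 else 1


original_levels = [0, 2, 1, 0, 4, 3]  # illustrative values only
print([binarize_target(level) for level in original_levels])
```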
Table 2: Statistical study of the data related to heart disease
The procedure used to efficiently prepare a dataset for categorization is known as data pre-processing. There might be some missing values in the real-world data that has been gathered and saved in the database. This is the most typical issue, because patients may have entered their information incorrectly or incompletely. Missing values are filled in as part of normalizing the attribute data.
$$z = \frac{x - \mu}{\sigma}$$

where $\mu$ = mean, $\sigma$ = standard deviation and $x$ = a single feature value. The data features are standardized to zero mean and unit variance.
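The standardization step can be sketched in a few lines: each value has the feature mean subtracted and is divided by the feature standard deviation, which yields zero mean and unit variance. The cholesterol readings below are illustrative, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale one feature so that it has zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant feature carries no information
    return [(x - mu) / sigma for x in values]


cholesterol = [126, 200, 240, 300, 564]  # illustrative readings only
z_scores = standardize(cholesterol)
print([round(z, 3) for z in z_scores])
```

In practice the mean and standard deviation would be computed on the train data only and then reused for the test data, so that no information leaks from the test set.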
The goal of feature extraction is to derive a subset of new features from the original set by means of some functional mapping. Once the feature significance graph is plotted, the most significant features are chosen for prediction.
The figure below shows the attribute distribution.
Chart 2: Visualization of the Data
Following pre-processing, the data is separated into train data and test data. In this study, a hybrid model is proposed. A variety of classification techniques are trained on the train data. Gaussian NB, Linear SVC, Logistic Regression, Decision Tree Classifier, Random Forest Classifier, KNN and SVM are the algorithms employed in the proposed model.
A confusion matrix is a table used to describe how well a classification system performs; it shows and summarizes a classification algorithm's performance. The confusion matrix for each classifier is shown in the figures below. Each entry in the confusion matrix is defined as follows:
True positives (TP): the number of correct predictions where the actual class was positive.
False positives (FP): the number of incorrect predictions where the actual class was negative but the predicted class was positive.
True negatives (TN): the number of correct predictions where the actual class was negative.
False negatives (FN): the number of incorrect predictions where the actual class was positive but the predicted class was negative.
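The four entries and the scores derived from them can be sketched as follows, using the standard definitions (a false positive is a negative case predicted as positive; a false negative is a positive case predicted as negative). The label vectors are made up for illustration and are not results from this study.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN and FN for binary labels (1 = positive, 0 = negative)."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity."""
    total = precision + sensitivity
    return 2 * precision * sensitivity / total if total else 0.0


def metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity (recall) and F1 from the four counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1_score(precision, sensitivity),
    }


y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # illustrative actual classes
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]  # illustrative predicted classes
counts = confusion_counts(y_true, y_pred)
print(counts, metrics(*counts))
```

Applying the same F1 formula to the reported precision of 96.87% and sensitivity of 83.78% gives approximately 89.85%, which agrees with the reported F1 score.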
Fig 7. Decision Tree
Fig 8. KNN
Fig 9. SVM
Table 3. Performance table of classifier algorithms
Based on the performance table of classifier algorithms, the algorithm with the greatest accuracy score is used to predict the patient's heart disease. The Linear SVC model is used for heart disease prediction, since it provides the best accuracy of all the classifiers.
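The selection step of the hybrid approach amounts to picking the classifier with the highest accuracy on the test data. In the sketch below the classifier names and scores are placeholders chosen for illustration; they are not the values from the performance table.

```python
def select_best(accuracy_by_model):
    """Return the name of the model with the highest accuracy score."""
    if not accuracy_by_model:
        raise ValueError("no models to choose from")
    return max(accuracy_by_model, key=accuracy_by_model.get)


# Placeholder names and scores, not results reported in this study.
placeholder_scores = {
    "classifier_a": 0.81,
    "classifier_b": 0.88,
    "classifier_c": 0.85,
}
print("selected:", select_best(placeholder_scores))
```

Only the selected model is then used to answer prediction requests, while the others serve as candidates during evaluation.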
A result is the final consequence of actions or events, expressed qualitatively or quantitatively. Performance analysis is an operational analysis: a set of basic quantitative relationships between the performance quantities.
This is the login page, where the user can enter the website by entering the username and the password set at registration. Once the details are entered, the system checks whether they match the details given at registration. If they match, the user is allowed to log in to the website.
The above figure shows the sign-up page. A user visiting the website for the first time should register by providing the personal information asked for in the form and then clicking the Create Account button. If the user is already registered, an alert is shown saying so. An already registered user can click the "Already have one?" option, which takes the user to the login page.
The above figure shows the database page. The data provided by the user during registration is stored in the database. Each user has a unique username. If a new user registers with a username that already exists, the system reports that the user already exists and prompts for a different username. When logging in, the user has to enter the correct username and password. If the username and password provided by the user match the database, the user is successfully logged in.
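The registration and login rules described above (unique usernames, credentials checked against stored data) can be sketched as follows. The class name is hypothetical, an in-memory dictionary stands in for the database, and the salted password hashing is an assumption: the text does not say how passwords are stored.

```python
import hashlib
import hmac
import os


class UserStore:
    """In-memory stand-in for the user database (illustration only)."""

    def __init__(self):
        self._users = {}  # username -> (salt, password hash)

    @staticmethod
    def _hash(password, salt):
        # Salted PBKDF2 hash so that plain-text passwords are never stored.
        return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

    def register(self, username, password):
        """Return False if the username is already taken, True otherwise."""
        if username in self._users:
            return False
        salt = os.urandom(16)
        self._users[username] = (salt, self._hash(password, salt))
        return True

    def login(self, username, password):
        """Return True only when the username exists and the password matches."""
        record = self._users.get(username)
        if record is None:
            return False
        salt, stored_hash = record
        return hmac.compare_digest(stored_hash, self._hash(password, salt))


store = UserStore()
print(store.register("asha", "s3cret"))   # new username
print(store.register("asha", "other"))    # duplicate username is rejected
print(store.login("asha", "s3cret"), store.login("asha", "wrong"))
```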
The above figure is the user input page. Once the user logs in to the website, they are directed to this page. The user has to input data describing their health condition. Age is of integer type; the user has to enter the age in numbers. The Sex field has the options male and female; the user has to choose one. Resting Blood Pressure, Serum Cholesterol in mg/dl, Maximum Heart Rate and ST Depression Induced are of integer type; the user has to enter these values from the medical record provided. For the other fields, such as Chest Pain Type, Fasting Blood Sugar, Resting ECG Results and Exercise Induced Angina, the user selects one of the options from the drop-down menu.
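A server-side check of the submitted form could look like the sketch below. The field keys follow the attribute representations in Table 1 in lower case, and the allowed codes are the ones listed there; the function name, the error messages and the acceptance of decimal values for numeric fields are assumptions made for illustration.

```python
import math

NUMERIC_FIELDS = ("age", "trestbps", "chol", "thalach", "oldpeak")
CATEGORICAL_OPTIONS = {
    "sex": {0, 1},
    "cp": {1, 2, 3, 4},
    "fbs": {0, 1},
    "restecg": {0, 1, 2},
    "exang": {0, 1},
}


def validate_form(form):
    """Return (cleaned values, error messages) for one submitted form."""
    cleaned, errors = {}, []
    for field in NUMERIC_FIELDS:
        try:
            value = float(form[field])
        except (KeyError, TypeError, ValueError):
            errors.append(f"{field}: a number is required")
            continue
        if not math.isfinite(value) or value < 0:
            errors.append(f"{field}: must be a non-negative number")
        else:
            cleaned[field] = value
    for field, allowed in CATEGORICAL_OPTIONS.items():
        try:
            value = int(form[field])
        except (KeyError, TypeError, ValueError):
            errors.append(f"{field}: an option must be selected")
            continue
        if value not in allowed:
            errors.append(f"{field}: unknown option {value}")
        else:
            cleaned[field] = value
    return cleaned, errors


# Illustrative submission; form values arrive as strings.
submitted = {
    "age": "54", "sex": "1", "cp": "4", "trestbps": "130", "chol": "250",
    "fbs": "0", "restecg": "0", "thalach": "150", "exang": "0", "oldpeak": "1.5",
}
cleaned, errors = validate_form(submitted)
print(cleaned if not errors else errors)
```

Validating on the server as well as in the drop-down menus guards against values that bypass the form, such as a hand-crafted request.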
The above figure shows the diet plan page, which is reached from the results page. Based on the user's results and how risky the disease is, the system provides a diet plan to help the user maintain their health. By following the diet plan, the user can bring their health condition from severe towards normal.
The proposed methodology aims to predict whether the patient is suffering from heart disease or not. This automation helps doctors to analyze the critical condition of patients and hence helps to improve treatment. A user interface is created to take input from the user; the model then predicts the presence of heart disease and recommends diet plans. This is useful for improving the user's health effectively.
Heart disease prediction is a very common problem now. The proposed user interface platform lets everyone register, log in, enter their data and get to know their health status. Based on the given data, it predicts whether heart disease is present or not. The proposed system helps to identify the disease at a very early stage and so to reduce the death rate. A database is also created so that all the patients' data can be stored. It is a hybrid system approach successfully used for heart disease prediction with a higher accuracy rate.
The above figure shows the output page. Once the user gives the input and selects predict, the system redirects to the result page and, after processing, reports whether heart disease is present or not. The page also contains the option "go back to home page", which takes the user to the home page.
In future, a model for direct service of patients from old age homes or other home care centers to the Intensive Care Unit (ICU) through ambulance services can be planned. An artificially intelligent system will take the data of clinical parameters from old age homes or other care centers. The model produces a single output that reveals the distinct stage of the patient: healthy, first/second stage of sickness or critical stage. The system will show green if the person is healthy, and the person will be informed via SMS that they are 'Healthy'. Otherwise, if the person is at the first/second stage of sickness, the SMS 'Do frequent monitoring' will be sent to his/her mobile number.