“Detection of Diseases using Machine Learning”


International Research Journal of Engineering and Technology (IRJET), e-ISSN: 2395-0056

Volume: 09 Issue: 06 | June 2022 | www.irjet.net | p-ISSN: 2395-0072


1 Department of Electronics Engineering, K. J. Somaiya Institute of Engineering and Information Technology, Mumbai, India

Abstract: The world's growing population has put enormous pressure on the healthcare sector to offer high-quality treatment and facilities. The healthcare industry has long been an early adopter of cutting-edge technology, and artificial intelligence and machine learning are no exception. We have developed a web application using the Flask framework. It consists of web pages designed for different functions. It is a disease prediction system that can be deployed on any network and reached by users anywhere. A report is generated that can be kept for record purposes. The report can be downloaded to the user's device and is also sent to the user's personal email address.

Keywords: prediction, MySQL, Python, session, admin, machine learning

I. INTRODUCTION

The world is suffering from many diseases, and artificial intelligence and machine learning can help to overcome them. The purpose of artificial intelligence and machine learning is to make machines more capable, efficient, precise and reliable than before. In a healthcare system, this can help doctors considerably in critical situations and decisions. To reduce the pressure on the healthcare system and to help doctors and society, we have created a project that predicts a particular disease, detects it at an early stage, and predicts diseases accurately and efficiently. This helps doctors to confirm or cross-check their assumptions and analysis.

The interface has a navigation-bar-driven design that enables easy user interaction through GUI elements. Login and signup forms are part of user authentication. All details acquired during the signup process are stored in the database, which can be accessed only by the admin. The admin manages the website and works in the backend, maintaining all the data collected during the user's session. Sessions are created to help maintain the user's state and data across the application. Sections such as contact, FAQ, feedback and analysis are present on the web page. In particular, the analysis section gives users transparency about how our model works in the backend, since healthcare is a significant issue that cannot be ignored.

II. LITERATURE SURVEY

A. UCI researchers create model to calculate COVID-19 health outcomes

University of California, Irvine health sciences researchers have created a machine learning model to predict the probability that a COVID-19 patient will require a ventilator or ICU care. The tool is free and available online for any healthcare organization to use.

"The goal is to give an earlier alert to clinicians to identify patients who may be vulnerable at the onset," said Daniel S. Chow, an assistant professor in residence in radiological sciences and first author of the study, published in PLOS ONE. The tool predicts whether a patient's condition will worsen within 72 hours.

Coupled with decision-making specific to the healthcare setting in which the tool is used, the model uses a patient's medical history to determine who can be sent home and who will require critical care. The study found that at UCI Health, the tool's predictions were accurate about 95 percent of the time.

B. Disease Prediction Using Machine Learning

Computerized systems are currently considered much more efficient than traditional ones, and adopting them in the healthcare sector should likewise yield better results. Supervised machine learning algorithms hold enormous potential for disease diagnosis. Such systems require a large amount of data in order to produce high-precision output. Many types of algorithms are available, and selecting among them is crucial when designing the machine learning model. The aim of this literature is to identify trends across various types of supervised ML models in disease detection through the examination of performance metrics.

© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal

Algorithms such as Naive Bayes (NB), Decision Trees (DT) and K-Nearest Neighbor (KNN) are considered the most prominent. According to the research, Support Vector Machine (SVM) was found to be the most suitable for detecting kidney and Parkinson's diseases. Similarly, Logistic Regression (LR) was selected for heart disease, and Random Forest Classifier (RFC) and Convolutional Neural Networks (CNN) were selected for breast and common diseases, respectively.

III. PROPOSED SYSTEM

The proposed disease prediction system aims to improve the quality and efficiency of the current healthcare system. It is delivered as a web application. To avoid the pitfalls of the current system, the proposed system is built as a web-based system that can be accessed at any time and from anywhere on a mobile device or PC. Users can use their credentials to access the server system. The website's features keep track of the user's health by providing insights into health issues, if any. Additional features include report generation.

Figure 1 shows the flowchart of the proposed web-based system. A login page is a web page that requires user identification and authentication by entering a username and password combination. Logging in provides access to the entire website. It not only gives the user access to the site but also enables the website to monitor user activity and behaviour. Logging off may be done manually by the user or may occur automatically under certain circumstances (such as page closure, device shutdown or a long delay).

A. Diseases Parameters

Every disease has different parameters on the basis of which it is detected. The most significant parameters, obtained after data pre-processing, are tabulated in Table I.

Table I. Most prominent parameters for disease detection

Disease | Parameters | Algorithm Used | Accuracy (in %)
Diabetes | BP (mmHg), Glucose, Insulin (muU/ml), BMI (kg/m2), Diabetes Pedigree Function, Age | Random Forest | 82
Heart | Cholesterol, Fasting Blood Sugar, Chest Pain Type | Logistic Regression | 80
Liver | Proteins, Albumin, Bilirubin, Albumin and Globulin Ratio | Random Forest | 79
Kidney | Sugar, Red Blood Cells, Blood Urea, Hypertension | Random Forest | 99
Parkinson's | Range of biomedical voice measurements (Hz) | Random Forest | 94
Breast | Radius, Perimeter, Area, Concavity, Concave Points | Logistic Regression | 97


B. Units

The number of cycles per second is called frequency. The SI unit for frequency is the hertz (Hz).

A millimetre of mercury is a manometric unit of pressure, described as the extra pressure generated by a column of mercury one millimetre high, and defined as exactly 133.322387415 pascals. It is denoted mmHg.

Body mass index (BMI) is defined as the body mass divided by the square of the body height, and is expressed in units of kg/m2, with mass in kilograms and height in metres.

A litre is a measure of volume that is a bit greater than a quart.

A millimetre, abbreviated as mm, is a small unit of length in the metric system.

All the parameters used for Parkinson's disease are voice measurements in hertz (Hz) and decibels (dB), recorded from both healthy and unhealthy persons.
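The two unit definitions above that involve arithmetic (mmHg and BMI) can be illustrated with a short Python sketch. The function names are our own choice for illustration and are not taken from the project's code.

```python
# Illustrative unit helpers; the names below are assumptions, not the project's code.
MMHG_TO_PASCAL = 133.322387415  # pascals per millimetre of mercury, as defined above


def mmhg_to_pascal(pressure_mmhg: float) -> float:
    """Convert a pressure reading from mmHg to pascals."""
    return pressure_mmhg * MMHG_TO_PASCAL


def body_mass_index(mass_kg: float, height_m: float) -> float:
    """Return BMI in kg/m^2: body mass divided by the square of body height."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return mass_kg / (height_m ** 2)
```

For example, a person of 70 kg and 1.75 m has a BMI of about 22.9 kg/m2.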

C. Developing Tools

The development tools are used for web page design and database building. The web pages of the disease prediction system were built with HTML5, CSS3, jQuery and JavaScript, because these technologies are simple to learn and simple to use. We used Python for the programming, since it handles complex working environments well and provides a vast range of libraries. Flask is a web application framework written in Python.

The application can be deployed on desktops with any operating system. For the database, a MySQL server is used.

Several algorithms were implemented; Figure 10 shows a comparison between them on the basis of accuracy. Based on accuracy, the two most prominent algorithms were shortlisted and deployed, as described below:

Random Forest Classifier:

It is a supervised learning algorithm based on the ensemble technique. Multiple decision trees make up a forest and are trained with the "bagging" method. Combining several learning models generally increases accuracy. Each tree provides its decision individually, and the decision with the most votes becomes the prediction of the model.
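The bagging and voting idea can be sketched in plain Python. This is a simplified illustration, not the deployed model: each "tree" here is only a one-split stump on a randomly chosen feature, and the toy data, function names and the default of 25 trees are assumptions made for the example.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, class label)

# Hypothetical toy data: two well-separated classes.
TRAINING_DATA: List[Sample] = [
    ([1.0, 1.0], 0), ([1.2, 0.8], 0), ([0.9, 1.1], 0),
    ([5.0, 5.0], 1), ([5.2, 4.8], 1), ([4.9, 5.1], 1),
]


def bootstrap(data: List[Sample], rng: random.Random) -> List[Sample]:
    """Draw len(data) samples with replacement (the "bagging" step)."""
    return [rng.choice(data) for _ in data]


def train_stump(data: List[Sample], rng: random.Random) -> Callable[[Sequence[float]], int]:
    """Fit a one-split tree on a random feature (a stand-in for a full decision tree)."""
    feature = rng.randrange(len(data[0][0]))
    threshold = sum(x[feature] for x, _ in data) / len(data)
    overall = Counter(y for _, y in data).most_common(1)[0][0]
    left = Counter(y for x, y in data if x[feature] <= threshold)
    right = Counter(y for x, y in data if x[feature] > threshold)
    left_label = left.most_common(1)[0][0] if left else overall
    right_label = right.most_common(1)[0][0] if right else overall
    return lambda x: left_label if x[feature] <= threshold else right_label


def train_forest(data: List[Sample], n_trees: int = 25, seed: int = 0):
    """Train each tree on its own bootstrap sample of the data."""
    rng = random.Random(seed)
    return [train_stump(bootstrap(data, rng), rng) for _ in range(n_trees)]


def predict(forest, x: Sequence[float]) -> int:
    """Each tree votes; the most common vote becomes the forest's prediction."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]
```

Calling `predict(train_forest(TRAINING_DATA), [5.1, 4.9])` collects one vote per stump and returns the majority class. A production system would more likely rely on an established library implementation with full decision trees.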

Logistic Regression:

Another popular supervised learning algorithm is Logistic Regression. It uses the sigmoid function, which produces an S-shaped curve. Its output can be interpreted as a probability, which is then used to classify categorical data.
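As a minimal sketch of how the sigmoid turns a weighted sum of inputs into a class label, consider the following Python code. The weights, bias and the 0.5 threshold are placeholders chosen for illustration; in practice the weights would be learned from the training data.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Map any real number into (0, 1); plotted against z, the curve is S-shaped."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # this branch avoids overflow for large negative z
    return ez / (1.0 + ez)


def predict_proba(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Probability of the positive class for one feature vector."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_label(weights: Sequence[float], bias: float,
                  features: Sequence[float], threshold: float = 0.5) -> int:
    """Classify as 1 when the probability reaches the threshold, else 0."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0
```

With the placeholder weights `[2.0, -1.0]` and bias `-0.5`, the input `[1.0, 0.5]` gives z = 1.0, so the probability is above 0.5 and the label is 1.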

IV. IMPLEMENTATION OF PROPOSED PROJECT

Fig. 2. Authentication login form


Figure 2 shows the login page of the disease prediction system, which is password protected. A unique email and password have to be entered. If the correct email and password are entered, the user is taken to the home page of the website.
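A credential check of this kind could look like the following Python sketch. It is an assumption-laden illustration: the paper does not describe how passwords are stored, so salted PBKDF2 hashing is our choice here, and the in-memory dictionary is only a stand-in for the users table in the MySQL database.

```python
import hashlib
import hmac
import os
from typing import Dict, Tuple

# Hypothetical in-memory stand-in for the users table: email -> (salt, password hash).
_users: Dict[str, Tuple[bytes, bytes]] = {}


def _hash_password(password: str, salt: bytes) -> bytes:
    """Derive a hash from the password and salt (assumed scheme, not from the paper)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def register(email: str, password: str) -> None:
    """Store a new user; the email must be unique."""
    if email in _users:
        raise ValueError("email already registered")
    salt = os.urandom(16)
    _users[email] = (salt, _hash_password(password, salt))


def check_login(email: str, password: str) -> bool:
    """Return True only when the email exists and the password matches."""
    record = _users.get(email)
    if record is None:
        return False
    salt, stored = record
    return hmac.compare_digest(stored, _hash_password(password, salt))
```

When `check_login` returns True, the application would redirect the user to the home page; otherwise the login form would be shown again.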

Fig. 3. Sign Up form

Figure 3 shows the Sign Up page of the disease prediction system, in which users fill in their details, which are stored in a database for further processing. It contains fields such as name, email, contact, blood group, date of birth and password.

Fig. 4. Home page

The top navigation bar consists of links to each section of the web page, providing a better user experience, as shown in Figure 4. It also displays a welcome message to the logged-in user.

Sessions are created for every user in order to maintain a specific user's data throughout the web application.
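The idea of a per-user session can be sketched as a small token-keyed store. Flask provides its own session object for this purpose; the sketch below deliberately does not use it and only illustrates the concept in plain Python. The class name and the stored keys are hypothetical.

```python
import secrets
from typing import Any, Dict, Optional


class SessionStore:
    """Toy in-memory session store keyed by a random token (illustration only)."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Dict[str, Any]] = {}

    def create(self, user_email: str) -> str:
        """Start a session after a successful login and return its token."""
        token = secrets.token_urlsafe(32)
        self._sessions[token] = {"user": user_email}
        return token

    def get(self, token: str) -> Optional[Dict[str, Any]]:
        """Look up the data kept for this session, or None if it does not exist."""
        return self._sessions.get(token)

    def destroy(self, token: str) -> None:
        """End the session, for example on logout."""
        self._sessions.pop(token, None)
```

Every page can then read the same dictionary through the token, which is how a user's state stays available across the whole application until logout.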

Fig. 6. Diseases section


Figure 6 shows the diseases for which users can get a check-up using their medical data: kidney disease, liver disease, breast cancer, diabetes, heart disease and Parkinson's disease.

Fig. 9. Disease prediction page

Figure 9 shows a disease prediction page. Each disease has its own prediction page. As soon as the user clicks on the disease that he or she wants to check, the user lands on this page, enters the medical data and clicks the Predict button. The generated output is displayed on the same page.
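The step from submitted form values to a displayed result might be handled roughly as follows. This is a sketch under stated assumptions: the form keys are hypothetical names based on the diabetes parameters in Table I, the result messages are placeholders, and `model` stands for whichever trained classifier is deployed for that disease.

```python
from typing import Callable, List, Mapping, Sequence

# Hypothetical form keys, following the diabetes row of Table I.
DIABETES_FIELDS = ["bp", "glucose", "insulin", "bmi", "pedigree", "age"]


def parse_form(form: Mapping[str, str], fields: Sequence[str]) -> List[float]:
    """Turn submitted text fields into a numeric feature vector in a fixed order."""
    features = []
    for name in fields:
        raw = form.get(name, "").strip()
        if not raw:
            raise ValueError(f"missing value for {name!r}")
        try:
            features.append(float(raw))
        except ValueError:
            raise ValueError(f"{name!r} must be a number, got {raw!r}") from None
    return features


def predict_page(form: Mapping[str, str], fields: Sequence[str],
                 model: Callable[[List[float]], int]) -> str:
    """Run the model on the parsed input and return the text shown on the page."""
    label = model(parse_form(form, fields))
    # Placeholder messages; the wording used by the real application is not given.
    return "Disease likely: please consult a doctor" if label == 1 else "No disease detected"
```

Validating the input before calling the model keeps malformed or missing values from reaching the classifier and lets the page show a clear error instead.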

Fig. 10. Algorithm comparison

Algorithms and their analysis are provided to the user, as shown in Figure 10.
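A comparison on the basis of accuracy can be sketched as follows: compute each algorithm's accuracy on the same labelled test set, then keep the best-scoring ones. The labels and predictions below are made-up toy values for illustration, not results from this study.

```python
from typing import Dict, List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def shortlist(y_true: Sequence[int], predictions: Dict[str, Sequence[int]],
              top_k: int = 2) -> List[str]:
    """Rank algorithms by accuracy and keep the names of the top_k."""
    scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical test labels and per-algorithm predictions (toy values).
Y_TRUE = [1, 0, 1, 1, 0, 1, 0, 0]
PREDICTIONS = {
    "naive_bayes": [0, 0, 1, 0, 0, 1, 1, 1],
    "random_forest": [1, 0, 1, 1, 0, 1, 0, 1],
    "logistic_regression": [1, 0, 1, 0, 0, 1, 0, 1],
}
```

With these toy values, `shortlist(Y_TRUE, PREDICTIONS)` keeps `random_forest` and `logistic_regression`, whose accuracies are 0.875 and 0.75.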

V. CONCLUSION

The overall aim is to define various data mining techniques useful for effective disease prediction. Efficient and precise prediction with a smaller number of attributes and tests is our primary goal in this study. The data were pre-processed and then used in the model. We found the accuracy of the implemented algorithms to be above 70 percent.

Another crucial goal is to predict a disease at its early stage, which greatly affects the patient's health. As we may all be aware, the early stages of any disease are very hard to detect, and early detection opens the door to many treatment options.

We may all have wondered whether using an online algorithm for such sensitive issues is reliable. We can say that this concern is partly true, as it is a machine and we cannot fully rely on its prediction. Still, if it provides even the slightest insight into a health problem that might get worse in future if ignored, then it is likely beneficial for us.


REFERENCES

[1] Daniel S. Chow, Justin Glavis-Bloom, Jennifer E. Soun, Brent Weinberg, Alpesh N. Amin, Peter D. Chang. Development and external validation of a prognostic tool for COVID-19 critical disease. PLOS ONE, 2020; 15(12): e0242953. DOI: 10.1371/journal.pone.0242953.

[2] Ferjani, Marouane. (2020). Disease Prediction Using Machine Learning. 10.13140/RG.2.2.18279.47521.

[3] M. Amrane, S. Oukid, I. Gagaoua and T. Ensarİ, "Breast cancer classification using machine learning," 2018 Electric Electronics, Computer Science, Biomedical Engineerings' Meeting (EBBT), 2018, pp. 1-4, doi: 10.1109/EBBT.2018.8391453.

[4] A. C. Lyngdoh, N. A. Choudhury and S. Moulik, "Diabetes Disease Prediction Using Machine Learning Algorithms," 2020 IEEE-EMBS Conference on Biomedical Engineering and Sciences (IECBES), 2021, pp. 517-521, doi: 10.1109/IECBES48179.2021.9398759.

[5] A. Grover, A. Kalani and S. K. Dubey, "Analytical Approach towards Prediction of Diseases Using Machine Learning Algorithms," 2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence), 2020, pp. 793-797, doi: 10.1109/Confluence47617.2020.9058120.

[6] S. Ganiger and K. M. M. Rajashekharaiah, "Chronic Diseases Diagnosis using Machine Learning," 2018 International Conference on Circuits and Systems in Digital Enterprise Technology (ICCSDET), 2018, pp. 1-6, doi: 10.1109/ICCSDET.2018.8821235.

[7] Goel, Rati, Heart Disease Prediction Using Various Algorithms of Machine Learning (July 12, 2021). Proceedings of the International Conference on Innovative Computing & Communication (ICICC) 2021. Available at SSRN: https://ssrn.com/abstract=3884968 or http://dx.doi.org/10.2139/ssrn.3884968.
