International Research Journal of Engineering and Technology (IRJET) e-ISSN:2395-0056
Volume: 09 Issue: 08 | Aug 2022 www.irjet.net p-ISSN:2395-0072
1 Assistant Professor, Department of Computer Science & Engineering, SIPNA College of Engineering and Technology, Amravati, Maharashtra, India
2,3,4,5,6 Final year, Department of CSE, SIPNA College of Engineering and Technology, Amravati, Maharashtra, India
Abstract - Predicting patient mortality is a difficult and important problem. Several severity grading methods and machine learning approaches to mortality prediction have been developed over the last few decades. The intensive care unit (ICU) treats patients with life-threatening conditions. Treatment success and death rates in the ICU depend on how well human and technological resources are used. A deep learning and machine learning based approach is applied to 3999 patients to generate a mortality prediction model based on their features. The results showed that factors including duration of hospital stay, clinical state, immobilization, drowsiness, neurological disorders, agitation, coma, intubation, mechanical ventilation, usage of vasopressors, glycemic index, sociodemographic traits, and delirium could be used for mortality prediction with 89 percent accuracy. The chance of dying appears to be doubled in hospitals with extended ICU stays. In summary, this study offers an improved chance of predicting whether a patient will live or pass away depending on how long they remain in the hospital. It also serves as an anchor for the analytical techniques used to forecast mortality and hospital stay.
Key Words: ML algorithm, Neural Network, SVM, Logistic Regression, Random Forest Classifier, Decision Tree Classifier, XGBoost Classifier, GaussianNB.
Human life is an important aspect of planet Earth. Many causes account for an individual's illness, which may lead to death. ICU specialists work to cure and treat these illnesses. With the advancement of innovation, a patient's odds of survival have increased. One of the biggest emerging technologies is Artificial Intelligence. Machine learning algorithms can serve as a better option for predicting mortality rate and severity of illness. Doctors and nurses ask many questions about patients and use specialized instruments such as stethoscopes, syringes, portable sensors, and printed reports to gather data about them.

The dataset includes a variety of characteristics, including heart rate, respiration rate, glucose level, and whether a disease or any symptoms are present. For prediction, we considered a training dataset consisting of 3999 rows and 42 columns without an output column. The output column is stored in a separate file which records whether the patient survives or not. Of the 42 feature columns, 4 have the data type int64 and the remaining 38 have the data type float64. From the output file, we recognized that our problem is one of binary classification, that is, a problem whose output is either 0 or 1. In this application, 0 denotes survival and 1 denotes death. The remainder of the paper is organized as follows: Section II describes related work, Section III discusses the methodology, Section IV describes the experimental results, and Section V concludes the paper.
Mortality prediction is a binary classification problem, and various approaches have been taken to tackle it. Some authors have chosen neural networks, while others chose various machine learning classifiers to obtain a better result [2]. A range of machine learning and deep learning algorithms can be brought to bear on the problem. We obtained the maximum accuracy using the ANN model, and therefore opted for the Artificial Neural Network model, which gives us better results than the other neural and machine learning algorithms we tried. The implementation relies on scikit-learn [5] for computing accuracy scores.
To solve this problem, we must determine whether a patient will live or die while receiving treatment at the hospital. This is a binary classification problem. We employed a variety of models to forecast the outcome, including Logistic Regression, Support Vector Classifier, Random Forest Classifier, Decision Tree Classifier, Artificial Neural Network, XGBoost Classifier, and GaussianNB.
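As an illustration of the binary classification setting, the following is a minimal sketch of logistic regression trained by gradient descent in plain Python. It is not the implementation used in this work: the two toy features, the learning rate, the number of epochs, and all function names are assumptions made for the example. The label convention follows the paper (0 for survival, 1 for death).

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(X, y, lr=1.0, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, target in zip(X, y):
            p = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)
            error = p - target  # derivative of the log-loss w.r.t. the logit
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict(X, weights, bias, threshold=0.5):
    """Return 1 when the predicted probability reaches the threshold, else 0."""
    return [
        1 if sigmoid(sum(w * x for w, x in zip(weights, row)) + bias) >= threshold else 0
        for row in X
    ]


# Hypothetical toy data: two features in [0, 1]; label 1 when their sum exceeds 1.
random.seed(0)
X, y = [], []
for _ in range(200):
    a, b = random.random(), random.random()
    X.append([a, b])
    y.append(1 if a + b > 1.0 else 0)

weights, bias = train_logistic_regression(X, y)
predictions = predict(X, weights, bias)
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(f"training accuracy: {accuracy:.3f}")
```

In practice a library estimator such as scikit-learn's `LogisticRegression` would normally be used instead of a hand-written loop.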
For training and testing, we considered a random dataset. The dataset contains 3999 patient entries with 42 features and a binary status label, taking the values 0 and 1, which describes whether the patient survived or not. A snapshot of the dataset used is displayed in Appendix 1.
The first important step before applying the algorithms was to explore the dataset and understand the relationship between its features and the output values. This in turn helps to drop the features that have minimal or no influence on the labels, i.e., the output. The dataset we considered has no missing values. The labels and the training data were provided separately, so to visualize the data it is necessary to concatenate the two files into a single combined form.
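The combination of the feature file and the label file can be sketched as follows. This is only an illustration under stated assumptions: the inline CSV contents, the `label` column name, and the choice to join the two files on `RecordID` are hypothetical stand-ins, since the actual file layout is not given here.

```python
import csv
import io

# Hypothetical miniature stand-ins for the separate feature and label files.
FEATURES_CSV = """RecordID,Age,HR,Glucose
1001,54,88.0,130.5
1002,76,102.0,98.0
1003,61,75.5,160.2
"""

LABELS_CSV = """RecordID,label
1001,0
1002,1
1003,0
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on_record_id(feature_rows, label_rows):
    """Attach each record's label to its feature row (assumed join key: RecordID)."""
    labels = {row["RecordID"]: row["label"] for row in label_rows}
    merged = []
    for row in feature_rows:
        combined = dict(row)
        combined["label"] = labels[row["RecordID"]]
        merged.append(combined)
    return merged


def count_missing(rows):
    """Count empty cells, to confirm that the data has no missing values."""
    return sum(1 for row in rows for value in row.values() if value in ("", None))


merged = merge_on_record_id(read_rows(FEATURES_CSV), read_rows(LABELS_CSV))
missing = count_missing(merged)
print(len(merged), "rows,", missing, "missing values")
```

With a data-frame library, the same step is usually a single concatenation or merge call.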
The most important features of the dataset are ['ALP', 'ALT', 'AST', 'Age', 'Albumin', 'BUN', 'Bilirubin', 'Creatinine', 'DiasABP', 'FiO2', 'GCS', 'Glucose', 'HCO3', 'HR', 'K', 'Lactate', 'MAP', 'MechVent', 'Mg', 'NIDiasABP', 'NIMAP', 'NISysABP', 'Na', 'PaCO2', 'PaO2', 'Platelets', 'RecordID', 'RespRate', 'SaO2', 'SysABP', 'Temp', 'TroponinI', 'TroponinT', 'Urine', 'WBC', 'Weight', 'pH']. To find the correlations, we used a heat map. This helped us to figure out that the features ['Gender', 'Cholesterol', 'HCT', 'ICUType', 'Height'] have minimal correlation with the output, so we can consider dropping them. Fig 3 shows the correlation of features.
As our dataset contains 42 features, it is essential to figure out the correlation between the different features and remove those with minimal or no correlation. Besides reducing the number of features, this also helps to attain higher accuracy.
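The correlation-based screening described above can be sketched as follows. The sketch computes the Pearson correlation of each feature with the label and lists the features whose absolute correlation falls below a cut-off. The toy values and the 0.3 threshold are invented for illustration and are not taken from the dataset; only the feature names are borrowed from the lists above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def low_correlation_features(columns, labels, threshold):
    """Names of features whose |correlation| with the label is below threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, labels)) < threshold
    ]


# Made-up toy values for three of the feature names mentioned in the text.
columns = {
    "GCS": [15, 14, 6, 13, 4, 7, 15, 5],
    "Lactate": [1.0, 1.2, 4.5, 1.1, 5.0, 3.9, 0.9, 4.2],
    "Height": [170, 160, 165, 180, 175, 162, 168, 171],
}
labels = [0, 0, 1, 0, 1, 1, 0, 1]  # 0 = survived, 1 = died

to_drop = low_correlation_features(columns, labels, threshold=0.3)
print("candidates to drop:", to_drop)
```

A heat map conveys the same coefficients visually; the list form makes the drop decision explicit.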
To transform all features into a particular range, we opted for the Min-Max scaler from scikit-learn. It scales and translates each feature individually such that it lies in the given range on the training set [5]. This helps the model to work more efficiently under biased conditions. A snapshot of the processed dataset is displayed in Fig 3.
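Min-max scaling maps each value x of a feature to (x - min) / (max - min), where min and max are taken from the training set. The following plain-Python sketch shows the idea; the sample rows and function names are hypothetical, and scikit-learn's `MinMaxScaler` performs the equivalent transformation with its default range of 0 to 1.

```python
def fit_min_max(train_rows):
    """Learn the per-feature minimum and maximum from the training rows."""
    columns = list(zip(*train_rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def transform_min_max(rows, mins, maxs):
    """Scale each feature to [0, 1] using the training minimum and maximum."""
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0  # guard constant columns
            for value, lo, hi in zip(row, mins, maxs)
        ]
        for row in rows
    ]


# Hypothetical rows with two features each (for example, Age and HR).
train = [[54.0, 88.0], [76.0, 102.0], [61.0, 75.0]]
test = [[65.0, 95.0]]

mins, maxs = fit_min_max(train)
train_scaled = transform_min_max(train, mins, maxs)
test_scaled = transform_min_max(test, mins, maxs)
print(train_scaled)
print(test_scaled)
```

The statistics are learned on the training rows only and then reused for the test rows, so that no information from the test set leaks into preprocessing.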
In an effort to meet our needs, we tried different models. We tried 7 models, six of which proved more effective than the rest. To maximize the result, we tuned these 6 models. The table below lists the models we used along with their respective accuracy.

| S.No. | Machine Learning Model | Score |
|---|---|---|
| 1 | Logistic Regression Model | 0.8550 |
| 2 | Support Vector Classifier | 0.8687 |
| 3 | Random Forest Classifier | 0.8501 |
| 4 | Decision Tree Classifier | 0.8287 |
| 5 | Artificial Neural Network | 0.8937 |
| 6 | XGBoost Classifier | 0.8575 |
| 7 | GaussianNB | 0.5400 |

First, we tried the Logistic Regression model, which provided an accuracy of 0.8550. The Support Vector Classifier reached about 0.8687, the Random Forest Classifier about 0.8501, and the Decision Tree Classifier around 0.8287. With the XGBoost Classifier we were able to raise the accuracy to 0.8575. We got the best result, the maximum accuracy of 0.8937, with the ANN. The processing of the ANN algorithm is shown in Fig. 4.

We tried 7 different models. Of these, SVM gives an accuracy of 0.8687, the XGBoost Classifier shows an accuracy of 0.8575, and the Artificial Neural Network contributes the maximum accuracy of 0.8937. From these statistics, we can conclude that the ANN is the best-trained model for the prediction of ICU mortality. A comparative analysis of the accuracies associated with the different models is graphically represented in Figs. 6 and 7.
Fig-5 Diagram of Artificial Neural Network
Fig-6 ANN model Accuracy Graph
Fig-7 ANN model Loss Graph
A confusion matrix is an N × N matrix used for evaluating the performance of a classification model, where N is the number of target classes [6]. It provides a comparative analysis of the actual values and the values predicted by the machine learning algorithm, giving a view of how our binary classification model is performing and what kinds of errors it is making.
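For the binary case, N = 2 and the matrix has four cells. The sketch below builds it from two label lists in plain Python; the short label vectors are invented for illustration. Rows index the actual class and columns the predicted class, which is the same layout that scikit-learn's `confusion_matrix` uses.

```python
def confusion_matrix(y_true, y_pred, n_classes=2):
    """Return an n_classes x n_classes matrix: rows = actual, columns = predicted."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


# Hypothetical labels: 0 = survived, 1 = died.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = confusion_matrix(y_true, y_pred)
(tn, fp), (fn, tp) = matrix  # unpack the four cells of the 2 x 2 case
print(matrix)
```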
We tried 7 different models. Of these, the Random Forest Classifier gives an accuracy of 0.86875, the XGBoost Classifier shows an accuracy of 0.8725, and the Artificial Neural Network contributes the maximum accuracy of 0.8765. The F-score for the ANN is 0.92, with a precision of 0.85 and a high recall of 1. From these statistics, we can conclude that the ANN is the best-trained model for the prediction of ICU mortality. Table 3 gives the number of patients predicted to die in the ICU by the models with the highest accuracies.
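The stated F-score can be checked from the stated precision and recall, assuming it is the standard F1 score, i.e., the harmonic mean of the two:

$$
F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} = \frac{2 \times 0.85 \times 1}{0.85 + 1} = \frac{1.70}{1.85} \approx 0.92
$$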
● True Positive (TP) = 560 positive class data points were correctly classified by the model
● True Negative (TN) = 330 negative class data points were correctly classified by the model
● False Positive (FP) = 60 negative class data points were incorrectly classified as belonging to the positive class by the model
● False Negative (FN) = 50 positive class data points were incorrectly classified as belonging to the negative class by the model.
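The usual summary metrics follow directly from these four counts. The sketch below applies the standard definitions to the counts listed above; the function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from the four confusion-matrix counts."""
    total = tp + tn + fp + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / denominator if denominator else 0.0,
    }


# Counts taken from the list above.
metrics = classification_metrics(tp=560, tn=330, fp=60, fn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For these counts the accuracy is 0.89, the precision about 0.90, the recall about 0.92, and the F1 score about 0.91.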
Performance is typically estimated on the basis of synthetic one-dimensional indicators such as precision, recall or F-score [7]. For mortality rate prediction, which belongs to the medical domain, it is more important to raise an alarm for actual positive cases than to avoid flagging false cases. For this purpose, the recall metric is more useful than the others. Table 2 specifies the F1-score, recall and support values for the selected binary classification algorithm, the ANN.
In this paper, we have worked on a random ICU-related dataset in order to predict mortality rates. We preprocessed the available dataset and trained 7 different models on it (Logistic Regression, Support Vector Classifier, Random Forest Classifier, Decision Tree Classifier, XGBoost Classifier, Gaussian NB, and an Artificial Neural Network). Among all the listed models, the best accuracy, 0.8937, was provided by the ANN, and the lowest accuracy, near 0.5400, by Gaussian NB. Based on our analysis, the ANN serves as the best option for the prediction of mortality rate in the ICU.
9. REFERENCES: -
https://www.sciencedirect.com/science/article/abs/pii/S1386505617303581
https://www.hindawi.com/journals/ccrp/2020/1483827/
https://www.stoodnt.com/blog/ann-neuralnetworks-deep-learning-machine-learningartificial-intelligencedifferences/#:~:text=ANN%20is%20a%20group%20of,a%20subfield%20of%20artificial%20intelligence.n
5) https://scikit-learn.org/stable/
6) https://www.analyticsvidhya.com/blog/2020/04/confusion-matrix-machine-learning
7) https://www.researchgate.net/publication/226675412_A_Probabilistic_Interpretation_of_Precision_Recall_and_FScore_with_Implication_for_Evaluation