Sepsis Prediction Using Machine Learning

International Research Journal of Engineering and Technology (IRJET) e-ISSN:2395-0056

Volume: 09 Issue: 12 | Dec 2022 www.irjet.net p-ISSN:2395-0072

1,2,3 B.Tech Scholars, Department of Computer Science and Engineering, SNIST, Hyderabad-501301, India
4 Assistant Professor, Department of Computer Science and Engineering, SNIST, Hyderabad-501301, India

Abstract:- Sepsis is a blood-poisoning condition that arises when the body exhibits a dysregulated host response to an infection, resulting in organ failure or tissue damage, and it increases the mortality risk of ICU patients. The expense of treating sepsis in hospitals is rising yearly as it develops into a serious health issue. Different techniques have been developed to monitor sepsis electronically; however, to reduce the risk of death, it is crucial to forecast sepsis as early as possible, before clinical reports or conventional techniques can confirm it. An MLP Classifier has been applied for the early diagnosis of sepsis, particularly in ICU patients. The primary characteristics influencing the classifier's predictions have been outlined, making the model easier for medical professionals to understand. This study demonstrates how machine learning algorithms, employing six vital signs taken from the records of patients over the age of 18, can reliably predict sepsis at the time of a patient's admission to the intensive care unit. Predicting sepsis early can help doctors administer supportive care and reduce mortality and medical costs. The evaluation measures obtained are encouraging and can be very helpful in predicting sepsis accurately and promptly.

I. INTRODUCTION

Sepsis is a life-threatening condition that occurs when the body's response to an infection leads to inflammation and tissue damage throughout the body. It is a leading cause of death in hospitals and can progress rapidly if not properly diagnosed and treated. Early detection of sepsis is crucial for improving patient outcomes, and machine learning techniques have the potential to significantly improve the accuracy and speed of sepsis diagnosis. One approach to sepsis detection using machine learning is to use patient data, such as vital signs and laboratory results, to train a model to predict the likelihood of sepsis.

This data can be collected from electronic health records or other sources and may include demographic information, previous medical history, and current symptoms. The model can then be used to identify patients who are at high risk for sepsis, allowing healthcare providers to initiate early treatment and potentially prevent the progression of the condition.

A different strategy is to use machine learning to analyse patterns in patient data and identify early warning signs of sepsis. This can be done by analysing trends in vital signs over time or by looking for changes in biomarkers that are indicative of sepsis. By identifying these early warning signs, healthcare providers can intervene before the condition becomes severe and potentially save lives. Overall, the use of machine learning in sepsis detection has the potential to significantly improve patient outcomes by enabling early diagnosis and treatment. However, it is crucial to pay close attention to the ethical implications of using machine learning in healthcare and to ensure that the technology is used responsibly and in a way that benefits patients.

The development of artificial intelligence technologies has made it possible to diagnose sepsis early on. These techniques have been created to study and anticipate the health of the human body and to acquire accurate prescription information that supports doctors in making rapid and effective decisions. They integrate electronic medical records, medical imaging, pathophysiology, and other data.

II. LITERATURE SURVEY

In several medical specialties, AI-based diagnostic systems have proven to be efficient. For example, Beck et al. developed the C-Path (Computational Pathologist) system to automatically diagnose breast cancer and determine the likelihood that patients will survive by looking at breast tissue imaging. In the domain of sepsis diagnosis, prognosis, and therapy, the machine learning algorithms used include supervised learning and reinforcement learning.

The two primary difficulties in the current study involve the use of different physiological indicators and modelling efficient machine learning algorithms for the

© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page1245
1 Mohammad Ateeq, 2 Vineet K Joshi, 3 D Naga Praneeth, 4 Gugulothu Ravi

diagnosis, prognosis, and treatment of sepsis. For predicting sepsis in advance, it is also important to choose appropriate variables and to design algorithms that are valuable in the clinical setting. The model's input variables are physiological indicators, and the model's output is whether the patient will develop sepsis several hours later. In particular, the input data include vital signs such as heart rate, oxygen saturation, and body temperature; biomarkers such as procalcitonin and interleukin-6; laboratory values such as bicarbonate and creatinine; and demographic factors such as gender and age. Most of these categories contain a large number of missing values, as in MIMIC III (Medical Information Mart for Intensive Care), which has been utilised in several studies. Most studies omit variables with a large number of missing values from the predictors, resulting in the loss of useful information. To fill in missing information, some studies use imputation and mean-filling methods, although this might introduce selection bias or confounding. The data preparation approach must be chosen in light of the features of each dataset.
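The mean-filling idea mentioned above can be illustrated with a minimal sketch. The function name `mean_fill` and the toy records are hypothetical and only meant to show the mechanics; they do not reproduce the preprocessing of any particular study.

```python
from statistics import mean

def mean_fill(rows, columns):
    """Replace None entries in each column with that column's observed mean."""
    filled = [dict(row) for row in rows]  # copy so the input is not mutated
    for col in columns:
        observed = [row[col] for row in rows if row[col] is not None]
        if not observed:
            continue  # nothing to average; leave the column untouched
        col_mean = mean(observed)
        for row in filled:
            if row[col] is None:
                row[col] = col_mean
    return filled

# Hypothetical toy records; the values are illustrative, not from the study.
records = [
    {"HR": 80.0, "Temp": 37.0},
    {"HR": None, "Temp": 38.0},
    {"HR": 100.0, "Temp": None},
]
filled_records = mean_fill(records, ["HR", "Temp"])
print(filled_records)
```

Because every gap receives the same column mean, the filled values carry no patient-specific information, which is one way the bias noted above can arise.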

The machine learning algorithms generally used include support vector machines, gradient boosting trees, random forests, logistic regression, and neural networks. Among them, the MLP Classifier has shown good performance. A model with improved prediction capabilities will be examined further to obtain better results for clinical service, so that physicians can make better early decisions when diagnosing sepsis.

These studies have performed well in the area of sepsis prediction. The quantity of data utilised in these studies is, however, reduced because the majority of the missing values are handled by direct deletion or forward filling, and the models' explanatory power is therefore constrained. The following points explain in detail why it is difficult to implement these techniques in clinical settings. A comprehensive dataset is lacking. Researchers make use of information from various patient groups, such as the MIMIC public database or other unbiased sources of hospital data. They choose different clinical factors to create their models, and the size of the data also varies considerably. Different clinical criteria for sepsis and different assessment indicators are used as the premises and measures of the prediction settings.
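The two missing-value strategies named above, direct deletion and forward filling, can be sketched as follows. The helper names and the toy series are hypothetical; the sketch only shows why deletion shrinks the data while forward filling keeps every time step.

```python
def forward_fill(series):
    """Carry the last observed value forward; leading gaps stay None."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled

def drop_incomplete(rows):
    """Direct deletion: keep only rows without any missing value."""
    return [row for row in rows if all(v is not None for v in row)]

# Hypothetical hourly heart-rate readings for one patient.
heart_rate = [None, 80, None, None, 85, None]
print(forward_fill(heart_rate))

# Hypothetical (HR, Temp) rows; two of the three contain a gap.
rows = [(80, 37.0), (None, 37.2), (85, None)]
print(drop_incomplete(rows))
```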

III. METHODOLOGY

1. DATA SET:- The dataset contains data from 36 thousand patients. Each patient is represented by 41 features.

Fig 1. Data Set

2. FEATURE SELECTION:- In the mean processing method, 41 variables were determined to participate in the training model, including (a) vital sign indicators (HR, O2Sat, Temp, SBP, MAP, DBP, Resp), (b) laboratory variables (HCO3, pH, PaCO2, AST, BUN, AlkalinePhos, Chloride, Creatinine, Lactate, Magnesium, Potassium, Bilirubin_total, PTT, WBC, Fibrinogen, Platelets), and (c) demographic indicators (Age, Gender).

Fig 2. Vital Signs (a)

Fig 3. Laboratory Variables (b)

Variables with a missing proportion greater than 98% were eliminated. The demographic metrics HospAdmTime (the interval between hospital admission and ICU admission) and ICULOS (the length of stay in the ICU) have also been removed. HospAdmTime takes different numerical levels depending on the health of each patient, which may be connected to sepsis's extended incubation period. Since the primary goal of this study is to develop guidelines for predicting early sepsis from changes in certain physiological data, these variables are omitted. According to the recorded statistics, patients with sepsis have a significant fatality rate. They frequently require prolonged ICU care, and their ICULOS value is typically very high. In contrast, patients without sepsis are typically treated in the ICU for a brief period before being discharged once their health has improved, resulting in a low ICULOS value.

The variable ICULOS is eliminated because the variation in the ICULOS value is caused by the different nature of each illness, which runs against the causal sequence of early sepsis being anticipated from physiological data.
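The elimination rules described above can be sketched as a simple filter. This is an illustration under stated assumptions, not the authors' code: the function names, the `RareLab` column, and the toy rows are hypothetical, while the 98% threshold and the excluded variables HospAdmTime and ICULOS come from the text.

```python
def missing_fraction(rows, col):
    """Fraction of rows in which `col` is missing (None or absent)."""
    return sum(1 for row in rows if row.get(col) is None) / len(rows)

def select_features(rows, columns, max_missing=0.98,
                    excluded=("HospAdmTime", "ICULOS")):
    """Keep columns that are not excluded and not missing in more than max_missing of rows."""
    return [
        col for col in columns
        if col not in excluded and missing_fraction(rows, col) <= max_missing
    ]

# Hypothetical data: 100 rows in which 'RareLab' is observed only once (99% missing).
rows = [{"HR": 80.0, "RareLab": None, "ICULOS": i + 1} for i in range(100)]
rows[0]["RareLab"] = 1.2

kept = select_features(rows, ["HR", "RareLab", "ICULOS"])
print(kept)
```

Keeping the exclusion list as an explicit argument separates the two reasons for removal: sparsity is measured from the data, whereas HospAdmTime and ICULOS are dropped by design.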

Fig 4. Demographic Indicators (c)

3. METHOD TO PREDICT SEPSIS:- MLPs are neural network models that work as universal approximators, i.e., they can approximate any continuous function. MLPs are composed of neurons called perceptrons. A perceptron receives n features as input (x = x1, x2, …, xn), and each of these features is associated with a weight. Input features must be numeric, so non-numeric input features have to be converted to numeric ones in order to use a perceptron.

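The features are combined into the weighted sum w1x1 + w2x2 + ⋯ + wnxn. The result of this computation is then passed on to an activation function f, which produces the output of the perceptron. In the original perceptron, the activation function is a step function applied to the weighted sum minus a threshold θ.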

Thus, we can see that the perceptron determines whether w1x1 + w2x2 + ⋯ + wnxn − θ > 0 is true or false. The equation w1x1 + w2x2 + ⋯ + wnxn − θ = 0 is the equation of a hyperplane.
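The perceptron decision rule can be written as a short sketch. The output encoding (1 when the inequality holds, 0 otherwise) is an assumption, since the step-function formula itself is not reproduced in the text; the weights and threshold below are hypothetical values chosen so that the unit behaves like a logical AND.

```python
def perceptron_output(x, w, theta):
    """Step-activated perceptron: 1 if w1*x1 + ... + wn*xn - theta > 0, else 0."""
    z = sum(wi * xi for wi, xi in zip(w, x)) - theta
    return 1 if z > 0 else 0

# Hypothetical weights and threshold (not taken from the study).
w = [1.0, 1.0]
theta = 1.5

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [perceptron_output(x, w, theta) for x in inputs]
print(outputs)
```

Here the hyperplane x1 + x2 − 1.5 = 0 is a line in the plane, and only the point (1, 1) lies on its positive side.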

3.1 IMPROVEMENT OF PREPROCESSING

WARNING PERIODS:- In the mean processing approach discussed above, the 6-hour warning period for each patient is integrated directly to produce a single observation; however, the model's prediction performance may not be sufficient. Further research is being done to determine whether finer or denser segmentation into time windows yields higher performance. To calculate the mean vectors, the 6-hour warning period is split into 2- or 3-hour time windows, while the mean processing procedure for the safe period and the illness period is left unchanged. Figure 5 displays further information. New training datasets are created in the same manner, and the resulting models, based on the MLP Classifier, are compared with the original models in terms of their generalisation capabilities.
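The windowing step can be illustrated with a minimal sketch, assuming one feature vector per hour. The function name `window_means` and the toy readings are hypothetical; a window of 6 hours reproduces the single-observation case, while 2- or 3-hour windows give the finer segmentation described above.

```python
def window_means(hourly, window_hours):
    """Split hourly feature vectors into consecutive windows and average each one."""
    if len(hourly) % window_hours != 0:
        raise ValueError("period length must be a multiple of the window size")
    means = []
    for start in range(0, len(hourly), window_hours):
        window = hourly[start:start + window_hours]
        n_features = len(window[0])
        means.append(
            [sum(vec[j] for vec in window) / len(window) for j in range(n_features)]
        )
    return means

# Hypothetical 6-hour warning period with two features per hour (HR, Temp).
warning_period = [
    [80.0, 37.0], [82.0, 37.0], [84.0, 37.0],
    [90.0, 38.0], [92.0, 38.0], [94.0, 38.0],
]

single = window_means(warning_period, 6)      # one observation, as in the original method
three_hour = window_means(warning_period, 3)  # two mean vectors
two_hour = window_means(warning_period, 2)    # three mean vectors
print(single, three_hour, two_hour)
```

In this toy example the single 6-hour mean hides the rise in heart rate and temperature that the 3-hour windows make visible, which is the motivation for testing finer windows.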

Fig 6. Feature Importance

IV. UML DIAGRAM

A sequence diagram (UML) is a visual representation of the flow of messages between objects during an interaction. A sequence diagram is made up of a group of objects connected by lifelines and the messages they exchange over the course of the interaction.

Fig 5. MLP Mean Calculator

FEATURE IMPORTANCE:- For the feature importance score, we take the MLP Classifier algorithm in the mean processing method as an example; the top 10 variables by feature importance score are Temp, O2Sat, Resp, HR, Age, SBP, MAP, PTT, PaCO2, and Potassium, as shown in Figure 6. This means that these variables play an important role in predicting the risk of sepsis.
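The text does not state how the importance scores were computed for the MLP Classifier. One common model-agnostic option, shown here purely as an assumption, is permutation importance: shuffle one feature at a time and measure how much the accuracy drops. The stand-in model `toy_predict` and the data are hypothetical.

```python
import random

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)

    def accuracy(rows):
        return sum(1 for row, label in zip(rows, y) if predict(row) == label) / len(y)

    baseline = accuracy(X)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(shuffled))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical stand-in model: flags a patient when feature 0 (say, Temp) exceeds 38.
def toy_predict(row):
    return 1 if row[0] > 38.0 else 0

X = [[36.5, 70.0], [39.0, 72.0], [37.0, 95.0], [39.5, 90.0], [36.8, 88.0], [38.6, 65.0]]
y = [toy_predict(row) for row in X]

scores = permutation_importance(toy_predict, X, y)
print(scores)
```

A feature the model ignores, such as the second column here, always scores zero, whereas a feature the model relies on can only lose accuracy when shuffled.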

Fig 7. Sequence Diagram

V. RESULTS

1. MODEL PERFORMANCE:- In the mean processing method (method 1), the MLP Classifier algorithms differ in performance. The MLP algorithm has a log-loss of 0.15, with better distinction performance between the 0 and 1 categories.

The log-loss indicates how close the predicted probability is to the corresponding actual (true) value (0 or 1 in the case of binary classification). The further the predicted probability deviates from the actual value, the higher the log-loss value.

The accuracy score is calculated by dividing the number of correct predictions by the total number of predictions.
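Both measures can be computed by hand, as in the sketch below. It uses the standard binary log-loss, the negative mean of y·log(p) + (1 − y)·log(1 − p); the 0.5 decision threshold, the clipping constant, and the toy labels and probabilities are assumptions made for illustration.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Binary log-loss; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

def accuracy(y_true, p_pred, threshold=0.5):
    """Share of predictions whose thresholded class matches the true label."""
    correct = sum(1 for y, p in zip(y_true, p_pred) if (p >= threshold) == (y == 1))
    return correct / len(y_true)

# Hypothetical labels and predicted sepsis probabilities.
y_true = [1, 0, 1, 0]
p_pred = [0.9, 0.1, 0.6, 0.7]

print(accuracy(y_true, p_pred))
print(log_loss(y_true, p_pred))
```

Accuracy only counts whether the thresholded class is right, while the log-loss also penalises the fourth prediction for being confidently wrong, which is why the two measures are reported together.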

2. MLP ALGORITHM:- This model optimizes the log-loss function using LBFGS or stochastic gradient descent.

class sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(100,), activation='relu', *, solver='adam', alpha=0.0001, batch_size='auto', learning_rate='constant', learning_rate_init=0.001, power_t=0.5, max_iter=200, shuffle=True, random_state=None, tol=0.0001, verbose=False, warm_start=False, momentum=0.9, nesterovs_momentum=True, early_stopping=False, validation_fraction=0.1, beta_1=0.9, beta_2=0.999, epsilon=1e-08, n_iter_no_change=10, max_fun=15000)

Training of variables in the algorithm for classification.
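A minimal usage sketch of this classifier is given below. It trains on synthetic data from `make_classification`, because the study's patient records are not reproduced here; the sample size, the number of features, `max_iter=500`, and `random_state=0` are assumptions chosen for a quick, repeatable run, while the remaining arguments follow the defaults in the signature above.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, log_loss
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the patient feature matrix and the sepsis labels.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# One hidden layer of 100 ReLU units trained with the 'adam' solver.
clf = MLPClassifier(
    hidden_layer_sizes=(100,),
    activation="relu",
    solver="adam",
    max_iter=500,
    random_state=0,
)
clf.fit(X_train, y_train)

# Class probabilities feed the log-loss; hard predictions feed the accuracy.
proba = clf.predict_proba(X_test)
test_log_loss = log_loss(y_test, proba)
test_accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"log-loss: {test_log_loss:.3f}, accuracy: {test_accuracy:.3f}")
```

The printed values describe the synthetic data only and say nothing about the results reported in this paper.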

VI. CONCLUSION

Fig 8. Accuracy

Fig 9. Log loss

It is possible to use machine learning techniques for the detection of sepsis, a serious and potentially life-threatening condition that can arise as a complication of infection. Sepsis is a complex and dynamic process that can be difficult to diagnose, and early identification and treatment are critical for improving patient outcomes. Several machine learning techniques have been explored for the detection of sepsis, including supervised learning methods such as decision tree algorithms and support vector machines, as well as unsupervised learning methods such as clustering and anomaly detection.

One potential advantage of using machine learning for sepsis detection is the ability to analyse and interpret large amounts of patient data, including electronic health records, laboratory results, and vital signs, in order to identify patterns and correlations that may be indicative of sepsis.

Overall, the use of machine learning for sepsis detection has the potential to improve the accuracy and timeliness of sepsis diagnosis, which can help to improve patient outcomes and reduce healthcare costs. However, more research is needed to fully understand the effectiveness and limitations of these approaches and to optimize their performance in real-world settings.

VII. REFERENCES

[1] K. E. Rudd, S. C. Johnson, K. M. Agesa et al., “Global, regional, and national sepsis incidence and mortality, 1990–2017: analysis for the global burden of disease study,” The Lancet, vol. 395, no. 10219, pp. 200–211, 2020.

[2] L. Su, Z. Xu, F. Chang et al., “Early prediction of mortality, severity, and length of stay in the intensive care unit of sepsis patients based on sepsis 3.0 by machine learning models,” Frontiers in Medicine, vol. 8, 883 pages, 2021.

[3] K. C. Yuan, L. W. Tsai, K. H. Lee et al., “The development an artificial intelligence algorithm for early sepsis diagnosis in the intensive care unit,” International Journal of Medical Informatics, vol. 141, Article ID 104176, 2020.

[4] J. E. García-Gallo, N. J. Fonseca-Ruiz, L. A. Celi, and J. F. Duitama-Muñoz, “A machine learning-based model for 1 year mortality prediction in patients admitted to an intensive care unit with a diagnosis of sepsis,” Medicina Intensiva, vol. 44, no. 3, pp. 160–170, 2020.

[5] J. Kim, H. Chang, D. Kim, D. H. Jang, I. Park, and K. Kim, “Machine learning for prediction of septic shock at initial triage in emergency department,” Journal of Critical Care, vol. 55, pp. 163–170, 2020.

[6] A. H. Beck, A. R. Sangoi, S. Leung et al., “Systematic analysis of breast cancer morphology uncovers stromal features associated with survival,” Science Translational Medicine, vol. 3, no. 108, 108ra113 pages, 2011.

[7] D. J. Stekhoven and P. Bühlmann, “MissForest non-parametric missing value imputation for mixed-type data,” Bioinformatics, vol. 28, no. 1, pp. 112–118, 2012.

[8] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2014.

[9] J. C. Gower, “A general coefficient of similarity and some of its properties,” Biometrics, vol. 27, no. 4, pp. 857–871, 1971.
