International Research Journal of Engineering and Technology (IRJET)
e-ISSN: 2395-0056
Volume: 12 Issue: 11 | Nov 2025
p-ISSN: 2395-0072
www.irjet.net
Brain Stroke Prediction Using Advanced Machine Learning
Amruta Biradar1, Mayuri Kamble2, Kirti Karajangi3, Snehal Kale4, Ms. M. T. Naik5
*1,2,3,4,5 Department of Computer Science & Engineering, Yadrav (Ichalkaranji), Maharashtra, India.
*6 Assistant Professor, Department of Computer Science & Engineering, Yadrav (Ichalkaranji), Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Stroke is one of the main causes of death and long-term disability around the world. Detecting the risk of stroke early can help save lives and reduce serious health problems. In this study, we developed a machine learning-based model that predicts who is at risk of having a stroke. We used advanced algorithms such as Random Forest, XGBoost, Support Vector Machine (SVM), and Gradient Boosting. To improve prediction accuracy, we cleaned and prepared the data using techniques such as normalization and feature selection. Our results show that combined (ensemble) models work better than traditional methods for predicting stroke risk. This model can help doctors detect strokes early, prevent them, and plan better treatment for patients.
Key Words: Brain Stroke, Machine Learning, Prediction, Ensemble Models, Healthcare, Data Analytics
1. INTRODUCTION
Stroke is a serious health issue and one of the leading causes of death worldwide. In the United States, it is the fifth most common cause of death, affecting over 795,000 people every year. In India, it ranks fourth. Detecting stroke risk early can help prevent severe consequences, and Machine Learning can help predict who is at risk. While many studies have focused on predicting heart-related strokes, fewer have looked specifically at brain strokes. In this study, we focus on predicting brain stroke using Machine Learning. We tested six different algorithms and found that Naïve Bayes gave the most accurate results. A limitation of our model is that it is based on tabular health data (such as age and blood pressure) rather than real-time brain images.
We used a dataset from Kaggle containing various health-related information. The data was cleaned and prepared through preprocessing steps such as filling missing values, converting text into numbers, and encoding categories. The data was then split into training and testing sets, and models were built using different algorithms. Their accuracy was compared to find the best-performing model. Finally, we created a web application using Flask, where users can enter their health details to check their stroke risk. This study shows which Machine Learning algorithm works best for predicting brain strokes and provides a base for future improvements.
1.1 PROBLEM STATEMENT & OBJECTIVES
Although numerous predictive models exist, their performance often suffers from imbalanced datasets, lack of interpretability, and limited generalization across diverse populations. Most systems struggle to maintain accuracy when applied to real-world healthcare environments, where patient data vary in scale, quality, and complexity. Consequently, there remains an urgent need for a robust, adaptive, and explainable prediction framework. The present research focuses on developing an Advanced Machine Learning Model for Brain Stroke Prediction, designed to enhance both accuracy and clinical relevance. The specific objectives of this study are:
To analyze and identify the most influential medical and lifestyle risk factors associated with brain stroke.
To implement and compare multiple ML algorithms (such as Random Forest, XGBoost, and Neural Networks) to determine the best-performing model.
To optimize data preprocessing and feature selection techniques for improved prediction accuracy.
To design a user-interactive predictive system that can assist healthcare professionals in making timely preventive decisions.
1.2 LITERATURE REVIEW
Many studies have shown that Machine Learning can be very useful in predicting strokes. The research listed below forms the basis for our study:
[1] Kumar et al. proposed a stroke prediction model using Random Forest, achieving 92% accuracy.
[2] Patel et al. (2022) compared SVM and Decision Tree models for medical diagnosis.
[3] Li et al. (2021) demonstrated feature selection using the Chi-Square test for improved model performance.
[4] Ahmed et al. (2023) utilized ensemble learning methods such as XGBoost for early disease prediction.
[5] Singh et al. (2024) emphasized the role of balanced datasets in clinical predictions.
[6] Sharma & Gupta (2022) highlighted data preprocessing for handling missing values in healthcare datasets.
[7] Zhao et al. (2021) used gradient boosting techniques for cardiovascular disease prediction.
[8] Chatterjee et al. (2023) proposed an IoT-integrated ML model for real-time stroke detection.
[9] Khalid et al. (2024) reviewed performance evaluation metrics for medical ML systems.
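The preprocessing steps named in the Introduction (filling missing values, converting text categories into numbers, and splitting the data into training and testing sets) can be illustrated with a minimal, dependency-free Python sketch. This is not the authors' code: the records, the field names (age, gender, bmi, stroke), the median fill strategy, the 25% test share, and the helper functions are all assumptions chosen for illustration. The split is stratified by the stroke label because the problem statement points to imbalanced datasets as a known weakness; the paper does not state whether the authors stratified.

```python
import random
from statistics import median

# Hypothetical records for illustration only; not taken from the paper's Kaggle dataset.
FIELDS = ("age", "gender", "bmi", "stroke")
raw = [
    (67, "Male", 36.6, 1), (61, "Female", None, 1), (80, "Male", 32.5, 1),
    (49, "Female", 34.4, 1), (34, "Female", 22.1, 0), (28, "Male", 24.0, 0),
    (45, "Female", None, 0), (52, "Male", 27.3, 0), (39, "Female", 21.8, 0),
    (23, "Male", 25.5, 0), (58, "Female", 29.9, 0), (31, "Male", 23.4, 0),
]
records = [dict(zip(FIELDS, row)) for row in raw]


def fill_missing(rows, field):
    """Replace None in a numeric field with the median of the observed values."""
    observed = [r[field] for r in rows if r[field] is not None]
    fill = median(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def label_encode(rows, field):
    """Map each distinct category to an integer code (sorted for determinism)."""
    codes = {cat: i for i, cat in enumerate(sorted({r[field] for r in rows}))}
    return [{**r, field: codes[r[field]]} for r in rows], codes


def stratified_split(rows, label, test_fraction=0.25, seed=0):
    """Split rows so each class keeps roughly the same share in train and test."""
    rng = random.Random(seed)
    by_class = {}
    for r in rows:
        by_class.setdefault(r[label], []).append(r)
    train, test = [], []
    for group in by_class.values():
        rng.shuffle(group)
        # Keep at least one example of every class in the test set.
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


clean = fill_missing(records, "bmi")
clean, gender_codes = label_encode(clean, "gender")
train_rows, test_rows = stratified_split(clean, "stroke", test_fraction=0.25, seed=42)
print(len(train_rows), len(test_rows))
```

In practice, libraries such as pandas and scikit-learn provide equivalent operations; the plain-Python version is used here only to keep each step visible.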
© 2025, IRJET | Impact Factor value: 8.315 | ISO 9001:2008 Certified Journal | Page 228