
International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 11 Issue: 02 | Feb 2024

p-ISSN: 2395-0072

www.irjet.net

Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model

Basil Jacob¹

¹Basil Jacob, School of Computer Science and Engineering, Vellore Institute of Technology, Vellore – 632014, Tamil Nadu, India

---------------------------------------------------------------------***---------------------------------------------------------------------

Abstract – Cardiovascular disease (CVD) stands as a formidable global health challenge, exerting a significant impact on worldwide morbidity and mortality rates. Traditional monitoring methods often prove inadequate in capturing the intricate and dynamic nature of cardiovascular health, impeding healthcare professionals' ability to discern subtle patterns preceding cardiac events. This study seeks to revolutionize heart disease monitoring by leveraging well-established machine learning algorithms, including Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Naive Bayes, Decision Tree, Gradient Boosting, and Random Forest. Diverging from conventional single-model approaches, our work embraces ensemble learning, combining the strengths of standalone models (SVM, KNN, Naive Bayes, Decision Tree) with ensemble models such as Gradient Boosting and Random Forest. Through the amalgamation of these models, we introduce a novel ensemble learning approach, named the SKNDGR model, which achieves an accuracy of 99.19%, surpassing the performance of all other models. This outcome is attributed to the model's capacity to establish diverse, robust, and non-linear decision boundaries while requiring minimal hyperparameter tuning and remaining computationally efficient. The research holds the promise of swift heart disease detection, facilitating timely treatment interventions, and showcases advantages such as high performance, high accuracy, flexibility, and an elevated success rate. The proposed model provides a high degree of precision and recall, enabling researchers to obtain highly accurate results when diagnosing patients suffering from heart disease.

Key Words: Cardiovascular disease, Machine Learning Algorithms, Ensemble learning, SKNDGR, Hyperparameter Tuning, Computational efficiency

1. INTRODUCTION

In the relentless fight against cardiovascular diseases (CVD), where the lives of over half a billion individuals hang in the balance worldwide, the urgency is glaringly apparent. The World Heart Federation's 2023 report paints a grim picture, revealing 20.5 million CVD-related deaths in 2021, nearly one-third of all global fatalities and an alarming surge from the estimated 12.1 million CVD deaths in 1990. This ominous trend, fueled by unhealthy lifestyles, heightened stress levels, and environmental factors, emphasizes the critical necessity for prompt and accurate diagnostic solutions to enable timely interventions. Recognizing this imperative, the quest for a diagnostic system rooted in machine learning classifiers becomes paramount.

In the intricate tapestry of healthcare, precision becomes the linchpin between life and death. Here, data preprocessing and standardization emerge as silent architects, akin to a skilled detective organizing clues to solve a complex case. Healthcare professionals depend on the seamless harmony of standardized data, unlocking crucial insights for informed decisions in the pursuit of optimal patient care. Some commonly used standardization techniques, including StandardScaler (SS) and Min-Max Scaler, are effective in rescaling numerical features. However, they do not handle missing feature values directly, so missing values must first be addressed through imputation or other methods before these standardization techniques are applied. Achieving peak performance in machine learning models demands the strategic use of balanced datasets during both the training and testing phases. Furthermore, refining predictive capabilities hinges on the incorporation of relevant features, underscoring the importance of optimizing model performance through data balancing and meticulous feature selection. In this high-stakes landscape, the development of advanced diagnostic tools powered by machine learning classifiers stands as a beacon of hope, promising to revolutionize the battle against CVD and reshape the future of global healthcare.

In standard practice, doctors often prescribe multiple tests, which delays disease notification. This underscores the need for an approach that enables swift and timely predictions. Machine learning, a subset of artificial intelligence, operates on the premise that systems can learn from data, identify patterns, and make decisions with minimal human intervention. For heart disease prediction, researchers normally use existing machine learning algorithms such as SVM, KNN, Decision Tree, Random Forest, Naive Bayes, Logistic Regression, and so on. However, it is quite difficult to obtain predictions with a high degree of accuracy when these models function alone. Our research work aims to utilize the combined power of these models and generate a new model that returns results with a high degree of precision and accuracy.

Impact Factor value: 8.226

ISO 9001:2008 Certified Journal

Page 1
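The introduction makes two technical points that a short example can make concrete: missing values have to be imputed before rescaling, and the six named classifiers are combined into a single ensemble. The following is a minimal sketch only, assuming scikit-learn is available. The text so far does not state how SKNDGR combines its members, so soft voting through `VotingClassifier` is an assumption made purely for illustration, as are the synthetic data, the mean-imputation strategy, and all hyperparameters (library defaults). It is not the authors' implementation, and its accuracy has no bearing on the reported 99.19%.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a heart-disease table (NOT the paper's dataset).
X, y = make_classification(n_samples=400, n_features=13, n_informative=6,
                           random_state=0)

# Blank out roughly 5% of the entries to mimic missing feature values.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Assumed combination rule: soft voting over the six named classifiers.
ensemble = VotingClassifier(
    estimators=[
        ("svm", SVC(probability=True, random_state=0)),
        ("knn", KNeighborsClassifier()),
        ("nb", GaussianNB()),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("rf", RandomForestClassifier(random_state=0)),
    ],
    voting="soft",
)

# Impute first, then scale, then classify. Fitting the pipeline on the
# training split only keeps test-set statistics out of the imputer and scaler.
model = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler(), ensemble)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Wrapping the imputer and scaler in the same pipeline as the ensemble is a design choice rather than something the text prescribes: it guarantees that the preprocessing steps are always applied in the same order and with training-set statistics whenever the model is used for prediction.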