International Research Journal of Engineering and Technology (IRJET)

e-ISSN: 2395-0056

Volume: 11 Issue: 11 | Nov 2024

p-ISSN: 2395-0072

www.irjet.net

“Optimizing Student Performance Prediction with Supervised Learning”

Manjula Kalmath
Asst. Prof., JSS Shri Manjunatheshwara MCA Institute, Vidyagiri, Dharwad 04
Affiliated with Karnataka University Dharwad
City: Dharwad, State: Karnataka, Country: India

Abstract - Accurately predicting student performance is crucial for effective educational interventions and resource allocation. This study compares prominent supervised machine learning algorithms for forecasting student achievement, addressing the lack of comprehensive evaluations in existing research. Using a dataset of 60 students from JSS SMI UG and PG Studies, Dharwad, India, the research incorporates student demographic information, academic performance indicators, engagement metrics, and final performance results to identify factors influencing academic success. The study evaluates several models, including logistic regression, decision trees, random forest, support vector machine (SVM), k-nearest neighbors (KNN), and Naive Bayes. Five-fold cross-validation is used because it assesses model performance more reliably than a single train_test_split, giving a better estimate of each model's generalizability and making overfitting easier to detect. Results indicate that the decision tree classifier achieves a mean accuracy of approximately 0.967, with most folds demonstrating perfect accuracy, although it may be susceptible to overfitting. The random forest classifier exhibits a mean accuracy of around 0.883, showing slightly more variability but greater robustness. KNN attains a mean accuracy of approximately 0.7667, while Gaussian Naive Bayes (NB) demonstrates consistent accuracy with a mean of around 0.817. Logistic regression displays lower and less consistent accuracy at approximately 0.583. The support vector classifier (SVC) exhibits the lowest accuracy at 0.45, suggesting that it might not be well suited to the dataset. This study provides valuable insights into selecting and applying appropriate machine learning techniques for predicting student performance, potentially benefiting educators and researchers in the field.

Key Words: Supervised machine learning algorithms, Academic performance, 5-fold cross-validation, Confusion matrix, Accuracy, Prediction, Classification report.

1. INTRODUCTION

In education, predicting student academic performance is essential for identifying at-risk students and enhancing educational outcomes. This study investigates the relationship between various factors, such as internal assessments, seminar engagement, assignment completion, attendance, and extracurricular activities, and their influence on students' final academic results. By employing historical data and machine learning techniques, this research aims to develop a predictive model that accurately estimates final student grades based on these internal academic indicators. The ability to precisely forecast student performance enables educators and institutions to implement timely interventions and tailor learning strategies to individual needs.

The primary objective of this research is to assess the efficacy of various supervised learning algorithms in predicting student performance. A real-world dataset will be utilized, incorporating features such as gender, assignment scores, seminar participation, internal marks, attendance, and overall average, with the student's grade as the target variable. While numerous supervised learning methods offer potential solutions, the comparative advantages and disadvantages of the different approaches in this field are not well understood, and selecting the most appropriate algorithm for a specific educational context remains challenging. This study addresses this gap by conducting a comprehensive comparative analysis of several algorithms, including logistic regression, decision trees, random forest, SVM, KNN, and Naive Bayes. To evaluate the performance of these algorithms, 5-fold cross-validation will be utilized, giving a more robust assessment and making overfitting easier to detect. The algorithms will be compared using performance metrics such as accuracy, precision, recall, and F1 score, which will be computed from the confusion matrix (confusion_matrix).

Additionally, the study will evaluate the models by predicting grades for new, unseen data, visualizing the predictions with output plots such as the confusion_matrix and a heatmap. These visualizations will aid in interpreting the effectiveness of each algorithm in predicting student performance. By elucidating the strengths and limitations of different machine learning models in educational settings, this research aims to
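The evaluation protocol named in this paper's abstract (5-fold cross-validation, with accuracy, precision, recall, and F1 score derived from a confusion matrix) can be illustrated with a minimal sketch. It is not the study's implementation: it uses only the Python standard library, the data are synthetic, a simple nearest-centroid rule stands in for the classifiers compared in the paper, all function names are hypothetical, and macro averaging of the per-class metrics is an assumption.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k test folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def build_confusion_matrix(y_true, y_pred, labels):
    """Count outcomes; rows are true labels, columns are predicted labels."""
    position = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(y_true, y_pred):
        matrix[position[truth]][position[guess]] += 1
    return matrix


def metrics_from_confusion(matrix):
    """Accuracy plus macro-averaged precision, recall, and F1 (averaging is assumed)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    per_class = []
    for i in range(n):
        tp = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        actual = sum(matrix[i])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class.append((precision, recall, f1))
    return {
        "accuracy": sum(matrix[i][i] for i in range(n)) / total,
        "precision": sum(p for p, _, _ in per_class) / n,
        "recall": sum(r for _, r, _ in per_class) / n,
        "f1": sum(f for _, _, f in per_class) / n,
    }


def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier: predict the class whose training mean is closest."""
    sums, counts = Counter(), Counter()
    for value, label in zip(train_x, train_y):
        sums[label] += value
        counts[label] += 1
    centroids = {label: sums[label] / counts[label] for label in counts}
    return [min(centroids, key=lambda label: abs(value - centroids[label]))
            for value in test_x]


def cross_validate(x, y, labels, k=5, seed=0):
    """Hold out each fold once, train on the rest, and score the held-out fold."""
    fold_metrics = []
    for test_idx in k_fold_indices(len(x), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        predictions = nearest_centroid_predict(
            [x[i] for i in train_idx], [y[i] for i in train_idx],
            [x[i] for i in test_idx])
        matrix = build_confusion_matrix([y[i] for i in test_idx], predictions, labels)
        fold_metrics.append(metrics_from_confusion(matrix))
    return fold_metrics


# Synthetic stand-in data (NOT the study's dataset): 60 mark values, grades by cut-off.
marks = [40 + i for i in range(60)]
grades = ["A" if m >= 80 else "B" if m >= 60 else "C" for m in marks]

results = cross_validate(marks, grades, labels=["A", "B", "C"])
mean_accuracy = sum(r["accuracy"] for r in results) / len(results)
print(f"mean accuracy over {len(results)} folds: {mean_accuracy:.3f}")
```

With 60 samples and k = 5, each held-out fold contains 12 students, so a single misclassification changes that fold's accuracy by 1/12 (about 0.083). Per-fold scores on a dataset of this size are therefore coarse, which is worth keeping in mind when comparing mean accuracies across models.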

Impact Factor value: 8.315

ISO 9001:2008 Certified Journal

Page 402