Predicting Student Final Grades Using Random Forest Algorithms and Linear Regression
Abstract
The increasing adoption of intelligent systems in higher education has encouraged data-driven approaches to predicting students’ academic performance. Accurate prediction models support early intervention and informed academic decision-making. This study compares the Random Forest and Linear Regression algorithms for predicting students’ final academic scores. The dataset consists of assessment components, including quiz scores, assignment scores, and midterm examination (UTS, Ujian Tengah Semester) scores, which are used as predictor variables. The data were split into training and testing sets at a ratio of 80:20. Model performance was evaluated using accuracy, error metrics, and feature importance analysis. The experimental results indicate that Random Forest outperforms Linear Regression in predictive accuracy and robustness. In addition, both models identify the midterm examination score as the most influential factor in students’ final performance. These findings suggest that ensemble-based learning methods are better suited to academic performance prediction and can serve as a reliable foundation for intelligent academic support systems in higher education.
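The workflow described in the abstract (an 80:20 split, two regressors, error metrics, and feature importance) can be sketched as follows. This is a minimal illustration and not the study's actual code. The data are synthetic, and the grade weighting, the feature names, the Random Forest hyperparameters, and the choice of MAE, RMSE, and R² as metrics are all assumptions made for the example. Because the synthetic target is linear in the predictors, the sketch is not expected to reproduce the paper's reported ranking of the two models.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical column names for the three predictors named in the abstract.
FEATURES = ["quiz", "assignment", "midterm"]


def make_synthetic_scores(n_students=500, seed=0):
    """Generate synthetic scores; the 0.2/0.2/0.6 weighting is an assumption."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(40, 100, size=(n_students, len(FEATURES)))
    noise = rng.normal(0, 3, size=n_students)
    y = 0.2 * X[:, 0] + 0.2 * X[:, 1] + 0.6 * X[:, 2] + noise
    return X, np.clip(y, 0, 100)


def compare_models(X, y, seed=0):
    """Fit both models on an 80:20 split and report test-set metrics."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    models = {
        "linear_regression": LinearRegression(),
        # n_estimators=200 is an illustrative setting, not taken from the paper.
        "random_forest": RandomForestRegressor(n_estimators=200, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[name] = {
            "mae": float(mean_absolute_error(y_test, pred)),
            "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
            "r2": float(r2_score(y_test, pred)),
            "n_test": int(len(y_test)),
        }
    # Impurity-based importances for the forest, raw coefficients for the linear model.
    importances = dict(zip(FEATURES, models["random_forest"].feature_importances_))
    coefficients = dict(zip(FEATURES, models["linear_regression"].coef_))
    return results, importances, coefficients


if __name__ == "__main__":
    X, y = make_synthetic_scores()
    results, importances, coefficients = compare_models(X, y)
    for name, metrics in results.items():
        print(name, metrics)
    print("random forest importances:", importances)
    print("linear regression coefficients:", coefficients)
```

With real data, the synthetic generator would be replaced by the actual quiz, assignment, and midterm columns. The linear coefficients are comparable across features here only because all three predictors share the same scale; otherwise they would need to be standardized first.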