Prediction of Higher Education Student Dropout based on Regularized Regression Models

Authors

  • Bouchra Bouihi, L2IACS Laboratory, ENSET Mohammedia, Morocco
  • Abdelmajid Bousselham, L2IACS Laboratory, ENSET Mohammedia, Morocco, https://orcid.org/0000-0001-5458-2294
  • Essaadia Aoula, L2IACS Laboratory, ENSET Mohammedia, Morocco
  • Fatna Ennibras, L2IACS Laboratory, ENSET Mohammedia 28830, Morocco
  • Adel Deraoui, Casablanca-Settat Regional Center for Education and Training, Morocco
Volume: 14 | Issue: 6 | Pages: 17811-17815 | December 2024 | https://doi.org/10.48084/etasr.8644

Abstract

This study addresses student dropout in higher education institutions. To enable early, targeted interventions and to provide a multifaceted view of student performance, it combines two predictive models: one for dropout classification and one for score prediction. First, a logistic regression model was developed to predict student dropout at an early stage. Then, to enhance dropout prediction, a second-degree polynomial regression model was used to predict student results from the academic variables available in a Moodle course (access, tests, exams, projects, and assignments). Working with a limited dataset is a key challenge because of the high risk of overfitting. To address this issue and balance data size against model complexity, the predictive models were evaluated with L1 (Lasso) and L2 (Ridge) regularization terms. With regularization, the models achieved an accuracy of up to 89% and an R² score of up to 86%.
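The two-model setup described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the Moodle dataset, which is not reproduced here. The feature columns, the score formula, the dropout threshold of 8, and all hyperparameters (C, alpha) are hypothetical choices made for illustration only. They are not the authors' settings, and the numbers this sketch produces are unrelated to the reported 89% and 86%.

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression, Ridge
from sklearn.metrics import accuracy_score, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for five course variables
# (access, tests, exams, projects, assignments), each scaled to [0, 1].
n_students = 200
X = rng.uniform(0.0, 1.0, size=(n_students, 5))

# Hypothetical final score on a 0-20 scale with mild nonlinearity and noise.
score = 20.0 * (
    0.20 * X[:, 0] + 0.25 * X[:, 1] + 0.35 * X[:, 2]
    + 0.10 * X[:, 3] * X[:, 4] + 0.10 * X[:, 2] ** 2
) + rng.normal(0.0, 0.5, size=n_students)

# Hypothetical dropout label: students whose score falls below 8.
dropout = (score < 8.0).astype(int)

X_tr, X_te, s_tr, s_te, d_tr, d_te = train_test_split(
    X, score, dropout, test_size=0.25, random_state=0, stratify=dropout
)

results = {}

# Model 1: logistic regression for dropout, with an L1 or L2 penalty.
for penalty in ("l1", "l2"):
    clf = make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty=penalty, C=1.0, solver="liblinear"),
    )
    clf.fit(X_tr, d_tr)
    results[f"logistic_{penalty}_accuracy"] = accuracy_score(d_te, clf.predict(X_te))

# Model 2: second-degree polynomial regression for the score,
# regularized with Lasso (L1) or Ridge (L2).
regressors = {
    "lasso": Lasso(alpha=0.01, max_iter=10_000),
    "ridge": Ridge(alpha=1.0),
}
for name, reg in regressors.items():
    model = make_pipeline(
        PolynomialFeatures(degree=2, include_bias=False),
        StandardScaler(),
        reg,
    )
    model.fit(X_tr, s_tr)
    results[f"poly2_{name}_r2"] = r2_score(s_te, model.predict(X_te))

for key, value in results.items():
    print(f"{key}: {value:.3f}")
```

On a small dataset, the regularization strength (C for the classifier, alpha for the regressors) would normally be chosen by cross-validation rather than fixed as in this sketch.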

Keywords:

logistic regression, polynomial regression, regularization, dropout prediction, lasso, ridge

How to Cite

Bouihi, B., Bousselham, A., Aoula, E., Ennibras, F. and Deraoui, A. 2024. Prediction of Higher Education Student Dropout based on Regularized Regression Models. Engineering, Technology & Applied Science Research. 14, 6 (Dec. 2024), 17811–17815. DOI: https://doi.org/10.48084/etasr.8644.
