Temporal Validation of Machine Learning Models Utilized for Mental Health Prediction in College Students: A Three-Year Longitudinal Study
Received: 22 January 2026 | Revised: 21 February 2026 and 25 March 2026 | Accepted: 27 March 2026 | Online: 26 May 2026
Corresponding author: Alua Myrzakerimova
Abstract
This study investigates the longitudinal robustness of mental health prediction models using three years of data from the Healthy Minds Study, comprising 265,870 college students surveyed between 2022 and 2025. Multiple statistical techniques were applied to assess data drift, while three Machine Learning (ML) algorithms, namely Logistic Regression (LR), Random Forest (RF), and XGBoost, were evaluated under several temporal modeling strategies. Mental health risk was defined as the presence of moderate-to-severe depression or anxiety symptoms. Although statistically significant distributional changes were observed across variables, effect sizes remained small, indicating limited practical drift. Model performance remained strong over time (mean F1-score = 0.71, mean AUROC = 0.80), with minimal temporal degradation (1.8%). Well-being emerged as the most influential predictor, accounting for the dominant share of feature importance. These findings suggest that mental health prediction models can be reliably deployed when temporal stability is present and highlight the crucial role of well-being in mental health risk prediction.
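The contrast between statistically significant change and small effect size can be illustrated with a minimal sketch. The abstract does not name the statistical techniques used, so the three measures below (Cohen's d, a large-sample z-test on the means, and the Population Stability Index) are assumptions chosen for illustration. The data are synthetic, not from the Healthy Minds Study, and the function names, sample sizes, and bin count are hypothetical.

```python
import math
import random
import statistics


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.fmean(b) - statistics.fmean(a)) / pooled_sd


def two_sided_z_p_value(a, b):
    """Large-sample two-sided p-value for a difference in means."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.fmean(b) - statistics.fmean(a)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index over equal-width bins spanning both samples."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / n_bins or 1.0  # guard against a zero-width range

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    p, q = proportions(expected), proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Synthetic stand-in for one standardized feature in two survey years
# (assumed shift of 0.05 standard deviations; not the study's data).
rng = random.Random(0)
n = 50_000
year_1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
year_2 = [rng.gauss(0.05, 1.0) for _ in range(n)]

d = cohens_d(year_1, year_2)
p_value = two_sided_z_p_value(year_1, year_2)
psi_value = psi(year_1, year_2)

print(f"p-value   = {p_value:.2e}")
print(f"Cohen's d = {d:.3f}")
print(f"PSI       = {psi_value:.4f}")
```

With tens of thousands of respondents per year, even a shift this small yields a very small p-value, while Cohen's d stays well below the conventional 0.2 threshold for a small effect and the PSI stays below the commonly used 0.1 rule of thumb. This is the pattern the abstract describes as limited practical drift.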
Keywords:
machine learning, mental health, temporal validation, data drift, college students, depression, anxiety, well-being
License
Copyright (c) 2026 Tolkyn Tuleutayeva, Alua Myrzakerimova, M. O. Nurmaganbetova, N. Nalgozhina

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
