Adaptive Risk-Stratified Stacking for Ten-Year Cardiovascular Disease Prediction with SHAP Interpretability

Kanda Sorn-In; Wirapong Chansanam; Pathamakorn Netayawijit

doi:10.48084/etasr.16262

Authors

Kanda Sorn-In Department of Technology and Engineering, Faculty of Interdisciplinary Studies, Khon Kaen University, Nong Khai Campus, Nong Khai, Thailand https://orcid.org/0009-0003-3595-0858
Wirapong Chansanam Department of Information Science, Faculty of Humanities and Social Sciences, Khon Kaen University, Khon Kaen, Thailand https://orcid.org/0000-0001-5546-8485
Pathamakorn Netayawijit Department of Information Systems, Faculty of Business Administration and Information Technology, Rajamangala University of Technology Isan, Khon Kaen Campus, Khon Kaen, Thailand https://orcid.org/0009-0001-1424-5725

Volume: 16 | Issue: 1 | Pages: 32137-32147 | February 2026 | https://doi.org/10.48084/etasr.16262

Received: 13 November 2025 | Revised: 6 December 2025 and 21 December 2025 | Accepted: 24 December 2025 | Online: 9 February 2026

Corresponding author: Pathamakorn Netayawijit

Abstract

Cardiovascular Disease (CVD) remains the leading cause of death worldwide, accounting for over 17.9 million deaths annually. Traditional risk assessment tools such as the Framingham Risk Score and Atherosclerotic Cardiovascular Disease (ASCVD) calculator are constrained by linear assumptions and limited variables, often failing to capture complex interactions among clinical and behavioral factors. To overcome these limitations, this study proposes an Adaptive Risk-Stratified Stacking (ARSS) framework that integrates ensemble learning, Explainable Artificial Intelligence (XAI), and Bayesian uncertainty quantification for ten-year CVD prediction. Using data from the Framingham Heart Study (FHS) (n = 4,240; 16 features), the framework combines Random Forest, Extreme Gradient Boosting (XGBoost), and Logistic Regression as base learners, with a Logistic Regression meta-classifier trained using five-fold stratified cross-validation. The adaptive stratification mechanism enables subgroup-specific learning across low-, intermediate-, and high-risk cohorts, enhancing personalization and sensitivity. The ARSS model achieved 89.6% accuracy, an F1-score of 0.89, and an area under the receiver operating characteristic curve (ROC–AUC) of 0.918 (95% Confidence Interval (CI): 0.907–0.929), significantly outperforming baseline models (p < 0.01, Cohen's d ≥ 0.71). Calibration analysis indicated strong reliability (Brier Score = 0.076), whereas Shapley Additive Explanations (SHAP)-based interpretability revealed clinically consistent feature interactions such as Age × Systolic Blood Pressure and Diabetes × Glucose, reinforcing the model's physiological plausibility. Bayesian uncertainty estimation further enhanced confidence in predictive reliability and transparency. 
Overall, the proposed ARSS framework demonstrates that interpretable, risk-stratified ensemble learning can bridge predictive accuracy with clinical trustworthiness, establishing a unified and ethical paradigm for XAI in precision cardiovascular prevention.
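
The calibration figure reported above (Brier Score = 0.076) can be read with the following minimal sketch. It assumes the standard binary Brier score, that is, the mean squared difference between the predicted probability and the observed 0/1 outcome. The function name `brier_score` and the toy data are hypothetical illustrations, not the authors' code or the FHS data.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    Lower is better: 0.0 is a perfect forecast, and a constant prediction
    of 0.5 scores 0.25 regardless of the outcomes.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy data (assumption): four patients, 1 = CVD event within ten years.
outcomes = [0, 0, 1, 1]
predicted = [0.1, 0.2, 0.8, 0.9]

print(round(brier_score(outcomes, predicted), 3))           # 0.025
print(brier_score(outcomes, [0.5] * len(outcomes)))         # 0.25
```

Under this definition, a score of 0.076 lies well below the 0.25 obtained by an uninformative constant prediction of 0.5, although the score also depends on how common the outcome is in the evaluated cohort.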

Keywords:

cardiovascular disease prediction, adaptive ensemble learning, explainable artificial intelligence, interpretable machine learning, Bayesian uncertainty quantification
