Prediction of Myocardial Infarction Complications using Gradient Boosting
Received: 23 September 2024 | Revised: 18 October 2024 | Accepted: 26 October 2024 | Online: 2 December 2024
Corresponding author: Gamal Saad Mohamed Khamis
Abstract
Cardiovascular diseases (CVDs) are the leading cause of death worldwide and represent a significant public health challenge. Myocardial Infarction (MI), a severe manifestation of CVDs, contributes substantially to these fatalities. Machine learning holds great promise for predicting MI. This study explores the potential of Gradient Boosting (GB) techniques for this purpose, focusing specifically on CatBoost, LightGBM, XGBoost, and XGBoost Random Forest. The study leverages GB's embedded feature selection, missing-value handling, and hyperparameter tuning capabilities. Performance was evaluated using multiple metrics: Area Under the Curve (AUC), classification accuracy, F1 score, precision, recall, and Matthews Correlation Coefficient (MCC). A probabilistic comparison matrix was used to assess the relative performance of the GB models. The results show that CatBoost performed best, achieving a classification accuracy of 94.9%, an AUC of 0.992, a recall of 94.9%, and an MCC of 0.82. The probabilistic comparison further confirms CatBoost's superior performance. These findings contribute to MI prediction by highlighting the predictive potential of the CatBoost algorithm, which may ultimately support better patient outcomes.
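As a rough illustration of the evaluation metrics named in the abstract, the sketch below computes accuracy, precision, recall, F1, MCC, and AUC for a binary task in plain Python. It is not the study's pipeline: the function names `binary_metrics` and `auc`, the 0.5 threshold, and the toy labels and scores are all assumptions made for this example, and the study may have used different averaging or a multi-class setting.

```python
import math


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, and MCC for 0/1 labels (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # MCC uses all four confusion-matrix cells, so it stays informative
    # when the classes are imbalanced.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration, not taken from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics)
print("AUC:", auc_value)
```

On this toy data the accuracy is 0.75 while the MCC is about 0.47, which illustrates why a study on imbalanced clinical data would report MCC alongside accuracy.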
Keywords:
GB, myocardial infarction, prediction, machine learning
License
Copyright (c) 2024 Gamal Saad Mohamed Khamis, Zakariya M. S. Mohammed, Sultan Munadi Alanazi, Ashraf F. A. Mahmoud, Faroug A. Abdalla, Sana Abdelaziz Bkheet
This work is licensed under a Creative Commons Attribution 4.0 International License.