Ensemble Learning for Diabetes Prediction: An Integration of TabNet and Neural Oblivious Decision Ensembles (NODE)

Authors

  • Majid Rahardi, Department of Informatics, Faculty of Computer Science, Universitas Amikom Yogyakarta, Sleman, Indonesia
  • Ferian Fauzi Abdulloh, Department of Informatics, Faculty of Computer Science, Universitas Amikom Yogyakarta, Sleman, Indonesia
  • Ahlihi Masruro, Department of Informatics Engineering, Faculty of Computer Science, Universitas Amikom Yogyakarta, Sleman, Indonesia
  • Bima Pramudya Asaddulloh, Department of Informatics Engineering, Faculty of Computer Science, Universitas Amikom Yogyakarta, Sleman, Indonesia
  • Afrig Aminuddin, Department of Information Systems, Faculty of Computer Science, Universitas Amikom Yogyakarta, Sleman, Indonesia
  • Nafiatun Sholihah, Department of Informatics, Faculty of Computer Science, Universitas Amikom Yogyakarta, Sleman, Indonesia
Volume: 15 | Issue: 6 | Pages: 30426-30431 | December 2025 | https://doi.org/10.48084/etasr.15192

Abstract

Accurate prediction of diabetes risk is important for advancing healthcare and personalized medicine. This study presents a comparative analysis of two novel deep learning architectures for structured (tabular) data: Neural Oblivious Decision Ensembles (NODE) and TabNet. The method includes comprehensive data preprocessing, with oversampling applied to address the class imbalance in the dataset. A soft-voting ensemble then combines the predicted probabilities from the individually trained models. This ensemble achieved a validation accuracy of 93.55%, a precision of 92.60%, a recall of 94.58%, and an F1-score of 93.58%. These findings indicate that advanced deep learning techniques, especially when combined in an ensemble, can provide reliable and accurate diabetes risk prediction from complex tabular data.
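
As a rough illustration of the soft-voting step described in the abstract, the sketch below averages the positive-class probabilities of two models and thresholds the result. The equal weights, the 0.5 threshold, the function names, and the example probabilities are all assumptions made for illustration; they are not taken from the paper. The second helper only checks that the reported F1-score is consistent with the reported precision (92.60) and recall (94.58).

```python
import numpy as np

def soft_vote(prob_a, prob_b, weights=(0.5, 0.5), threshold=0.5):
    """Weighted average of two models' positive-class probabilities,
    followed by thresholding. Weights and threshold are assumed values."""
    probs = np.vstack([prob_a, prob_b])
    avg = np.average(probs, axis=0, weights=weights)
    return avg, (avg >= threshold).astype(int)

def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical probabilities for four patients (not data from the study).
p_tabnet = np.array([0.90, 0.40, 0.20, 0.65])
p_node = np.array([0.70, 0.70, 0.10, 0.30])

avg, labels = soft_vote(p_tabnet, p_node)
print(labels.tolist())                     # [1, 1, 0, 0]
print(round(f1_from(92.60, 94.58), 2))     # 93.58
```

In this toy example the second and fourth patients show why probabilities are averaged rather than hard labels counted: the two models disagree, and the ensemble's decision follows whichever model is more confident.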

Keywords:

ensemble learning, deep learning, diabetes prediction, TabNet, NODE

How to Cite

[1]
M. Rahardi, F. F. Abdulloh, A. Masruro, B. P. Asaddulloh, A. Aminuddin, and N. Sholihah, “Ensemble Learning for Diabetes Prediction: An Integration of TabNet and Neural Oblivious Decision Ensembles (NODE)”, Eng. Technol. Appl. Sci. Res., vol. 15, no. 6, pp. 30426–30431, Dec. 2025.