Leveraging a Random Forest Classifier and SVMSMOTE for an Early-stage Dengue Prediction
Received: 1 March 2025 | Revised: 28 March 2025 | Accepted: 7 April 2025 | Online: 4 June 2025
Corresponding author: Yulianti Paula Bria
Abstract
Dengue is a significant global health issue, and in its severe form, it can be fatal. Accurate early-stage prediction tools are crucial in resource-limited areas to prevent dengue's progression to a severe state. This study aims to develop machine learning classifiers that assist medical personnel in differentiating dengue from other diseases, thereby supporting earlier prognosis. Early-stage dengue classifiers were developed using medical records collected from two hospitals in East Nusa Tenggara Province, Indonesia. Eight machine learning techniques were used to develop the classifiers: Logistic Regression (LR), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), K-nearest Neighbors (KNN), Naïve Bayes (NB), Classification and Regression Tree (CART), Random Forest (RF), and extreme Gradient Boosting (XGBoost). To address the imbalance in the dataset, the SVM Synthetic Minority Oversampling Technique (SVMSMOTE) was utilized. The dataset was finalized through the expertise of 15 medical doctors and insights gathered from four Indonesian digital health platforms. The key findings of this study are: i) important features for early-stage dengue prediction include fever, duration of fever, headache, arthralgia, myalgia, nausea, shivering, loss of appetite, bitter mouth, temperature, and age, ii) machine learning techniques, including RF, NB, KNN, and XGBoost, were found to be suitable for dengue prediction, and iii) RF combined with SVMSMOTE outperformed the other techniques, achieving an accuracy of 94.99% and an F1-score of 85.65% for early-stage dengue prediction.
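The abstract reports both accuracy and F1-score, which is common practice when the classes are imbalanced. The following minimal sketch illustrates why the two can differ. The confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the study, and the function name `accuracy_and_f1` is likewise an assumption.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for the positive class from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical test set: 1000 patients, 150 of them with dengue (positive class).
acc, f1 = accuracy_and_f1(tp=120, fp=20, fn=30, tn=830)

# Hypothetical baseline that labels every patient as "not dengue".
base_acc, base_f1 = accuracy_and_f1(tp=0, fp=0, fn=150, tn=850)

print(f"classifier: accuracy={acc:.4f}, F1={f1:.4f}")
print(f"baseline:   accuracy={base_acc:.4f}, F1={base_f1:.4f}")
```

In this illustration the trivial baseline still reaches a high accuracy because the negative class dominates, while its F1-score for the dengue class is zero. This is one reason F1 is often reported alongside accuracy for imbalanced medical datasets.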
Keywords:
dengue fever, dengue classifier, dengue prediction, random forest
License
Copyright (c) 2025 Yulianti Paula Bria, Paskalis Andrianus Nani, Yovinia Carmeneja Hoar Siki, Natalia Magdalena Rafu Mamulak, Emiliana Metan Meolbatak, Robertus Dole Guntur

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.