A Decision Tree-Based Cloud Replication Model for Enhanced Data Management

Authors

  • Aws I. Abueid, Faculty of Computing Studies, Arab Open University, Kuwait

Volume: 15 | Issue: 6 | Pages: 29580-29589 | December 2025 | https://doi.org/10.48084/etasr.11170

Abstract

This paper introduces the Decision Tree Cloud Replication (DTCR) model, a novel Artificial Intelligence (AI)-based approach for managing data replication in cloud environments. The model is designed to enhance data availability, optimize performance, and reduce resource costs by intelligently deciding when to add or remove replicas. Unlike traditional static replication strategies, DTCR adapts dynamically to system conditions, response time targets, and tenant budget constraints. The methodology involves using supervised learning via Decision Tree (DT) algorithms trained on synthetic datasets generated using the Markov Chain Monte Carlo (MCMC) method to simulate realistic replication scenarios. Hyperparameter tuning is performed through grid search to determine optimal settings, such as maximum tree depth and minimum samples per split, whereas cross-validation ensures a reliable evaluation process. The implementation is carried out using the Weka platform, and model performance is assessed using multiple metrics, including accuracy, precision, recall, F-measure, and the Matthews Correlation Coefficient (MCC). The results indicate that the DTCR model achieves a classification accuracy of 100%, with balanced performance across both target classes, demonstrating its effectiveness in real-time decision-making for replica placement and removal. Further discussion shows that the model generalizes well to unseen data, avoids overfitting through optimal depth control, and maintains high availability with minimal overhead. This confirms the model's potential for integration into cloud scheduling policies, offering practical benefits in fault tolerance, latency reduction, and cost efficiency. The contribution of this work lies in presenting an intelligent, scalable, and cost-aware solution for replication management, which outperforms conventional approaches and addresses critical challenges in cloud-based data systems.
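The evaluation metrics named in the abstract (accuracy, precision, recall, F-measure, and MCC) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard formulas in plain Python. It is not the paper's Weka workflow, and the class labels "add" and "remove", the function names, and the sample predictions are hypothetical, chosen only for illustration.

```python
import math

def confusion_counts(y_true, y_pred, positive):
    """Count (tp, fp, tn, fn), treating `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics; undefined ratios fall back to 0.0."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Matthews Correlation Coefficient: ranges from -1 to 1, with 0 meaning chance level.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
    }

# Hypothetical replica decisions: "add" a replica or "remove" one.
y_true = ["add", "add", "remove", "remove", "add", "remove"]
y_pred = ["add", "remove", "remove", "remove", "add", "add"]

counts = confusion_counts(y_true, y_pred, positive="add")
metrics = binary_metrics(*counts)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MCC is useful alongside accuracy because it uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced. A classifier with no errors, such as the 100% accuracy reported here, yields an MCC of 1.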

Keywords:

Artificial Intelligence (AI), cloud environment, data management, data replication, DTCR model

How to Cite

[1]
A. I. Abueid, “A Decision Tree-Based Cloud Replication Model for Enhanced Data Management”, Eng. Technol. Appl. Sci. Res., vol. 15, no. 6, pp. 29580–29589, Dec. 2025.
