Utilizing Machine Learning and Explainable AI for Assessing Income Poverty Risk: Evidence from EU-SILC 2023 in Slovakia

Silvia Komara; Marian Cvirik; Martina Kosikova; Michal Páleš

doi:10.48084/etasr.16958

Authors

Silvia Komara, Department of Statistics, Faculty of Economic Informatics, Bratislava University of Economics and Business, Slovakia
Marian Cvirik, Research Institute of Trade and Sustainable Business, Faculty of Commerce, Bratislava University of Economics and Business, Slovakia
Martina Kosikova, Department of Statistics, Faculty of Economic Informatics, Bratislava University of Economics and Business, Slovakia
Michal Páleš, Department of Mathematics and Actuarial Science, Faculty of Economic Informatics, Bratislava University of Economics and Business, Slovakia

Volume: 16 | Issue: 2 | Pages: 34219-34225 | April 2026 | https://doi.org/10.48084/etasr.16958

Received: 15 December 2025 | Revised: 14 January 2026 | Accepted: 19 January 2026 | Online: 4 April 2026

Corresponding author: Silvia Komara

Abstract

Machine Learning (ML) methods, driven by advances in computational power, have become indispensable tools in contemporary economic research. Unlike traditional statistical models, which primarily emphasize inference and hypothesis testing, ML techniques prioritize predictive performance and the identification of complex nonlinear relationships among variables. However, many high-performing ML algorithms, particularly ensemble and deep learning models, operate as "black boxes", making the contribution of individual predictors difficult to interpret. This lack of transparency is a concern in policy-oriented analyses, where interpretability is a key requirement. Integrating Explainable Artificial Intelligence (XAI) techniques has therefore become crucial for bridging the gap between predictive accuracy and meaningful understanding. In this study, we assess and predict income poverty risk in Slovakia using microdata from the 2023 European Union Statistics on Income and Living Conditions (EU-SILC). To this end, we compare the performance of Logistic Regression (LR), Random Forest (RF), and Extreme Gradient Boosting (XGBoost), and employ Shapley Additive Explanations (SHAP) values to make the models transparent. This framework enables the development of models that maintain strong predictive performance while providing clear, policy-relevant insights into the underlying drivers of At-Risk-of-Poverty (AROP) status.
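
The idea behind Shapley-based explanations can be illustrated with a minimal sketch. It is not the authors' pipeline and uses no EU-SILC data. The toy logistic model, its coefficients, the feature names (`unemployed`, `low_education`, `single_parent`), and the all-zero baseline profile are hypothetical and chosen only for illustration. The sketch computes exact Shapley values by enumerating every feature coalition and replacing "absent" features with their baseline values. Practical SHAP implementations use faster, model-specific algorithms rather than full enumeration.

```python
import math
from itertools import combinations

# Hypothetical coefficients of a toy logistic model (NOT estimated from EU-SILC).
INTERCEPT = -2.0
WEIGHTS = {"unemployed": 1.8, "low_education": 0.9, "single_parent": 1.2}

# Assumed reference profile: a feature "absent" from a coalition takes this value.
BASELINE = {"unemployed": 0, "low_education": 0, "single_parent": 0}


def predict(x):
    """Return the toy model's predicted probability of being at risk of poverty."""
    z = INTERCEPT + sum(WEIGHTS[name] * x[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def value(coalition, x):
    """Model output when only the features in `coalition` take their observed values."""
    mixed = {
        name: (x[name] if name in coalition else BASELINE[name])
        for name in WEIGHTS
    }
    return predict(mixed)


def shapley_values(x):
    """Exact Shapley values via coalition enumeration (feasible only for few features)."""
    names = list(WEIGHTS)
    n = len(names)
    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = (
                math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            )
            for subset in combinations(others, size):
                with_feature = value(set(subset) | {name}, x)
                without_feature = value(set(subset), x)
                total += weight * (with_feature - without_feature)
        phi[name] = total
    return phi


# Hypothetical individual: unemployed, low education, not a single parent.
person = {"unemployed": 1, "low_education": 1, "single_parent": 0}
phi = shapley_values(person)
base = predict(BASELINE)

for name, contribution in phi.items():
    print(f"{name}: {contribution:+.4f}")
print(f"baseline + contributions = {base + sum(phi.values()):.4f}")
print(f"model prediction         = {predict(person):.4f}")
```

The last two printed values agree because Shapley values are additive: the contributions sum to the difference between the individual's prediction and the baseline prediction. This additivity is what allows a per-feature reading of a single prediction.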

Keywords:

At-Risk-of-Poverty (AROP), machine learning, feature importances, Shapley Additive Explanations (SHAP)
