Explainable Artificial Intelligence with Hybrid Ensemble Learning-Based Automated Code Comprehension Prediction
Received: 3 April 2026 | Revised: 11 May 2026, 21 May 2026, and 25 May 2026 | Accepted: 26 May 2026 | Online: 7 June 2026
Corresponding author: Bharat Babaso Mane
Abstract
Code comprehension prediction is a research area in software engineering that uses Artificial Intelligence (AI) and Machine Learning (ML) algorithms to assess how easily a programmer can understand a piece of code. Prior work has relied mainly on handcrafted features to build accurate classification models. However, manual feature engineering is labor-intensive and captures only partial information about the source code, which limits model performance. Recently, many Deep Learning (DL)-based code readability classification approaches have been proposed. This paper presents an Explainable Code Readability Classification using Vector Representations and Majority Voting-Based Ensemble Learning (ECRVR-MVEL) approach. The model first preprocesses the input code and then uses CodeBERT to transform it into vector representations. For classification, a Weighted Majority Voting Ensemble (WMVE) combines a Graph Convolutional Network (GCN), a Deep Belief Network (DBN), and a Bidirectional Temporal Convolutional Network (Bi-TCN). In addition, Nadam is applied to optimize the model and improve performance. Finally, Local Interpretable Model-agnostic Explanations (LIME) is used to explain the model's predictions and ensure transparency. An extensive evaluation on Python and C++ datasets shows that the ECRVR-MVEL model achieves promising results compared with existing methods.
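The weighted majority voting step named in the abstract can be illustrated with a minimal sketch. It assumes each of the three base classifiers emits one class label per code snippet and that each classifier carries a fixed non-negative weight. The function name `weighted_majority_vote`, the labels `"readable"` and `"unreadable"`, and the weight values are hypothetical and are not taken from the paper, which does not state how the weights are chosen.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight.

    predictions: one predicted label per base classifier.
    weights: one non-negative weight per base classifier (illustrative values).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # On a tie, the label that was seen first wins (dict insertion order).
    return max(totals, key=totals.get)


# Hypothetical outputs of the GCN, DBN, and Bi-TCN for a single snippet.
votes = ["readable", "unreadable", "unreadable"]
weights = [0.5, 0.2, 0.2]  # assumed weights, e.g. derived from validation accuracy
result = weighted_majority_vote(votes, weights)
```

With these assumed weights the single heavier vote outweighs the two lighter ones, which is the behavior that distinguishes weighted voting from a plain majority vote.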
Keywords:
software quality, explainable artificial intelligence, deep learning, code comprehension, hyperparameter tuning, ensemble classification
License
Copyright (c) 2026 Bharat Babaso Mane, Rathnakar Achary

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
