A Review of Question-Answering Systems Using Deep Learning in the Arabic Language

Ali Aloqla; Reda Khalifa; Wajdi Alghamdi

doi:10.48084/etasr.14229

Authors

Ali Aloqla, Computer Information Systems Department, King Abdulaziz University, Jeddah, Saudi Arabia
Reda Khalifa, Computer Information Technology Department, King Abdulaziz University, Jeddah, Saudi Arabia
Wajdi Alghamdi, Computer Information Technology Department, King Abdulaziz University, Jeddah, Saudi Arabia

Volume: 15 | Issue: 6 | Pages: 29214-29228 | December 2025 | https://doi.org/10.48084/etasr.14229

Received: 21 August 2025 | Revised: 17 September 2025 and 20 September 2025 | Accepted: 24 September 2025 | Online: 8 December 2025

Corresponding author: Ali Aloqla

Abstract

Question-Answering (QA) has become a pivotal topic in Natural Language Processing (NLP), enabling machines to understand and respond to human questions posed in natural language. Although QA systems for English and other high-resource languages have been studied extensively, Arabic QA remains under-investigated and faces several linguistic and technical challenges. This paper presents an extensive review of deep learning-based Arabic QA systems, with emphasis on extractive, generative, and hybrid architectures. It analyzes the fundamental challenges of Arabic language processing, outlines the essential datasets, and provides a classification of QA methodologies. It also identifies several research gaps, including the absence of domain-specific models, limited work on generative question answering, and insufficient use of retrieval-augmented architectures. To help address these gaps, a Fatwa-based dataset, currently under development, can serve as a resource for future research on domain-specific Arabic QA. Finally, the study outlines future directions, highlighting the promise of Retrieval-Augmented Generation (RAG), few-shot learning, and dialect-aware models for advancing the field.

Keywords

Arabic NLP, QA, deep learning, RAG, natural language understanding, transformer
