Evaluating Large Language Models' Proficiency in Answering Arabic GAT Exam Questions
Received: 24 July 2024 | Revised: 18 August 2024 | Accepted: 22 August 2024 | Online: 29 September 2024
Corresponding author: Moayad Alshangiti
Abstract
The Saudi General Aptitude Test (GAT) aims to measure the analytical and inferential learning abilities of high school graduates seeking admission to higher education institutions. Given the need for effective preparation tools, this study investigates whether chat-based generative pre-trained transformers can help students prepare for the GAT, especially in Arabic. The primary objective is to assess how effectively Large Language Models (LLMs) answer Arabic-language questions on mental and logical abilities. The performance of GPT-4, GPT-4o, and Gemini was examined in 21 experiments to determine their accuracy on a range of GAT-related questions. The findings indicate that although GPT-4 and GPT-4o outperformed Gemini in answering GAT questions accurately, their current accuracy levels still require improvement.
Keywords:
ChatGPT, GAT, standardized admissions tests, artificial intelligence, AI-powered tools, machine learning, education, Arabic language
License
Copyright (c) 2024 Mohammad D. Alahmadi, Mohammed Alharbi, Ahmad Tayeb, Moayad Alshangiti
This work is licensed under a Creative Commons Attribution 4.0 International License.