Optimizing Automated Question Generation for Educational Assessments: A Semantic Analysis of LLMs with Structured and Unstructured Ontologies

Sumayyah Alamoudi; Lama A. Al Khuzayem; Amani Jamal

doi:10.48084/etasr.10662

Authors

Sumayyah Alamoudi, Department of Computer Science, King Abdulaziz University, Jeddah, Saudi Arabia
Lama A. Al Khuzayem, Department of Computer Science, King Abdulaziz University, Jeddah, Saudi Arabia
Amani Jamal, Department of Computer Science, King Abdulaziz University, Jeddah, Saudi Arabia

Volume: 15 | Issue: 3 | Pages: 23664-23671 | June 2025 | https://doi.org/10.48084/etasr.10662

Received: 20 February 2025 | Revised: 22 March 2025 | Accepted: 2 April 2025 | Online: 11 May 2025

Corresponding author: Sumayyah Alamoudi

Abstract

This study explores the optimization of Automated Question Generation (AQG) for educational assessments using Large Language Models (LLMs) and ontologies. Three approaches are evaluated: template-based question generation from a structured ontology, LLM-based question generation from a structured ontology, and LLM-based question generation from a flat concept list (an unstructured ontology). Performance is measured with BERT Precision, Recall, F1-score, and Semantic Similarity. The results show that: i) the template-based structured ontology approach achieved a BERT Precision of 0.833, Recall of 0.844, and F1-score of 0.838, with a Semantic Similarity of 0.563; ii) the LLM-based structured ontology approach improved on these scores, with a BERT Precision of 0.856, Recall of 0.863, and F1-score of 0.859, but had a lower Semantic Similarity of 0.534; and iii) the LLM-based flat concept list approach provided the best overall results, with BERT Precision, Recall, and F1-score all at 0.859 and the highest Semantic Similarity of 0.567. Despite this higher Semantic Similarity, qualitative analysis revealed that the flat concept list approach sometimes produced hallucinated or unrelated questions. These findings suggest that LLM-based methods provide a balance of relevance and diversity in question generation, with the LLM-based flat concept list offering the best overall scores, while the LLM-based structured ontology strikes a balance between Precision and Recall.
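The reported F1-scores can be sanity-checked against the reported Precision and Recall. The sketch below assumes that F1 is the harmonic mean of Precision and Recall, which is how BERTScore defines it for a single candidate-reference pair. The abstract does not say how scores were aggregated across questions, so averaged values need not satisfy this identity exactly; the helper name `f1_from` is illustrative and not taken from the paper.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the abstract: (precision, recall, reported F1).
reported = {
    "template-based structured ontology": (0.833, 0.844, 0.838),
    "LLM-based structured ontology": (0.856, 0.863, 0.859),
    "LLM-based flat concept list": (0.859, 0.859, 0.859),
}

for name, (p, r, f1) in reported.items():
    # Compare the recomputed value with the figure given in the abstract.
    print(f"{name}: recomputed F1 = {f1_from(p, r):.3f}, reported F1 = {f1:.3f}")
```

Under this assumption, the recomputed values agree with the reported F1-scores to three decimal places for all three approaches.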

Keywords:

AI in education, ontologies, question generation, Large Language Models (LLMs)
