Arabic Sentiment Analysis for Student Evaluation using Machine Learning and the AraBERT Transformer
Received: 31 August 2023 | Revised: 13 September 2023 | Accepted: 18 September 2023 | Online: 13 October 2023
Corresponding author: Nahla Aljojo
Abstract
Sentiment Analysis (SA) has recently become a crucial research area because it makes it possible to gauge people's opinions from sources such as student evaluations, social media posts, and product reviews. This paper creates an Arabic dataset derived from student satisfaction surveys conducted at the University of Jeddah regarding subjects and instructors. It also evaluates classical machine learning models for Arabic SA, namely Naive Bayes, Support Vector Machine, Logistic Regression, Decision Tree, and Random Forest, and compares their results using various metrics. Furthermore, the pre-trained AraBERT transformer was used to improve performance, achieving an accuracy of 78%. The paper addresses the lack of SA research in the education domain for the Arabic language.
Keywords:
sentiment analysis, natural language processing, machine learning, pre-trained transformer
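The abstract says the classical models are compared using various metrics but does not name them. As a minimal sketch, assuming the usual choices of accuracy and macro-averaged precision, recall, and F1, such a comparison could be computed as follows. The function name, the label strings, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: these are common SA evaluation metrics; the paper's
    exact metric set is not stated in the abstract.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives, and false negatives per label.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels, for illustration only.
y_true = ["positive", "positive", "negative", "negative"]
y_pred = ["positive", "negative", "negative", "negative"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Macro averaging weights every class equally, which may matter when sentiment classes are imbalanced. Whether the paper reports macro, micro, or weighted averages is not stated in the abstract.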
License
Copyright (c) 2023 Huda Alamoudi, Nahla Aljojo, Asmaa Munshi, Abdullah Alghoson, Ameen Banjar, Araek Tashkandi, Anas Al-Tirawi, Iqbal Alsaleh
This work is licensed under a Creative Commons Attribution 4.0 International License.