A Deep Visual Approach to Student Engagement Analysis Using Affective and Behavioral Cues
Received: 4 August 2025 | Revised: 16 September 2025, 8 October 2025, 10 October 2025, and 26 October 2025 | Accepted: 29 October 2025 | Online: 15 January 2026
Corresponding author: Fatima Zahra Jobbid
Abstract
Assessing student engagement in educational environments is essential for supporting adaptive teaching strategies and enhancing learning outcomes. This study presents a deep learning-based approach for automatically predicting student engagement, leveraging both behavioral and emotional cues. The proposed method integrates features derived from facial emotion recognition and head pose estimation, capturing a comprehensive representation of student affect and attention. A multi-layer neural network was trained on the Student Engagement Dataset to classify engagement states from these multimodal inputs. The proposed framework achieves an accuracy of 88% on unseen validation data, demonstrating strong effectiveness in distinguishing between engaged and disengaged students. In addition, an explainability analysis highlights neutral facial expressions and head orientation as key indicators of engagement, supporting the interpretability and practical relevance of the proposed approach for real-world educational settings.
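As a rough illustration of the fusion step described above, the sketch below concatenates per-frame facial-emotion probabilities with head-pose angles and feeds the result to a small multi-layer network. The abstract does not specify the architecture, feature dimensions, or framework, so the seven-class emotion distribution, the (yaw, pitch, roll) pose triple, the layer widths, and the use of PyTorch are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class EngagementClassifier(nn.Module):
    """Minimal sketch of a multi-layer network fusing affective and
    behavioral cues. All dimensions and layer sizes are assumed for
    illustration; the paper does not state them in this excerpt."""

    def __init__(self, num_emotions: int = 7, num_pose: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            # Fused input: emotion probabilities + yaw/pitch/roll angles.
            nn.Linear(num_emotions + num_pose, 64),
            nn.ReLU(),
            nn.Linear(64, 32),
            nn.ReLU(),
            # Two logits: engaged vs. disengaged.
            nn.Linear(32, 2),
        )

    def forward(self, emotion_probs: torch.Tensor, pose_angles: torch.Tensor) -> torch.Tensor:
        # Early fusion: concatenate the per-frame emotion distribution
        # with the head-pose angles into one feature vector.
        x = torch.cat([emotion_probs, pose_angles], dim=-1)
        return self.net(x)

# Example: one frame with a dominant neutral expression and a near-frontal pose.
model = EngagementClassifier()
emotion = torch.tensor([[0.05, 0.02, 0.03, 0.05, 0.70, 0.10, 0.05]])  # hypothetical FER class probabilities
pose = torch.tensor([[2.0, -5.0, 1.0]])                               # yaw, pitch, roll in degrees
logits = model(emotion, pose)
print(logits.softmax(dim=-1))  # predicted engagement distribution
```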
Keywords:
student engagement, emotion recognition, head pose estimation, deep learning, artificial intelligence in education
License
Copyright (c) 2025 Fatima Zahra Jobbid, Aissam Berrahou, Hassan Berbia

This work is licensed under a Creative Commons Attribution 4.0 International License.
