Swin Transformer with Enhanced Dropout and Layer-wise Unfreezing for Facial Expression Recognition in Mental Health Detection
Received: 30 September 2024 | Revised: 25 October 2024 | Accepted: 3 November 2024 | Online: 2 December 2024
Corresponding author: Mujiyanto Mujiyanto
Abstract
This study presents an improved Facial Expression Recognition (FER) model based on the Swin Transformer, aimed at detecting mental health risk through facial emotion analysis. Enhanced dropout and layer-wise unfreezing were applied to reduce overfitting. The proposed models were evaluated on the FER2013 and CK+ benchmark datasets and on real-time Genius HR data. Three variants were compared: Model A has no dropout layer, Model B uses focal loss, and Model C uses enhanced dropout and layer-wise unfreezing. Model C performed best, achieving test accuracies of 71.23% on FER2013 and 78.65% on CK+. Weighted cross-entropy loss and image augmentation were used to handle class imbalance. Based on Model C's emotion predictions, a scoring mechanism was designed to analyze employees' mental health for the next 30 days. The higher the score, the higher the risk of mental health problems. This study demonstrates a practical application of the Swin Transformer in FER models for mental health detection and early intervention.
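As an illustration of the weighted cross-entropy loss mentioned above, the following is a minimal sketch in plain Python. The abstract does not state how the class weights were computed, so the inverse-frequency weighting used here is an assumption, and the function names `inverse_frequency_weights` and `weighted_cross_entropy` are hypothetical. The loss is normalized by the sum of the weights of the samples in the batch.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Assumed scheme: weight class c by N / (K * count_c), so rare classes weigh more."""
    counts = Counter(labels)
    total = len(labels)
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def weighted_cross_entropy(batch_logits, targets, class_weights):
    """Weighted mean of per-sample negative log-likelihoods.

    Each sample's loss is scaled by the weight of its true class, and the
    sum is divided by the total weight of the samples in the batch.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for logits, target in zip(batch_logits, targets):
        prob_true = softmax(logits)[target]
        weighted_sum += -class_weights[target] * math.log(prob_true)
        weight_total += class_weights[target]
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Toy, made-up labels for a 3-class problem in which class 2 is rare.
    train_labels = [0, 0, 0, 0, 1, 1, 1, 2]
    weights = inverse_frequency_weights(train_labels, num_classes=3)
    logits = [[2.0, 0.5, 0.1], [0.2, 0.3, 1.5]]
    print(weights)
    print(weighted_cross_entropy(logits, targets=[0, 2], class_weights=weights))
```

With such weights, a misclassified sample from a rare expression class contributes more to the loss than one from a frequent class, which is the general intent of using a weighted loss for class imbalance.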
Keywords:
swin transformer, facial expression recognition, mental health detection, overfitting mitigation
License
Copyright (c) 2024 Mujiyanto Mujiyanto, Arief Setyanto, Kusrini Kusrini, Ema Utami
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.