An Efficient Multi-modal Facial Gesture-based Ensemble Classification and Reaction to Sound Framework for Large Video Sequences


  • SaiTeja Chopparapu, Department of EECE, GITAM (Deemed to be University), India
  • Joseph Beatrice Seventline, HoD, Department of EECE, GITAM (Deemed to be University), India
Volume: 13 | Issue: 4 | Pages: 11263-11270 | August 2023


Machine learning-based feature extraction and classification models play a vital role in evaluating and detecting patterns in multivariate facial expressions. Most conventional feature extraction and multi-modal pattern detection models do not apply feature filters in multi-class classification problems. Traditional multi-modal facial feature extraction models also struggle to detect dependent, correlated feature sets and to make use of ensemble classification. This study used advanced feature filtering, feature extraction measures, and ensemble multi-class expression prediction to optimize the efficiency of feature classification. A filter-based, multi-feature ranking and voting framework was implemented over multiple base classifiers. The framework was evaluated on different multi-modal facial features for an automatic emotion listener that uses a speech synthesis library. The evaluation results showed that the proposed model had better feature classification, feature selection, prediction, and runtime than traditional approaches on heterogeneous facial databases.
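
The general idea of filter-based feature ranking followed by ensemble voting can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the filters, the rank aggregation rule, or the base classifiers, so the Fisher score and variance filters, the summed-rank aggregation, the nearest-centroid and k-nearest-neighbour voters, and the toy feature vectors below are hypothetical stand-ins rather than the authors' method.

```python
from collections import Counter
from statistics import mean, pvariance


def fisher_score(column, labels):
    """Filter 1 (assumed): between-class over within-class variance."""
    overall = mean(column)
    between = within = 0.0
    for cls in set(labels):
        vals = [v for v, y in zip(column, labels) if y == cls]
        between += len(vals) * (mean(vals) - overall) ** 2
        within += len(vals) * pvariance(vals)
    return between / within if within > 0 else float("inf")


def variance_score(column, labels):
    """Filter 2 (assumed): plain variance; labels are ignored."""
    return pvariance(column)


def rank_features(X, y, scorers):
    """Each filter ranks the features; summed rank positions decide."""
    n_features = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_features)]
    total = [0] * n_features
    for scorer in scorers:
        scores = [scorer(col, y) for col in columns]
        order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
        for position, j in enumerate(order):
            total[j] += position
    return sorted(range(n_features), key=lambda j: total[j])


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


class NearestCentroid:
    def fit(self, X, y):
        self.centroids = {}
        for cls in set(y):
            rows = [r for r, label in zip(X, y) if label == cls]
            self.centroids[cls] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, x):
        return min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))


class KNearest:
    def __init__(self, k):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, x):
        pairs = sorted(zip(self.X, self.y), key=lambda p: sq_dist(x, p[0]))
        return Counter(label for _, label in pairs[: self.k]).most_common(1)[0][0]


# Hypothetical toy data: three features per face, two expression classes.
X = [[0.1, 0.5, 1.0], [0.2, 0.1, 1.1], [0.0, 0.3, 0.9],
     [0.9, 0.4, 1.0], [1.0, 0.2, 1.1], [1.1, 0.6, 0.9]]
y = ["neutral", "neutral", "neutral", "happy", "happy", "happy"]

ranking = rank_features(X, y, [fisher_score, variance_score])
keep = ranking[:2]  # retain the two best-ranked features
X_kept = [[row[j] for j in keep] for row in X]
models = [m.fit(X_kept, y) for m in (NearestCentroid(), KNearest(1), KNearest(3))]


def classify(sample):
    """Majority vote of the base classifiers on the filtered features."""
    x = [sample[j] for j in keep]
    return Counter(m.predict(x) for m in models).most_common(1)[0][0]


print(ranking, classify([0.95, 0.3, 1.0]))
```

An odd number of voters is used here so that a two-class majority vote cannot tie; a real system would also need held-out evaluation, which this sketch omits.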


multi-modal facial features, feature ranking, multi-modal outlier component



How to Cite

S. Chopparapu and J. B. Seventline, “An Efficient Multi-modal Facial Gesture-based Ensemble Classification and Reaction to Sound Framework for Large Video Sequences”, Eng. Technol. Appl. Sci. Res., vol. 13, no. 4, pp. 11263–11270, Aug. 2023.

