Mutual Information-based Feature Selection Strategy for Speech Emotion Recognition using Machine Learning Algorithms Combined with the Voting Rules Method
Received: 22 September 2024 | Revised: 16 October 2024, 20 October 2024, and 21 October 2024 | Accepted: 26 October 2024 | Online: 27 November 2024
Corresponding author: Hamza Roubhi
Abstract
This study proposes a new approach to Speech Emotion Recognition (SER) that combines a Mutual Information (MI)-based feature selection strategy with simple machine learning classifiers such as K-Nearest Neighbor (KNN), Gaussian Mixture Model (GMM), and Support Vector Machine (SVM), together with a voting rules method. The approach makes two main contributions. First, a focused feature selection process mitigates the curse of dimensionality and thereby reduces the complexity of the SER system, saving both computational time and memory. Second, classifying with the selected features improves accuracy, which demonstrates their value for the overall performance of the SER system. Experiments were carried out on the EMODB dataset using several feature descriptors, including Mel-Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), and Linear Prediction Cepstral Coefficients (LPCC). The best performance was achieved by GMM, with an accuracy of 85.27% using 39 MFCC features, compared with 82.55% using a high-dimensional vector of 111 features. Furthermore, applying the Joint Mutual Information (JMI) selection technique to the extracted MFCC features reduced the vector size by 23.07% while improving the accuracy to 86.82%. These results highlight the effectiveness of combining feature selection with machine learning algorithms and the voting rules method for the SER task.
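The two ideas named in the abstract, JMI-based feature selection and a voting rule, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a plain equal-width histogram estimator of MI, a greedy forward search that scores each candidate by the sum of I(candidate, selected; class) over the already selected features, and a simple majority vote over per-frame decisions. The function names (`discretize`, `mutual_information`, `jmi_select`, `majority_vote`) and the default of 8 bins are hypothetical choices made for illustration only.

```python
from collections import Counter
from math import log2

def discretize(column, bins=8):
    """Quantize one feature column into equal-width bins (assumed estimator)."""
    lo, hi = min(column), max(column)
    width = (hi - lo) / bins or 1.0  # avoid division by zero for constant columns
    return [min(int((v - lo) / width), bins - 1) for v in column]

def mutual_information(a, b):
    """Plug-in estimate of I(a; b) in bits for two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(c / n * log2(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def jmi_select(columns, labels, n_select, bins=8):
    """Greedy forward selection with the JMI score (illustrative sketch).

    columns: list of feature columns, each a list of floats
    labels:  class label per sample
    Returns the indices of the selected features, in selection order.
    """
    feats = [discretize(c, bins) for c in columns]
    n_select = min(n_select, len(feats))
    # First pick: the feature with the highest relevance I(X_k; Y).
    relevance = [mutual_information(f, labels) for f in feats]
    selected = [max(range(len(feats)), key=relevance.__getitem__)]
    scores = [0.0] * len(feats)
    while len(selected) < n_select:
        last = feats[selected[-1]]
        remaining = [k for k in range(len(feats)) if k not in selected]
        for k in remaining:
            # Accumulate I(X_k, X_j; Y) for the most recently selected X_j.
            pair = list(zip(feats[k], last))
            scores[k] += mutual_information(pair, labels)
        selected.append(max(remaining, key=scores.__getitem__))
    return selected

def majority_vote(frame_labels):
    """Return the most frequent label among per-frame decisions."""
    return Counter(frame_labels).most_common(1)[0][0]
```

In such a setup, `jmi_select(columns, labels, 30)` would keep 30 of 39 MFCC columns, which matches the reported 23.07% reduction only under the assumption that the percentage is taken over the 39-dimensional vector. `majority_vote` would then turn the frame-level outputs of a classifier into a single decision per utterance.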
Keywords:
speech emotion recognition, machine learning, voting rules, feature selection, mutual information
License
Copyright (c) 2024 Hamza Roubhi, Abdenour Hacine Gharbi, Khaled Rouabah, Philippe Ravier
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.