A Comparative Study of TF-IDF and Count Vectorizer under Random State Changes in a Random Forest Classifier for Emotion Detection

Authors

Volume: 16 | Issue: 2 | Pages: 33247-33252 | April 2026 | https://doi.org/10.48084/etasr.16158

Abstract

In machine learning, parameter settings affect model accuracy. Text-based emotion detection requires stable and accurate models, which makes parameter choices such as the random state important. Previous studies usually set the random state to 42, claiming that this value yields the best accuracy. This study examined random state values from 1 to 720 and observed the resulting accuracy. An emotion detection dataset was classified using the Random Forest (RF) classifier with two vectorizers, TF-IDF and Count. The results show that the random state setting affects model accuracy. On the training subset, the TF-IDF vectorizer offered higher and more stable accuracy than the Count vectorizer. However, the Count vectorizer achieved higher accuracy on both the validation and test sets.
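
The experimental design summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes scikit-learn, and it assumes the varied parameter is the `random_state` argument of `RandomForestClassifier`. The corpus is a tiny made-up stand-in for the emotion dataset, only ten seeds are swept rather than 1 to 720, and the helper names `sweep_random_states` and `summarize` are hypothetical.

```python
from statistics import mean, pstdev

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import accuracy_score

# Hypothetical toy corpus; a real run would load the emotion dataset instead.
TRAIN_TEXTS = [
    "i feel so happy and cheerful today",
    "this is a joyful and happy moment",
    "i am delighted and full of joy",
    "i feel sad and lonely tonight",
    "this is a sad and gloomy day",
    "i am miserable and full of sorrow",
    "i am angry and furious at this",
    "this makes me so mad and angry",
    "i feel furious and full of rage",
]
TRAIN_LABELS = ["joy"] * 3 + ["sadness"] * 3 + ["anger"] * 3
VAL_TEXTS = [
    "such a happy and joyful day",
    "i feel lonely and sad",
    "i am so angry and mad",
]
VAL_LABELS = ["joy", "sadness", "anger"]


def sweep_random_states(vectorizer, seeds, n_estimators=20):
    """Return {seed: (train_accuracy, val_accuracy)} for one vectorizer."""
    # Fit the vocabulary on the training texts only, then reuse it.
    X_train = vectorizer.fit_transform(TRAIN_TEXTS)
    X_val = vectorizer.transform(VAL_TEXTS)
    results = {}
    for seed in seeds:
        # Assumption: the seed under study is the forest's own random_state.
        clf = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
        clf.fit(X_train, TRAIN_LABELS)
        results[seed] = (
            accuracy_score(TRAIN_LABELS, clf.predict(X_train)),
            accuracy_score(VAL_LABELS, clf.predict(X_val)),
        )
    return results


def summarize(results):
    """Summarize validation accuracy across all seeds in a sweep."""
    val = [val_acc for _, val_acc in results.values()]
    return {"mean": mean(val), "std": pstdev(val), "min": min(val), "max": max(val)}


seeds = range(1, 11)  # shortened for illustration; the study used 1 to 720
summary = {
    "TF-IDF": summarize(sweep_random_states(TfidfVectorizer(), seeds)),
    "Count": summarize(sweep_random_states(CountVectorizer(), seeds)),
}
for name, stats in summary.items():
    print(name, stats)
```

The spread of accuracy across seeds (here the `std`, `min`, and `max` fields) is what indicates stability. Comparing it between the two vectorizers mirrors the kind of comparison the study reports, although the numbers from this toy corpus carry no meaning of their own.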

Keywords:

emotion detection, random forest, TF-IDF, Count, vectorizers, random state

How to Cite

[1]
S. Kooptiwoot and S. Kooptiwoot, “A Comparative Study of TF-IDF and Count Vectorizer under Random State Changes in a Random Forest Classifier for Emotion Detection”, Eng. Technol. Appl. Sci. Res., vol. 16, no. 2, pp. 33247–33252, Apr. 2026.
