SIEmo-LSTM: Multimodal Fusion of Text, Unicode Emoji, and ASCII Emoticons for Indonesian Sentiment Analysis

Authors

  • M. Noer Fadli Hidayat Department of Electrical Engineering and Informatics, Universitas Negeri Malang, Indonesia | Informatics Engineering Department, Faculty of Engineering, Universitas Nurul Jadid, Indonesia
  • Didik Dwi Prasetya Department of Electrical Engineering and Informatics, Universitas Negeri Malang, Indonesia https://orcid.org/0000-0002-3540-2961
  • Triyanna Widiyaningtyas Department of Electrical Engineering and Informatics, Universitas Negeri Malang, Indonesia https://orcid.org/0000-0001-6104-6692
Volume: 16 | Issue: 2 | Pages: 34382-34388 | April 2026 | https://doi.org/10.48084/etasr.17398

Abstract

Sentiment analysis of Indonesian user reviews is challenged by informal language and frequent nonverbal cues such as Unicode emojis and ASCII emoticons. This study aimed to quantify the benefit of explicitly modeling ASCII emoticons together with Unicode emojis and text for three-class sentiment classification (negative/neutral/positive). SIEmo-LSTM is a tri-modal pipeline that (i) maps Unicode emojis using an emoji sentiment resource, (ii) detects, normalizes, and converts ASCII emoticons into descriptive tokens using Emot and LEED (93.3% successful conversion), and (iii) encodes the unified sequence using IndoBERT as a contextual feature extractor and refines it with a Bi-LSTM layer before multiclass prediction. Experiments used 304,570 Ruangguru app reviews (2016–2023), a tri-modal subset of 2,527 reviews, and a 70/20/10 train/validation/test split. Class imbalance was addressed using Random OverSampling (ROS). The full Text+SE+IE configuration with ROS achieved up to 0.9935 Accuracy and 0.9967 Macro-F1, outperforming text-only and text+Unicode baselines, while Random UnderSampling (RUS) consistently degraded performance. These findings imply that treating ASCII emoticons as a first-class affective modality—alongside Unicode emojis and text—improves robustness and class-balanced sentiment recognition for Indonesian user-generated reviews.
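
The preprocessing and data-handling steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The two lookup tables are small hypothetical stand-ins for the emoji sentiment resource and for Emot/LEED. The helper names (unify, split_70_20_10, random_oversample) are invented for illustration, the split is a plain shuffle rather than a stratified one, and the sample reviews are made up. Whether oversampling was applied to the training part only is not stated in the abstract; the usage lines below assume so. The IndoBERT and Bi-LSTM stages are omitted.

```python
import random
import re
from collections import Counter

# Hypothetical lookup tables (assumption): the paper relies on an emoji
# sentiment resource plus Emot and LEED, which are not reproduced here.
EMOJI_TOKENS = {"\U0001F600": "grinning_face", "\U0001F621": "angry_face"}
EMOTICON_TOKENS = {":)": "smile", ":(": "sad", ":D": "laugh"}

# Longest emoticons first so a longer pattern wins over a shorter prefix.
_EMOTICON_RE = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICON_TOKENS, key=len, reverse=True))
)


def unify(text):
    """Replace emojis and ASCII emoticons with descriptive word tokens."""
    for emoji, name in EMOJI_TOKENS.items():
        text = text.replace(emoji, f" {name} ")
    text = _EMOTICON_RE.sub(lambda m: f" {EMOTICON_TOKENS[m.group(0)]} ", text)
    return " ".join(text.split())  # collapse the extra spaces added above


def split_70_20_10(items, seed=0):
    """Shuffle and cut into train/validation/test parts (not stratified)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 7 // 10
    n_val = len(items) * 2 // 10
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]


def random_oversample(pairs, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in pairs:
        by_label.setdefault(label, []).append((text, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Made-up reviews for illustration only.
reviews = [
    ("aplikasinya bagus :)", "positive"),
    ("mantap \U0001F600", "positive"),
    ("sangat membantu :D", "positive"),
    ("sering error :(", "negative"),
    ("biasa saja", "neutral"),
]
unified = [(unify(text), label) for text, label in reviews]
balanced = random_oversample(unified)
print(Counter(label for _, label in balanced))
```

Converting emoticons into word tokens lets a single text encoder process all three modalities as one sequence, which is the design the abstract describes for the IndoBERT stage.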

Keywords:

sentiment analysis, multimodal fusion, Unicode emoji, ASCII emoticons, IndoBERT, BiLSTM

How to Cite

[1]
M. N. F. Hidayat, D. D. Prasetya, and T. Widiyaningtyas, “SIEmo-LSTM: Multimodal Fusion of Text, Unicode Emoji, and ASCII Emoticons for Indonesian Sentiment Analysis”, Eng. Technol. Appl. Sci. Res., vol. 16, no. 2, pp. 34382–34388, Apr. 2026.
