Exploring Sentiment Analysis on Social Media Texts
Received: 12 March 2024 | Revised: 24 March 2024 | Accepted: 2 April 2024 | Online: 1 June 2024
Corresponding author: Mohd Anul Haq
Abstract
Sentiment analysis is a critical component in understanding customer opinions and reactions. This study applies sentiment analysis in Python to the Amazon Fine Food Reviews dataset, classifying customer reviews as positive or negative so that businesses can gain valuable insight into customer sentiment. The efficiency of six classifiers was compared: Logistic Regression, Support Vector Machines, Random Forest, XGBoost, LSTM, and ALBERT. The LSTM and ALBERT classifiers stood out, combining remarkable accuracy (96%) with substantial support for both positive and negative reviews. The Random Forest classifier reached similar accuracy (96%) but exhibited lower support for positive and negative sentiments.
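The abstract reports both accuracy and per-class support. The short Python sketch below illustrates how the two quantities differ, assuming "support" is meant in the classification-report sense (the number of true instances of each class). The function name `summarize` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def summarize(y_true, y_pred):
    """Return overall accuracy plus per-class precision, recall, and support."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    support = Counter(y_true)                       # true instances per class
    predicted = Counter(y_pred)                     # predictions per class
    hits = Counter(t for t, p in pairs if t == p)   # correct predictions per class

    per_class = {}
    for label in sorted(support):
        per_class[label] = {
            "precision": hits[label] / predicted[label] if predicted[label] else 0.0,
            "recall": hits[label] / support[label],
            "support": support[label],
        }

    accuracy = sum(hits.values()) / len(pairs)
    return {"accuracy": accuracy, "per_class": per_class}


# Hypothetical, imbalanced toy labels (illustration only, not study data).
y_true = ["positive"] * 8 + ["negative"] * 2
y_pred = ["positive"] * 8 + ["positive", "negative"]

report = summarize(y_true, y_pred)
print(report["accuracy"])
for label, stats in report["per_class"].items():
    print(label, stats)
```

On imbalanced data such as this toy example, overall accuracy can look strong while the minority class has few instances and weak recall, which is why per-class figures are usually read alongside accuracy.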
Keywords:
LSTM, XGBoost, sentiment analysis, classification, ALBERT, regression, SVM
License
Copyright (c) 2024 Najeeb Abdulazez Alabdulkarim, Mohd Anul Haq, Jayadev Gyani
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.