Machine Learning-Based Optimization Algorithms for Spam SMS Classification
Received: 10 September 2025 | Revised: 20 October 2025 and 2 November 2025 | Accepted: 3 November 2025 | Online: 9 February 2026
Corresponding author: Lipsa Das
Abstract
Spam Short Message Service (SMS) messages plague modern communication networks, invading users' personal spaces and posing significant security threats. This study proposes a novel spam SMS classification method that combines Machine Learning (ML) techniques with a hybrid of the Golden Search Optimizer (GSO) and the Whale Optimization Algorithm (WOA). The hybrid optimizer, GSOWOA, is integrated with Random Forest (RF) and Extreme Gradient Boosting (XGBoost) models, tuning their hyperparameters to improve classification performance and convergence time. By fine-tuning model hyperparameters, the optimization process reduces both false positive and false negative classifications. In the main experimental evaluation, the proposed framework achieved an overall accuracy of 98.9% and a precision of 97.85%, exceeding classical ML approaches, and converged 35% faster than conventional methods and general metaheuristic methods. GSOWOA also showed strong resistance to overfitting together with computational efficiency, making it viable for real-time spam detection on edge devices. These results provide evidence that a hybrid optimization process offers a systematic way to achieve high performance and efficient resource utilization for SMS filtering.
Keywords:
spam SMS, Random Forest (RF), classification, optimization algorithm, Extreme Gradient Boosting (XGBoost)
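To make the idea of metaheuristic hyperparameter tuning concrete, the following is a minimal sketch of the standard Whale Optimization Algorithm update rules applied to a two-parameter search. It is not the GSOWOA hybrid described in the abstract, whose details are not given here. The function names (`woa_minimize`, `toy_validation_error`), the search bounds, and the quadratic objective are all hypothetical: the objective merely stands in for the validation error that a real run would obtain by training an RF model with the candidate hyperparameters.

```python
import math
import random


def woa_minimize(objective, bounds, n_agents=20, n_iters=50, seed=0):
    """Minimize `objective` over box `bounds` with the basic WOA updates."""
    rng = random.Random(seed)

    def clip(x):
        # Keep every coordinate inside its (low, high) search interval.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    # Random initial population and its best member.
    agents = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    scores = [objective(x) for x in agents]
    best_idx = min(range(n_agents), key=scores.__getitem__)
    best_pos, best_score = list(agents[best_idx]), scores[best_idx]

    for t in range(n_iters):
        a = 2.0 * (1.0 - t / n_iters)  # decreases linearly from 2 toward 0
        for i in range(n_agents):
            x = agents[i]
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # |A| < 1: encircle the best solution; otherwise explore
                # around a randomly chosen agent.
                ref = best_pos if abs(A) < 1.0 else agents[rng.randrange(n_agents)]
                new = [r - A * abs(C * r - v) for r, v in zip(ref, x)]
            else:
                # Spiral (bubble-net) move toward the best solution, with b = 1.
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(l) * math.cos(2.0 * math.pi * l)
                new = [abs(b - v) * spiral + b for b, v in zip(best_pos, x)]
            agents[i] = clip(new)

        # Re-evaluate the population and keep the best solution seen so far.
        for x in agents:
            s = objective(x)
            if s < best_score:
                best_pos, best_score = list(x), s

    return best_pos, best_score


def toy_validation_error(params):
    # Hypothetical stand-in for "train RF, return validation error";
    # its minimum is placed arbitrarily at n_estimators=200, max_depth=12.
    n_estimators, max_depth = params
    return ((n_estimators - 200.0) / 490.0) ** 2 + ((max_depth - 12.0) / 28.0) ** 2


# Assumed search ranges for (n_estimators, max_depth).
BOUNDS = [(10.0, 500.0), (2.0, 30.0)]
best_params, best_error = woa_minimize(toy_validation_error, BOUNDS)
print(best_params, best_error)
```

In practice the continuous positions would be rounded to integers before being passed to a classifier, and each objective call would involve a full train-and-validate cycle, which is why the number of agents and iterations dominates the tuning cost.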
License
Copyright (c) 2026 Lipsa Das, Laxmi Ahuja, Adesh Pandey

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
