An Enhanced Random Forest (ERF)-based Machine Learning Framework for Resampling, Prediction, and Classification of Mobile Applications using Textual Features

Shahbaz Hussain; Nadeem Sarwar; Arshad Ali; Hamayun Khan; Irfanud Din; Abdullah M. Alqahtani; Mohamed Shabir; Aitizaz Ali

doi:10.48084/etasr.9148

Authors

Shahbaz Hussain Department of Computer Science, Faculty of Computer Science & IT, Superior University, Lahore, Pakistan
Nadeem Sarwar Department of Computer Science, Bahria University, Lahore Campus, Lahore, Pakistan
Arshad Ali Faculty of Computer and Information Systems, Islamic University of Madinah, Al Madinah Al Munawarah, Saudi Arabia
Hamayun Khan Department of Computer Science, Faculty of Computer Science & IT, Superior University, Lahore, Pakistan
Irfanud Din Department of Computer Science, New Uzbekistan University, Tashkent, Uzbekistan
Abdullah M. Alqahtani College of Engineering & Computer Science, Department of Electrical & Electronic Engineering, Jazan University, Saudi Arabia
Mohamed Shabir Network Security Forensic Group, School of Technology, Asia Pacific University, Malaysia
Aitizaz Ali Network Security Forensic Group, School of Technology, Asia Pacific University, Malaysia

Volume: 15 | Issue: 1 | Pages: 19776-19781 | February 2025 | https://doi.org/10.48084/etasr.9148

Received: 1 October 2024 | Revised: 1 November 2024 and 14 November 2024 | Accepted: 5 December 2024 | Online: 13 December 2024

Corresponding author: Aitizaz Ali

Abstract

The number of mobile applications is increasing rapidly, and it is difficult for software developers to identify the many key factors that affect app ratings and performance. This study presents a machine-learning framework to support decisions about adding new features to mobile applications and improving their overall performance. A dataset of app attributes from the Apple AppStore was used. NLP techniques were applied to preprocess the textual information, and an Enhanced Random Forest (ERF) framework was developed to assess and forecast ratings for multifunctional apps and to investigate the connections between features and user ratings. The ERF model was compared with other well-known ML methods, including Decision Trees (DT), Naive Bayes (NB), Convolutional Neural Networks (CNN), and Artificial Neural Networks (ANN). The experimental results showed that the proposed model predicts app ratings more effectively than other complex models. The proposed model achieved precision, recall, and F1-score of 92.76%, 99.33%, and 95.93%, respectively.
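As a quick sanity check on the reported metrics, the sketch below recomputes the F1-score from the stated precision and recall using the standard harmonic-mean definition, F1 = 2PR / (P + R). The function name `f1_score` and the variable names are illustrative choices, not taken from the paper, and this is only an arithmetic check, not the authors' evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both precision and recall are 0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, converted from percentages to fractions.
reported_precision = 0.9276
reported_recall = 0.9933
reported_f1 = 0.9593

# Recompute F1 from the reported precision and recall.
computed_f1 = f1_score(reported_precision, reported_recall)
print(f"computed F1 = {computed_f1:.4f}")  # prints "computed F1 = 0.9593"
```

The recomputed value agrees with the reported 95.93% to four decimal places, so the three figures in the abstract are mutually consistent.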

Keywords:

machine learning, reliability, mobile applications, sustainable learning, predicting mobile app ratings, user ratings, XGBoost, random forest, NLP, high-dimensional datasets, convolutional neural network (CNN), NSL-KDD, UNSW-NB15, mean square error

