Adaptive Method for Feature Selection in the Machine Learning Context

Authors

  • Yamen El Touati, Department of Computer Science, Faculty of Computing and Information Technology, Northern Border University, Kingdom of Saudi Arabia
  • Jihane Ben Slimane, Department of Computer Science, Faculty of Computing and Information Technology, Northern Border University, Kingdom of Saudi Arabia
  • Taoufik Saidani, Department of Computer Science, Faculty of Computing and Information Technology, Northern Border University, Kingdom of Saudi Arabia
Volume: 14 | Issue: 3 | Pages: 14295-14300 | June 2024 | https://doi.org/10.48084/etasr.7401

Abstract

Feature selection is a fundamental step in machine learning that improves both the accuracy and the efficiency of models. It identifies the most significant characteristics in a large body of data, which improves prediction accuracy and reduces the likelihood of overfitting. It also lowers the computational cost of training and makes the model easier to interpret, resulting in more transparent and reliable predictions. Deliberately omitting unnecessary variables therefore both improves the model and leads to more flexible and comprehensible results. This study assesses the effectiveness of feature selection on regression models, measuring its impact with the Mean Squared Error (MSE). A variety of regression algorithms were evaluated, and then statistical and algorithmic feature selection techniques were applied, including SelectKBest, PCA, and RFE with Linear Regression and Random Forest. After feature selection, linear models showed improved MSE, highlighting the value of removing unnecessary data. The study emphasizes that the effect of feature selection on model performance is subtle and calls for a tailored strategy to maximize prediction accuracy.
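
The kind of comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses synthetic data from `make_regression` in place of the study's dataset, and the value `K = 5`, the dictionary `reducers`, and the helper `evaluate` are hypothetical choices made for the example. Note that PCA builds new components rather than picking original columns, so it acts here as a dimensionality-reduction baseline alongside the selectors.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE, SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in data (assumption): 20 features, only 5 of them informative.
X, y = make_regression(n_samples=300, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

K = 5  # number of features or components to keep (illustrative choice)

# One entry per technique named in the abstract, plus a no-selection baseline.
reducers = {
    "none": None,
    "SelectKBest": SelectKBest(score_func=f_regression, k=K),
    "PCA": PCA(n_components=K),
    "RFE (Linear Regression)": RFE(LinearRegression(), n_features_to_select=K),
    "RFE (Random Forest)": RFE(
        RandomForestRegressor(n_estimators=50, random_state=0),
        n_features_to_select=K),
}


def evaluate(reducer):
    """Fit reducer + linear regression on the training split; return test MSE."""
    steps = [] if reducer is None else [reducer]
    model = make_pipeline(*steps, LinearRegression())
    model.fit(X_train, y_train)
    return mean_squared_error(y_test, model.predict(X_test))


results = {name: evaluate(reducer) for name, reducer in reducers.items()}
for name, mse in results.items():
    print(f"{name:>24}: test MSE = {mse:.2f}")
```

Wrapping each selector in a pipeline means it is fitted on the training split only, so the held-out MSE is not influenced by test data. Whether a given technique lowers the MSE depends on the data and the downstream model, which is the reason the abstract gives for a tailored strategy.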

Keywords:

cloud computing, cyber security, preventive approach, prediction techniques, artificial intelligence

How to Cite

El Touati, Y., Slimane, J.B. and Saidani, T. 2024. Adaptive Method for Feature Selection in the Machine Learning Context. Engineering, Technology & Applied Science Research. 14, 3 (Jun. 2024), 14295–14300. DOI: https://doi.org/10.48084/etasr.7401.