The Fisher Component-based Feature Selection Method
Received: 11 June 2022 | Revised: 23 June 2022 | Accepted: 26 June 2022 | Online: 15 July 2022
Corresponding author: A. B. Buriro
Abstract
This paper proposes a feature selection technique that combines the computational ease of filters with the performance superiority of wrappers. The technique applies Fisher-score-based ranking followed by logistic regression-based wrapping. On synthetically generated data, the 5-fold cross-validation performance of the proposed technique was comparable to that achieved with the Least Absolute Shrinkage and Selection Operator (LASSO). Binary classification performance, measured by F1 score and Geometric Mean (GM), was evaluated for class imbalance ratios ranging from 0.1:0.9 to 0.5:0.5 and for 1 to 30 informative features, at a fixed sample size of 5000.
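The rank-then-wrap idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state how the wrapper searches the ranked features, so the sketch assumes a simple nested search over the top-k Fisher-ranked features, scored by 5-fold cross-validated F1. The function names (`fisher_scores`, `geometric_mean`, `rank_then_wrap`), the total of 50 features, the 10 informative features, and the cap of 15 candidate subsets are all hypothetical choices made for illustration. The data come from scikit-learn's `make_classification` with 5000 samples and a 0.9:0.1 class ratio.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, recall_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict


def fisher_scores(X, y):
    """Per-feature Fisher score: between-class over within-class scatter."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    # Small constant avoids division by zero for constant features.
    return between / (within + 1e-12)


def geometric_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (binary labels 0/1)."""
    sensitivity = recall_score(y_true, y_pred, pos_label=1)
    specificity = recall_score(y_true, y_pred, pos_label=0)
    return float(np.sqrt(sensitivity * specificity))


def rank_then_wrap(X, y, max_features=15, n_splits=5, seed=0):
    """Filter step: rank by Fisher score. Wrapper step (assumed): pick the
    top-k subset whose logistic regression has the best cross-validated F1."""
    order = np.argsort(fisher_scores(X, y))[::-1]
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    best_k, best_f1, best_gm = 0, -1.0, 0.0
    for k in range(1, min(max_features, X.shape[1]) + 1):
        model = LogisticRegression(max_iter=1000)
        y_pred = cross_val_predict(model, X[:, order[:k]], y, cv=cv)
        f1 = f1_score(y, y_pred, zero_division=0)
        if f1 > best_f1:
            best_k, best_f1, best_gm = k, f1, geometric_mean(y, y_pred)
    return order[:best_k], best_f1, best_gm


# Synthetic, imbalanced binary data (feature counts are illustrative).
X, y = make_classification(
    n_samples=5000, n_features=50, n_informative=10, n_redundant=0,
    weights=[0.9, 0.1], random_state=0,
)
selected, f1, gm = rank_then_wrap(X, y)
print("selected feature indices:", selected)
print("cross-validated F1:", round(f1, 3), "GM:", round(gm, 3))
```

For brevity the sketch ranks features on the full data set before cross-validation, which leaks a little information into the folds. A stricter evaluation would recompute the Fisher ranking inside each training fold.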
Keywords:
Feature selection, regularization, class imbalance, dimensionality reduction
License
Copyright (c) 2022 A. B. Buriro, S. Kumar
This work is licensed under a Creative Commons Attribution 4.0 International License.