A Novel Adaptive Sparse Deep Feature Selection Method for Enhanced Gene-based Cancer Classification

Authors

  • Sara Haddou Bouazza, LAMIGEP EMSI, Marrakech, Morocco

Volume: 15 | Issue: 3 | Pages: 22494-22499 | June 2025 | https://doi.org/10.48084/etasr.10440

Abstract

The Adaptive Sparse Deep Feature Selection (ASDFS) method is a deep learning-based approach to gene-based cancer classification. Designed to address the high dimensionality and complexity of genomic data, ASDFS uses sparse autoencoders for dimensionality reduction and a Dual-Target Deep Neural Network (DT-DNN) to refine the result into a minimal yet biologically significant subset of genes. The method was validated on three microarray datasets: ovarian cancer (15,154 genes, 253 samples), prostate cancer (12,600 genes, 102 samples), and lung cancer (12,533 genes, 181 samples). It achieved classification accuracies of 99.9%, 100%, and 99.8% for ovarian, prostate, and lung cancers, respectively, outperforming state-of-the-art techniques, including Principal Component Analysis with Grey Wolf Optimizer (PCA-GWO), Recursive Feature Elimination (RFE), and Minimum Redundancy Maximum Relevance (mRMR)-based hybrid methods, while substantially reducing the number of genes used. Additionally, pathway enrichment analysis validated the biological relevance of the selected genes, highlighting their roles in critical cancer pathways.
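
As an illustration of the sparse-autoencoder idea mentioned in the abstract, the following is a minimal sketch and not the ASDFS implementation. The abstract does not specify the architecture, loss, or hyperparameters, so everything here is an assumption: the function name `train_sparse_autoencoder`, the single ReLU hidden layer, the L1 penalty on hidden activations, the learning rate, and the synthetic 40-sample, 50-gene matrix that stands in for microarray data. The DT-DNN refinement stage is not reproduced. Ranking genes by the magnitude of their encoder weights is a simple stand-in heuristic chosen for this example.

```python
import numpy as np


def train_sparse_autoencoder(X, n_hidden=8, l1=1e-3, lr=0.002, epochs=500, seed=0):
    """Fit a one-hidden-layer ReLU autoencoder with an L1 penalty on activations.

    Hypothetical helper for illustration only.
    Returns the encoder weights (n_genes x n_hidden) and the loss per epoch.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_genes = X.shape
    W_enc = rng.normal(0.0, 0.1, size=(n_genes, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = rng.normal(0.0, 0.1, size=(n_hidden, n_genes))
    b_dec = np.zeros(n_genes)
    losses = []

    for _ in range(epochs):
        # Forward pass: encode, apply ReLU, decode.
        Z = X @ W_enc + b_enc
        H = np.maximum(Z, 0.0)
        X_hat = H @ W_dec + b_dec
        err = X_hat - X

        # Mean squared reconstruction error plus mean L1 norm of the hidden code.
        loss = np.mean(np.sum(err ** 2, axis=1)) + l1 * np.mean(np.sum(np.abs(H), axis=1))
        losses.append(float(loss))

        # Backward pass (plain full-batch gradient descent).
        d_X_hat = 2.0 * err / n_samples
        d_W_dec = H.T @ d_X_hat
        d_b_dec = d_X_hat.sum(axis=0)
        d_H = d_X_hat @ W_dec.T + l1 * np.sign(H) / n_samples
        d_Z = d_H * (Z > 0)
        d_W_enc = X.T @ d_Z
        d_b_enc = d_Z.sum(axis=0)

        W_enc -= lr * d_W_enc
        b_enc -= lr * d_b_enc
        W_dec -= lr * d_W_dec
        b_dec -= lr * d_b_dec

    return W_enc, losses


# Synthetic stand-in for expression data: 40 samples, 50 genes, 3 latent factors.
data_rng = np.random.default_rng(42)
latent = data_rng.normal(size=(40, 3))
mixing = data_rng.normal(size=(3, 50))
X = latent @ mixing + 0.1 * data_rng.normal(size=(40, 50))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each gene

W_enc, losses = train_sparse_autoencoder(X)

# Assumed heuristic: score each gene by the total magnitude of its encoder weights.
gene_scores = np.abs(W_enc).sum(axis=1)
top_genes = np.argsort(gene_scores)[::-1][:5]

print("initial loss:", round(losses[0], 3), "final loss:", round(losses[-1], 3))
print("top gene indices:", top_genes.tolist())
```

On real microarray data, the number of genes is in the thousands and the sample count in the low hundreds, so the hidden size, penalty weight, and learning rate would need to be tuned rather than taken from this sketch.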

Keywords:

machine learning, cancer classification, data mining, pattern recognition, feature selection

How to Cite

Bouazza, S.H. 2025. A Novel Adaptive Sparse Deep Feature Selection Method for Enhanced Gene-based Cancer Classification. Engineering, Technology & Applied Science Research. 15, 3 (Jun. 2025), 22494–22499. DOI:https://doi.org/10.48084/etasr.10440.
