A Hybrid Feature Selection and Deep Neural Network Framework for Drug-Disease Association Prediction
Corresponding author: K. T. Kanyakumari
Abstract
Drug repositioning is a significant strategy for accelerating therapeutic development by identifying new clinical applications for existing approved drugs. This study presents a computational framework that integrates advanced feature selection methods with deep learning architectures to predict novel drug-disease associations with high accuracy and improved interpretability. Feature relevance was optimized using Mutual Information (MI) and Recursive Feature Elimination (RFE), facilitating the extraction of biologically significant patterns from high-dimensional data. The study employed a hybrid deep learning model that combines Convolutional Neural Networks (CNNs) with Bidirectional Long Short-Term Memory (BiLSTM) layers to effectively capture spatial and sequential dependencies within heterogeneous biomedical networks. When evaluated on benchmark datasets, the proposed framework achieved an AUC score of 0.957, outperforming existing methods. Its predictive capability was further validated through an evaluation that successfully identified known clinical associations. These results highlight the framework's potential as a robust, scalable, and generalizable tool for drug repositioning, offering promise for advancing precision medicine and translational pharmacology.
Keywords:
drug association, deep learning, convolutional neural network, bidirectional long short-term memory
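As a rough illustration of the Mutual Information (MI) filtering step named in the abstract, the sketch below ranks discrete features by their MI with a binary association label and keeps the top two. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy rows, the labels, the function names `mutual_information` and `rank_features`, and the cut-off of two features are all hypothetical, and the RFE stage and the CNN-BiLSTM model are not reproduced here.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def rank_features(rows, labels):
    """Return (feature_index, MI) pairs sorted by decreasing MI with the label."""
    n_features = len(rows[0])
    scores = [
        (j, mutual_information([row[j] for row in rows], labels))
        for j in range(n_features)
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Hypothetical toy data: each row is a binarised descriptor of one
# drug-disease pair; each label marks a known (1) or unknown (0) association.
rows = [
    (1, 0, 1),
    (1, 1, 0),
    (0, 0, 1),
    (0, 1, 0),
    (1, 0, 0),
    (0, 1, 1),
]
labels = [1, 1, 0, 0, 1, 0]

ranking = rank_features(rows, labels)
top_k = [j for j, _ in ranking[:2]]  # indices of the two highest-MI features (assumed cut-off)
print(ranking)
print(top_k)
```

In this toy data the first feature coincides with the label, so it receives the highest score. In a pipeline like the one the abstract describes, such a filter would typically precede a wrapper method such as RFE, which refits a model repeatedly on the surviving features.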
License
Copyright (c) 2026 K. T. Kanyakumari, B. N. Veerappa

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
