Domain-Adaptive Fine-Tuning of BioMedBERT for Medical Text Classification
Received: 28 June 2025 | Revised: 17 August 2025 and 9 September 2025 | Accepted: 11 September 2025 | Online: 8 December 2025
Corresponding author: Mauridhi Hery Purnomo
Abstract
Accurate classification of medical notes and texts is critical for improving biomedical information retrieval and decision support systems. In this study, we propose a hybrid deep learning model that combines BioMedBERT with cross-attention and a BiLSTM to improve the classification of disease-related abstracts across five categories. The model was evaluated on a dataset of 14k annotated samples drawn from the scientific medical literature and achieves a macro F1-score of 63.82, outperforming baseline methods such as sentence embedding models (SimCSE, SBERT), zero-shot entailment approaches, and BioBERT variants paired with MLP classifiers. The findings show that while the model effectively distinguishes categories such as neoplasms and cardiovascular diseases, abstracts with overlapping semantics, particularly general pathological conditions, remain difficult to classify. This research demonstrates the efficacy of combining domain-specific language models with sequence and attention mechanisms, offering a viable approach to scalable and interpretable biomedical text classification.
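To make the three-component architecture concrete, the sketch below shows one plausible wiring of the encoder, BiLSTM, and cross-attention in a PyTorch/Hugging Face stack. The checkpoint name (a PubMedBERT/BiomedBERT stand-in for BioMedBERT), hidden sizes, head count, pooling strategy, and the direction of the cross-attention (BiLSTM states attending back over the encoder output) are illustrative assumptions, not the exact configuration reported in the paper.

```python
# Minimal sketch, assuming PyTorch + transformers; dimensions and wiring are
# assumptions for illustration, not the authors' published configuration.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

CHECKPOINT = "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract"  # assumed stand-in

class BioMedBertCrossAttnBiLSTM(nn.Module):
    def __init__(self, num_classes=5, enc_dim=768, lstm_dim=256, heads=8):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(CHECKPOINT)
        self.bilstm = nn.LSTM(enc_dim, lstm_dim, batch_first=True,
                              bidirectional=True)
        self.cross_attn = nn.MultiheadAttention(embed_dim=2 * lstm_dim,
                                                kdim=enc_dim, vdim=enc_dim,
                                                num_heads=heads,
                                                batch_first=True)
        self.classifier = nn.Linear(2 * lstm_dim, num_classes)

    def forward(self, input_ids, attention_mask):
        # Contextual token embeddings from the domain-specific encoder
        h = self.encoder(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state
        # BiLSTM adds sequential modelling over the token representations
        s, _ = self.bilstm(h)
        # Cross-attention: BiLSTM states (queries) attend over the encoder
        # output (keys/values), with padding positions masked out
        a, _ = self.cross_attn(s, h, h,
                               key_padding_mask=attention_mask == 0)
        # Mask-aware mean pooling, then a linear layer over the 5 classes
        m = attention_mask.unsqueeze(-1).float()
        pooled = (a * m).sum(dim=1) / m.sum(dim=1)
        return self.classifier(pooled)

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = BioMedBertCrossAttnBiLSTM()
batch = tokenizer(["Histological analysis confirmed a malignant neoplasm ..."],
                  return_tensors="pt", padding=True, truncation=True)
logits = model(batch["input_ids"], batch["attention_mask"])  # shape (1, 5)
```

In this reading, the BiLSTM supplies the sequential signal while cross-attention lets each recurrent state re-weight the encoder's token embeddings before pooling; the reported macro F1 would then correspond to, e.g., sklearn.metrics.f1_score with average="macro" over the five predicted classes.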
Keywords:
BioMedBERT, domain-adaptive fine-tuning, machine learning, medical text classification, natural language processing, text classification models
License
Copyright (c) 2025 Ghulam Asrofi Buntoro, Oddy Virgantara Putra, Mauridhi Hery Purnomo

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
