Breast Cancer Diagnosis Using Supervised Machine Learning for Benign and Malignant Classification
Received: 17 April 2025 | Revised: 10 May 2025 and 27 May 2025 | Accepted: 31 May 2025 | Online: 4 July 2025
Corresponding author: G. Naganandini
Abstract
This study investigates supervised machine learning for classifying breast tumors as benign or malignant using the Breast Cancer Wisconsin Dataset. The proposed pipeline begins with data preprocessing to handle missing values and normalize features, followed by Exploratory Data Analysis (EDA) to uncover patterns and relationships in the data. Feature selection is then performed using correlation-based selection, tree-based methods, and Recursive Feature Elimination with Cross-Validation (RFECV). Four classifiers, Random Forest (RF), Support Vector Machine (SVM), Logistic Regression (LR), and Gradient Boosting (GB), are trained on the selected features, and their hyperparameters are tuned using grid search and randomized search. The results demonstrate the effectiveness of the proposed method, with improvements in precision, recall, F1-score, and ROC-AUC. These findings underscore the potential of machine learning to improve diagnostic accuracy and reliability, offering a scalable and efficient approach to breast cancer diagnosis. Future work may integrate deep learning models and larger datasets to further improve diagnostic outcomes and accessibility.
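The pipeline outlined in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation: the abstract does not state the train/test split, the estimator used inside RFECV, the fold counts, or the search grid, so all of those are assumptions chosen for illustration. Only one of the four classifiers (Random Forest) and only grid search are shown. The data come from scikit-learn's bundled copy of the Breast Cancer Wisconsin (Diagnostic) dataset, which has no missing values, so the imputation step is a placeholder for the preprocessing the abstract mentions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Bundled Breast Cancer Wisconsin (Diagnostic) data: 0 = malignant, 1 = benign.
X, y = load_breast_cancer(return_X_y=True)

# Assumed 80/20 stratified split (the abstract does not give one).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Assumed fold counts: 3 folds inside RFECV, 5 folds for the grid search.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Keeping every step inside one Pipeline means imputation, scaling, and
# feature selection are refit on each training fold only.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # no-op on this copy of the data
    ("scale", StandardScaler()),
    ("select", RFECV(
        estimator=LogisticRegression(max_iter=5000),  # assumed ranking model
        step=3,
        min_features_to_select=5,
        cv=inner_cv,
        scoring="roc_auc",
    )),
    ("clf", RandomForestClassifier(random_state=0)),
])

# Small illustrative grid; the values are assumptions, not the paper's.
param_grid = {
    "clf__n_estimators": [100, 200],
    "clf__max_depth": [None, 5],
}
search = GridSearchCV(pipeline, param_grid, cv=outer_cv, scoring="roc_auc")
search.fit(X_train, y_train)

best = search.best_estimator_
n_selected = best.named_steps["select"].n_features_

# Report metrics with malignant (label 0) treated as the positive class.
y_pred = best.predict(X_test)
prob_malignant = best.predict_proba(X_test)[:, 0]
metrics = {
    "precision": precision_score(y_test, y_pred, pos_label=0),
    "recall": recall_score(y_test, y_pred, pos_label=0),
    "f1": f1_score(y_test, y_pred, pos_label=0),
    "roc_auc": roc_auc_score((y_test == 0).astype(int), prob_malignant),
}

print("best params:", search.best_params_)
print("features kept by RFECV:", n_selected)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Swapping the `clf` step for `SVC(probability=True)`, `LogisticRegression`, or `GradientBoostingClassifier`, and `GridSearchCV` for `RandomizedSearchCV`, would cover the other combinations the abstract names. Treating malignant as the positive class is a design choice made here so that recall reflects the share of malignant tumors detected.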
Keywords:
breast cancer Wisconsin dataset, EDA, Recursive Feature Elimination with Cross-Validation (RFECV), precision, recall, F1-score, ROC-AUC
License
Copyright (c) 2025 G. Naganandini, Vishwanath R. Hulipalled

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.