Swin-Conv: A Hybrid Swin Transformer and Convolutional Network for Brain Tumor Segmentation

Khadijah; Helmie Arif Wibawa; Ragil Saputra; Rismiyati; Sandy Kurniawan; Mohd Hanafi Bin Ahmad Hijazi

doi:10.48084/etasr.18002

Authors

Khadijah | Department of Informatics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia | https://orcid.org/0000-0003-4968-5808
Helmie Arif Wibawa | Department of Informatics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia | https://orcid.org/0000-0003-1263-373X
Ragil Saputra | Department of Informatics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia | https://orcid.org/0000-0003-2732-2037
Rismiyati | Department of Informatics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia | https://orcid.org/0000-0003-0384-0083
Sandy Kurniawan | Department of Informatics, Faculty of Science and Mathematics, Universitas Diponegoro, Indonesia | https://orcid.org/0000-0003-0607-6949
Mohd Hanafi Bin Ahmad Hijazi | Faculty of Computing and Informatics, Universiti Malaysia Sabah, Malaysia | https://orcid.org/0000-0003-0431-8967

Volume: 16 | Issue: 3 | Pages: 36447-36455 | June 2026 | https://doi.org/10.48084/etasr.18002

Received: 6 February 2026 | Revised: 9 April 2026 | Accepted: 19 April 2026 | Online: 5 June 2026

Corresponding author: Khadijah

Abstract

Accurate segmentation of brain Magnetic Resonance Imaging (MRI) scans plays a crucial role in identifying the boundaries of abnormal regions associated with brain tumors, thereby facilitating more precise diagnosis and treatment planning. Previous studies have proposed automatic segmentation methods based on deep learning. Convolution-based architectures are effective at spatial localization but are limited in capturing global contextual information, whereas transformer-based architectures can model long-range dependencies but often require substantial computational resources and may struggle to preserve fine-grained spatial details. To address these challenges, this research proposes Swin-Conv, a hybrid U-Net-based architecture consisting of a Swin Transformer encoder that captures the global context of an image and a convolutional decoder that preserves spatial localization during image reconstruction. Furthermore, the effectiveness of standard convolution and Mobile Inverted Bottleneck Convolution (MBConv) in the decoder is investigated across four Swin Transformer variants (Tiny, Small, Base, and Large). Experimental results on the public Low-Grade Glioma (LGG) brain MRI segmentation dataset demonstrate that the best performance is obtained by the Swin-Conv model with a standard convolutional decoder and the Swin-S (Small) encoder. Comparative experiments with baseline models indicate that Swin-Conv achieves competitive performance with reasonable computational complexity. These findings indicate that Swin-Conv effectively combines the benefits of a Swin Transformer encoder and a convolutional decoder to produce precise brain image segmentation efficiently, making it suitable for applied medical scenarios.
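The two decoder choices compared in the abstract, standard convolution and MBConv, differ in how their weights are laid out. The following is a minimal sketch of that difference, not the authors' implementation: it counts the weights of one standard k x k convolution and of one MBConv block (1x1 expansion, k x k depthwise convolution, 1x1 projection). The function names, the channel widths (256 in, 128 out), and the expansion ratios are hypothetical values chosen for illustration; squeeze-and-excitation, normalization, and bias terms are left out.

```python
def conv_params(c_in, c_out, k=3):
    """Weights in one standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def mbconv_params(c_in, c_out, k=3, expand=4):
    """Weights in one MBConv block (no bias, no squeeze-and-excitation).

    Layout: 1x1 expansion -> k x k depthwise -> 1x1 projection.
    `expand` is an assumed expansion ratio, not a value from the paper.
    """
    c_mid = c_in * expand
    expand_w = c_in * c_mid      # 1x1 pointwise expansion
    depthwise_w = k * k * c_mid  # one k x k filter per expanded channel
    project_w = c_mid * c_out    # 1x1 pointwise projection
    return expand_w + depthwise_w + project_w


if __name__ == "__main__":
    # Hypothetical decoder stage: 256 input channels, 128 output channels.
    print(conv_params(256, 128))                # 294912
    print(mbconv_params(256, 128, expand=1))    # 100608
    print(mbconv_params(256, 128, expand=4))    # 402432
```

With these assumed widths, MBConv has fewer weights than a single standard convolution at an expansion ratio of 1 and more at a ratio of 4, so the relative cost of the two decoders depends on the configuration actually used.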

Keywords

brain tumor, deep learning, image segmentation, Swin Transformer, convolutional network
