NetPhish-Mix: A Multi-Modal Phishing Detection Method Utilizing URL Graphs and Page Screenshot Vision Transformer

Authors

  • Awwab Mohammad, Department of Computer Science and Technology, Manav Rachna University, Faridabad, India
  • N. Praveen, Department of Computer Science and Engineering, BMS College of Engineering, Bengaluru, India
  • Pandiarajan S, Department of Computer Science and Engineering, KIT–Kalaignarkarunanidhi Institute of Technology, Coimbatore, India
  • R. Shreeshayana, Department of Electrical and Electronics Engineering, ATME College of Engineering, Mysuru, India
  • Samudrala Jagadeesh, Department of Electronics and Communication Engineering, Aditya University, Andhra Pradesh, India
  • Anjali Raj, Department of Personnel Management & Industrial Relations, Patna University, Patna, India
  • Basavaraj Patil, School of Computer Science and Engineering, RV University, Bengaluru, India
  • Yogesh H. Bhosale, Department of Computer Science and Engineering, CSMSS Chh. Shahu College of Engineering, Chhatrapati Sambhajinagar (Aurangabad), MH, India
  • Sanjana M. Nagaraj, Department of Computer Science and Business Systems, Dayananda Sagar College of Engineering, Bengaluru, India
  • D. Anil, Department of Computer Science and Business Systems, Dayananda Sagar College of Engineering, Bengaluru, India
Volume: 16 | Issue: 1 | Pages: 31209-31214 | February 2026 | https://doi.org/10.48084/etasr.15759

Abstract

To evade detection, modern phishing attacks use techniques such as dynamic HTML obfuscation and site mimicry. This study presents NetPhish-Mix, a multimodal framework that detects phishing attempts by jointly analyzing website structure, content, and visual design. A Heterogeneous Graph Transformer (HGT) captures the structural relationships among the framework's components, including domain, URL, and DOM nodes. A Vision Transformer (ViT) extracts layout-aware visual semantics from page screenshots. After temperature calibration, a gated late-fusion step adjusts the contribution of each modality as needed to produce trustworthy confidence estimates. The results show strong generalization on distinct, previously withheld datasets, with an F1-score of 0.977, an ROC-AUC of 0.997, and a false positive rate below 1%. NetPhish-Mix consistently makes correct decisions in tests that include URL homoglyphs, additional characters in subdomains, and low-quality images, making it a reliable, interpretable, and practical security automation solution for identifying and stopping phishing attempts.
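
The calibration and fusion step described above can be illustrated with a minimal sketch. The abstract does not give the gate's functional form, the temperatures, or any weights, so everything below is an assumption made for illustration: the function names (calibrate, gated_fusion), the temperature values, and a hypothetical gate that leans toward whichever branch has the larger temperature-scaled logit magnitude. It is not the authors' implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def calibrate(logit: float, temperature: float) -> float:
    """Temperature scaling: divide the logit by T > 0, then apply the sigmoid.

    T > 1 softens over-confident scores; T = 1 leaves them unchanged.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return sigmoid(logit / temperature)


def gated_fusion(
    graph_logit: float,
    vision_logit: float,
    t_graph: float = 1.5,    # assumed temperature for the graph branch
    t_vision: float = 2.0,   # assumed temperature for the screenshot branch
    gate_scale: float = 0.8, # assumed gate sharpness (hypothetical)
) -> tuple[float, float]:
    """Return (fused phishing probability, gate weight on the graph branch)."""
    p_graph = calibrate(graph_logit, t_graph)
    p_vision = calibrate(vision_logit, t_vision)

    # Hypothetical gate: compare how far each scaled logit is from the
    # decision boundary and weight the more decisive branch more heavily.
    margin_graph = abs(graph_logit / t_graph)
    margin_vision = abs(vision_logit / t_vision)
    gate = sigmoid(gate_scale * (margin_graph - margin_vision))

    # Late fusion: convex combination of the two calibrated probabilities.
    fused = gate * p_graph + (1.0 - gate) * p_vision
    return fused, gate


if __name__ == "__main__":
    # Graph branch is decisive, screenshot branch is near the boundary.
    fused, gate = gated_fusion(graph_logit=4.0, vision_logit=-0.5)
    print(f"gate on graph branch: {gate:.3f}, fused probability: {fused:.3f}")
```

Because the fused score is a convex combination, it always lies between the two calibrated branch probabilities; in a trained system the gate would typically be learned from data rather than fixed by a hand-set rule like the one assumed here.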

Keywords:

multimodal learning, phishing detection, heterogeneous graph transformer, vision transformer, gated fusion, robust cybersecurity

How to Cite

A. Mohammad et al., “NetPhish-Mix: A Multi-Modal Phishing Detection Method Utilizing URL Graphs and Page Screenshot Vision Transformer”, Eng. Technol. Appl. Sci. Res., vol. 16, no. 1, pp. 31209–31214, Feb. 2026.
