A Transformer-Based Optical Character Recognition Framework with Unified Residual Recurrent Networks for Multilingual Handwritten Documents
Corresponding author: Dadapeer
Abstract
Multilingual handwritten Optical Character Recognition (OCR) faces major challenges due to diverse writing styles, script variations, and limited generalization across languages. Existing OCR systems often fail to handle multilingual handwritten data efficiently, resulting in poor segmentation and recognition accuracy. To overcome these challenges, this paper proposes TrOCR-URRNN-MLD, a Transformer-based OCR architecture with a Unified Residual Recurrent Neural Network (URRNN). The proposed framework integrates Local Sample-Weighted Multiple Kernel Clustering (LSMKC) for text segmentation and Adaptive Multi-scale Gaussian Co-occurrence Filtering (AMGCF) for noise suppression and clarity enhancement. A Feature-Affine Residual Network (FA-ResNet) embedded in the Transformer extracts robust spatial-semantic features, while the URRNN with Connectionist Temporal Classification (CTC) models sequential dependencies. In addition, the Secretary Bird Optimization Algorithm (SBOA) tunes the model parameters. Experiments on the IAM and Kannada Char74k datasets show that TrOCR-URRNN-MLD surpasses existing OCR models in accuracy, precision, recall, and F1-score.
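To illustrate the CTC step mentioned in the abstract, the following minimal Python sketch shows best-path (greedy) CTC decoding: take the top-scoring class at each time step, merge consecutive repeats, then drop the blank symbol. It is only an illustration under stated assumptions. The function name, the toy alphabet, and the per-frame scores are hypothetical and are not taken from the paper, whose decoder may work differently.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Best-path CTC decoding (hypothetical helper, not the paper's code).

    frame_scores: one list of class scores per time step.
    alphabet:     symbol for each class index; alphabet[blank] is a placeholder.
    """
    # Pick the highest-scoring class index at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Merge runs of identical consecutive indices into a single index.
    collapsed = [index for index, _ in groupby(best_path)]
    # Remove blanks; a blank between two equal labels keeps them distinct.
    return "".join(alphabet[index] for index in collapsed if index != blank)


# Toy example (assumed values): class 0 is the CTC blank.
alphabet = ["-", "a", "b"]
frames = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.7, 0.2],    # a (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.2, 0.7, 0.1],    # a
    [0.1, 0.2, 0.7],    # b
]
decoded = ctc_greedy_decode(frames, alphabet)
print(decoded)  # aab
```

The blank frame is what lets the decoder emit a doubled letter: without it, the three "a" frames would collapse into a single "a".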
Keywords:
handwritten Optical Character Recognition (OCR), transformer architecture, Unified Residual Recurrent Neural Network (URRNN), multilingual document recognition, feature-affine residual network (FA-ResNet), Secretary Bird Optimization Algorithm (SBOA)
License
Copyright (c) 2025 Dadapeer, Yeresime Suresh

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
