Enhancing Neural Arabic Machine Translation using Character-Level CNN-BILSTM and Hybrid Attention
Received: 12 July 2024 | Revised: 30 July 2024 | Accepted: 4 August 2024 | Online: 23 August 2024
Corresponding author: Dhaya Eddine Messaoudi
Abstract
Neural Machine Translation (NMT) has made significant strides in recent years, especially with the advent of deep learning, which has greatly enhanced performance across various Natural Language Processing (NLP) tasks. Despite these advances, NMT still falls short of perfect translation and faces ongoing challenges such as limited training data, rare words, and syntactic and semantic dependencies. This study introduces a multichannel character-level NMT model with hybrid attention for Arabic-English translation. The proposed approach addresses rare words and word alignment by encoding the input at the character level and feeding three channels (characters, Arabic word segmentation as handcrafted features, and part-of-speech tags) into a multichannel CNN-BiLSTM encoder. A BiLSTM decoder with hybrid attention then generates the target-language sentences. The proposed model was tested on a subset of the OPUS-100 dataset, achieving promising results.
Keywords:
Arabic natural language processing, deep learning, machine translation, deep CNN Bi-LSTM, hybrid attention, PoS-tagging, Arabic word segmentation
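The multichannel input described in the abstract can be pictured as three aligned, character-level sequences: the characters themselves, a segmentation tag per character, and the part-of-speech tag of the word each character belongs to. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation: the function names `build_channels` and `to_ids`, the B/I/O tagging scheme, and the hand-written segmentation and PoS tags of the toy sentence are all hypothetical, since the abstract does not specify how the channels are built.

```python
from typing import Dict, List, Tuple

# One word = (morphological segments, part-of-speech tag).
# Assumption: both come from external tools; here they are written by hand.
Word = Tuple[List[str], str]


def build_channels(words: List[Word]) -> Tuple[List[str], List[str], List[str]]:
    """Return three aligned character-level channels for one sentence."""
    chars: List[str] = []
    seg_tags: List[str] = []
    pos_tags: List[str] = []
    for i, (segments, pos) in enumerate(words):
        if i > 0:
            # Mark the word boundary explicitly in every channel.
            chars.append(" ")
            seg_tags.append("O")
            pos_tags.append("O")
        for segment in segments:
            for j, ch in enumerate(segment):
                chars.append(ch)
                # "B" opens a segment, "I" continues it (hypothetical scheme).
                seg_tags.append("B" if j == 0 else "I")
                # Every character inherits the tag of its word.
                pos_tags.append(pos)
    return chars, seg_tags, pos_tags


def to_ids(symbols: List[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map symbols to integer ids in order of first appearance (0 is left for padding)."""
    vocab: Dict[str, int] = {}
    ids = [vocab.setdefault(s, len(vocab) + 1) for s in symbols]
    return ids, vocab


# Toy sentence: two words, each split into a prefix and a stem.
sentence: List[Word] = [
    (["و", "كتب"], "VERB"),
    (["ال", "ولد"], "NOUN"),
]

chars, seg_tags, pos_tags = build_channels(sentence)
char_ids, char_vocab = to_ids(chars)
seg_ids, seg_vocab = to_ids(seg_tags)
pos_ids, pos_vocab = to_ids(pos_tags)

for row in zip(chars, seg_tags, pos_tags):
    print(row)
```

Each of the three id sequences has the same length, so in a model of this kind each could be embedded separately and passed to its own encoder channel before the channels are combined. How the paper merges them is not stated in the abstract.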
License
Copyright (c) 2024 Dhaya Eddine Messaoudi, Djamel Nessah
This work is licensed under a Creative Commons Attribution 4.0 International License.