English-Vietnamese Cross-Lingual Paraphrase Identification Using MT-DNN
Received: 18 June 2021 | Revised: 6 August 2021 | Accepted: 16 August 2021 | Online: 22 September 2021
Corresponding author: H. V. T. Chi
Abstract
Paraphrase identification is a crucial task in natural language understanding, especially in cross-language information retrieval. The Multi-Task Deep Neural Network (MT-DNN) has become a state-of-the-art method that achieves outstanding results in paraphrase identification [1]. In this paper, a method based on MT-DNN [2] is proposed to detect similarities between English and Vietnamese sentences. We replaced the shared layers of the original MT-DNN, which use BERT [3], with pre-trained multilingual models such as M-BERT [3] or XLM-R [4], so that our model could work on cross-language (in our case, English and Vietnamese) information retrieval. We also added some tasks as improvements to gain better results. As a result, accuracy and F1 increased by 2.3% and 2.5%, respectively. The proposed method was also applied to other language pairs, namely English-German and English-French, yielding a 1.0%/0.7% improvement for English-German and a 0.7%/0.5% improvement for English-French.
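For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how accuracy and F1 are conventionally computed for binary paraphrase labels. The function name `accuracy_and_f1`, the label encoding (1 = paraphrase, 0 = non-paraphrase), and the toy labels are assumptions made for illustration; they are not taken from the paper's evaluation code or data.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels, assuming 1 = paraphrase."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("at least one example is required")

    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    correct = sum(1 for g, p in pairs if g == p)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); returned as 0.0 when the denominator is zero.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical labels for six sentence pairs (illustration only).
gold = [1, 0, 1, 1, 0, 0]
pred = [1, 0, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(gold, pred)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

F1 is usually reported alongside accuracy because it is less sensitive to an imbalance between paraphrase and non-paraphrase pairs.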
Keywords:
MT-DNN, BERT, XLM-R, English, Vietnamese, cross-language, paraphrase identification
References
A. Amaral, "Paraphrase Identification and Applications in Finding Answers in FAQ Databases," 2013. [Online]. Available: https://fenix.tecnico.ulisboa.pt/downloadFile/395145918749/resumo.pdf.
X. Liu, P. He, W. Chen, and J. Gao, "Multi-Task Deep Neural Networks for Natural Language Understanding," in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, Jul. 2019, pp. 4487-4496. https://doi.org/10.18653/v1/P19-1441
J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding," arXiv:1810.04805 [cs], May 2019, Accessed: Aug. 26, 2021. [Online]. Available: http://arxiv.org/abs/1810.04805.
A. Conneau et al., "Unsupervised Cross-lingual Representation Learning at Scale," in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, Jul. 2020, pp. 8440-8451. https://doi.org/10.18653/v1/2020.acl-main.747
L. T. Nguyen and D. Dien, "English-Vietnamese Cross-Language Paraphrase Identification Method," in Proceedings of the Eighth International Symposium on Information and Communication Technology, New York, NY, USA, Dec. 2017, pp. 42-49. https://doi.org/10.1145/3155133.3155187
D. Dinh and N. Le Thanh, "English-Vietnamese cross-language paraphrase identification using hybrid feature classes," Journal of Heuristics, Apr. 2019. https://doi.org/10.1007/s10732-019-09411-2
M. Mohamed and M. Oussalah, "A hybrid approach for paraphrase identification based on knowledge-enriched semantic heuristics," Language Resources and Evaluation, vol. 54, no. 2, pp. 457-485, Jun. 2020. https://doi.org/10.1007/s10579-019-09466-4
U. Khan, K. Khan, F. Hassan, A. Siddiqui, and M. Afaq, "Towards Achieving Machine Comprehension Using Deep Learning on Non-GPU Machines," Engineering, Technology & Applied Science Research, vol. 9, no. 4, pp. 4423-4427, Aug. 2019. https://doi.org/10.48084/etasr.2734
S. Mandava, S. Migacz, and A. F. Florea, "Pay Attention when Required," arXiv:2009.04534 [cs], May 2021, Accessed: Aug. 26, 2021. [Online]. Available: http://arxiv.org/abs/2009.04534.
B. Ahmed, G. Ali, A. Hussain, A. Baseer, and J. Ahmed, "Analysis of Text Feature Extractors using Deep Learning on Fake News," Engineering, Technology & Applied Science Research, vol. 11, no. 2, pp. 7001-7005, Apr. 2021. https://doi.org/10.48084/etasr.4069
R. Mihalcea, C. Corley, and C. Strapparava, "Corpus-based and knowledge-based measures of text semantic similarity," in Proceedings of the 21st national conference on Artificial intelligence, Boston, MA, USA, Jul. 2006, vol. 1, pp. 775-780.
W. Yin and H. Schütze, "Convolutional Neural Network for Paraphrase Identification," in Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, CO, USA, May 2015, pp. 901-911. https://doi.org/10.3115/v1/N15-1091
H. Shahmohammadi, M. Dezfoulian, and M. Mansoorizadeh, "Paraphrase detection using LSTM networks and handcrafted features," Multimedia Tools and Applications, vol. 80, no. 4, pp. 6479-6492, Feb. 2021. https://doi.org/10.1007/s11042-020-09996-y
R. Caruana, "Multitask Learning," Machine Learning, vol. 28, no. 1, pp. 41-75, Jul. 1997. https://doi.org/10.1023/A:1007379606734
M. Crawshaw, "Multi-Task Learning with Deep Neural Networks: A Survey," arXiv:2009.09796 [cs, stat], Sep. 2020, Accessed: Aug. 26, 2021. [Online]. Available: http://arxiv.org/abs/2009.09796.
A. Warstadt, A. Singh, and S. R. Bowman, "Neural Network Acceptability Judgments," Transactions of the Association for Computational Linguistics, vol. 7, pp. 625-641, Mar. 2019. https://doi.org/10.1162/tacl_a_00290
E. F. Tjong Kim Sang and F. De Meulder, "Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition," in Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, 2003, pp. 142-147. https://doi.org/10.3115/1119176.1119195
H. T. M. Nguyen, Q. T. Ngo, L. X. Vu, V. M. Tran, and H. T. T. Nguyen, "VLSP Shared Task: Named Entity Recognition," Journal of Computer Science and Cybernetics, vol. 34, no. 4, pp. 283-294, 2018. https://doi.org/10.15625/1813-9663/34/4/13161
A. Breit, A. Revenko, K. Rezaee, M. T. Pilehvar, and J. Camacho-Collados, "WiC-TSV: An Evaluation Benchmark for Target Sense Verification of Words in Context," in Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Online, Apr. 2021, pp. 1635-1645. https://doi.org/10.18653/v1/2021.eacl-main.140
I. Hendrickx et al., "SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations between Pairs of Nominals," in Proceedings of the 5th International Workshop on Semantic Evaluation, Uppsala, Sweden, Jul. 2010, pp. 33-38. https://doi.org/10.3115/1621969.1621986
A. Wang, A. Singh, J. Michael, F. Hill, O. Levy, and S. Bowman, "GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding," in Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, Brussels, Belgium, Nov. 2018, pp. 353-355. https://doi.org/10.18653/v1/W18-5446
License
Copyright (c) 2021 H. V. T. Chi, D. L. Anh, N. L. Thanh, D. Dinh
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.