Neural Machine Translation and Statistical Machine Translation: Perception and Productivity

Authors

Abstract

Advances in neural machine translation (NMT) engines have transformed the machine translation field, especially in comparison with the results obtained with statistical machine translation (SMT) engines. It is therefore necessary to review not only how MT is used but also how it is perceived by its end users, the translators. The main objective of this study is to determine the perception and the productivity, in terms of time and edit distance, of a group of translators using SMT and NMT systems. Via the TAUS Dynamic Quality Framework platform, ten professional translators first evaluated raw machine translation segments from two different texts – a user guide and a marketing text – produced by the Microsoft Translation Engine (SMT) and Google Neural Machine Translation (NMT). Six of the ten translators subsequently completed two post-editing productivity tests measuring time and edit distance. The results show that translators perceive the NMT system as more productive: according to their perception, it would take less time to post-edit and would require fewer edits. However, when these results are compared with those obtained in the productivity tests, although the edit distance was shorter with the SMT engine than with the NMT engine, the post-editing time was much longer for the neural engine.

Keywords

Neural Machine Translation, Statistical Machine Translation, productivity, edit distance, post-editing time, perception, post-editing

References

Aranberri, N. (2014). Posedición, productividad y calidad. Tradumàtica: Tecnologies de la Traducció, n. 12, pp. 471–477. <http://revistes.uab.cat/tradumatica/article/view/n12-aranberri>. <https://doi.org/10.5565/rev/tradumatica.62>

Bentivogli, L.; Bisazza, A.; Cettolo, M.; Federico, M. (2016). Neural versus Phrase-Based Machine Translation Quality: a Case Study, in: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Austin, Texas: Association for Computational Linguistics, pp. 257-267. <https://www.aclweb.org/anthology/D16-1025.pdf>. <https://doi.org/10.18653/v1/D16-1025>.

Esperança-Rodier, E.; Rossi, C.; Bérard, A.; Besacier, L. (2017). Evaluation of NMT and SMT Systems: A Study on Uses and Perceptions, in: Proceedings of the 39th Conference Translating and the Computer. London: AsLing, pp. 11-24. <https://www.semanticscholar.org/paper/Evaluation-of-NMT-and-SMT-Systems%3A-A-Study-on-Uses-Esperan%C3%A7a-Rodier-Berard/58b0ffc3892f0f24594aa424a60de0f18aab970a>.

Görög, A. (2014). Quality evaluation today: the Dynamic Quality Framework, in: Proceedings of Translating and the Computer 36: ASLING: Proceedings. Geneva: Tradulex, pp. 155-164. <http://www.tradulex.com/varia/TC36-london2014.pdf>.

Guerberof, A. (2009). Productivity and quality in MT post-editing, in: Proceedings of the Twelfth Machine Translation Summit (MT Summit XII), Beyond Translation Memories: New Tools for Translators Workshop. Ottawa, Canada: MT Summit. <http://www.mt-archive.info/MTS-2009-Guerberof.pdf>.

Guerberof, A. (2013). What do professional translators think about post-editing? The Journal of Specialised Translation, n. 19, pp. 75–95. <http://www.jostrans.org/issue19/art_guerberof.php>.

Katan, D. (2016). Translation at the cross-roads: Time for the transcreational turn? Perspectives. Studies in Translatology, v. 24, n. 3, pp. 365–381. <https://doi.org/10.1080/0907676X.2015.1016049>

Koponen, M. (2012). Comparing human perceptions of post-editing effort with post-editing operations, in: Proceedings of the 7th Workshop on Statistical Machine Translation. Montreal, Canada: Association for Computational Linguistics, pp. 181-190. <https://www.aclweb.org/anthology/W12-3123>.

Krings, H. P. (2001). Repairing texts. Kent: Kent State University Press.

Lohar, P.; Popovic, M.; Afli, H.; Way, A. (2019). A Systematic Comparison Between SMT and NMT on Translating User-Generated Content, in: Proceedings of CICLing 2019, the 20th International Conference on Computational Linguistics and Intelligent Text Processing, La Rochelle, France. <https://www.computing.dcu.ie/~away/PUBS/2019/A_Systematic_Comparison_Between_SMT_and_NMT_on_Translating_User_Generated_Content.pdf>

Moorkens, J. (2017). Under pressure: translation in times of austerity. Perspectives, v. 25, n. 3, pp. 464-477. <https://doi.org/10.1080/0907676X.2017.1285331>.

Moorkens, J.; Toral, A.; Castilho, S.; Way, A. (2018). Translators’ perceptions of literary post-editing using statistical and neural machine translation. Translation Spaces, v. 7, n. 2, pp. 240-262. <https://doi.org/10.1075/ts.18014.moo>.

Papineni, K.; Roukos, S.; Ward, T.; Zhu, W. J. (2002). Bleu: A Method for Automatic Evaluation of Machine Translation, in: Proceedings of 40th Annual Meeting of the Association for Computational Linguistics. Philadelphia: Association for Computational Linguistics, pp. 311-318. <https://doi.org/10.3115/1073083.1073135>.

Shterionov, D.; Nagle, P.; Casanellas, L.; Superbo, R.; O’Dowd, T. (2017). Empirical evaluation of NMT and PBSMT quality for large-scale translation production, in: Proceedings of the Annual Conference of the European Association for Machine Translation (EAMT): User Track. Prague: European Association for Machine Translation, pp. 74-79. <https://ufal.mff.cuni.cz/eamt2017/user-project-product-papers/papers/user/EAMT2017_paper_83.pdf>.

Snover, M.; Dorr, B.; Schwartz, R.; Micciulla, L.; Makhoul, J. (2006). A study of translation edit rate with targeted human annotation, in: Proceedings of the 7th Biennial Conference of the Association for Machine Translation in the Americas (AMTA-2006). Cambridge, Massachusetts: Association for Machine Translation in the Americas.

Torres-Hostench, O.; Presas, M.; Cid-Leal, P. (2016). El uso de traducción automática y posedición en las empresas de servicios lingüísticos españolas: informe de investigación ProjecTA 2015. Bellaterra, Cerdanyola del Vallès. <https://ddd.uab.cat/record/166753>.

Vilar, D.; Xu, J.; D’Haro L. F.; et al. (2006). Error analysis of statistical machine translation output, in: Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06). Genoa, Italy: European Language Resources Association, pp. 697-702. <http://www.lrec-conf.org/proceedings/lrec2006/pdf/413_pdf.pdf>.

Way, A. (2013). Traditional and Emerging Use-Cases for Machine Translation, in: Proceedings of Translating and the Computer 35. London. <https://www.computing.dcu.ie/~away/PUBS/2013/Way_ASLIB_2013.pdf>.

Wołk, K.; Koržinek, D. (2017). Comparison and Adaptation of Automatic Evaluation Metrics for Quality Assessment of Re-Speaking. Computer Science, v. 18, n. 2, p. 129. <https://doi.org/10.7494/csci.2017.18.2.129>.

Published

2023-03-07