What do post-editors correct? A detailed analysis of errors in statistical machine translation (SMT) and neural machine translation (NMT)

Authors

Sergi Alvarez-Vidal, Antoni Oliver, Toni Badia

Abstract

Recent improvements in neural MT (NMT) have driven a shift from statistical MT (SMT) to NMT. However, to assess how useful MT models are for post-editing (PE), it is essential to analyse the most frequent errors and how they affect the task. We present a pilot study consisting of a detailed analysis of MT errors, based on post-editing corrections of a medical text translated from English into Spanish with SMT and NMT. We used the MQM taxonomy to compare the two MT models and classified the errors they produced. Our analysis also includes an evaluation of variation among post-editors, focusing on the passages where post-editing varied most.

Keywords

machine translation, MT, NMT, post-editing, neural machine translation, error taxonomy

Author biographies

Sergi Alvarez-Vidal, Universitat Pompeu Fabra

Adjunct Professor, Universitat Pompeu Fabra

Antoni Oliver, Universitat Oberta de Catalunya

Associate Professor

Toni Badia, Universitat Pompeu Fabra

Emeritus Professor

Published

2021-12-31
