A Human-machine Cooperation Protocol for Machine Translation Output Edit Annotation
Abstract
We report on a study exploring automatic edit annotation in a post-editing corpus, using a new method for computing edit types. We examine how edit types are associated with the quality scores assigned to the machine translation output and to the post-edited texts. Finally, we discuss shortcomings of our method and point out edit types worth leveraging.
Keywords
machine translation, human post-editing, automatic error analysis, human-machine cooperation
Copyright (c) 2021 Felipe de Almeida Costa, Thiago Castro Ferreira, Adriana S Pagano, Wagner Meira, Jr.

This work is licensed under a Creative Commons Attribution 4.0 International License.