The Riddle of (Literary) Machine Translation Quality
Abstract
This study gauges the reliability and validity of metrics and algorithms for evaluating the quality of machine translation in a literary context. Ten machine-translated versions of a literary story, produced by four different MT engines over a period of three years, are compared using two quantitative quality estimation scores (BLEU and a recently developed literariness algorithm). The comparative analysis provides insight not only into the quality of stylistic and narratological features of machine translation, but also into more traditional quality criteria, such as accuracy and fluency. The evaluations are found not always to agree, and they lack nuance. This suggests that metrics and algorithms cover only part of the notion of “quality”, and that a more fine-grained approach is needed if the potential literary quality of machine translation is to be captured, and possibly validated, using those instruments.
Keywords
literary machine translation, quality, literariness, automated metrics, machine learning
Copyright (c) 2023 Gys-Walt van Egdom, Onno Kosters, Christophe Declercq

This work is licensed under a Creative Commons Attribution 4.0 International License.