Lexical impoverishment in neural machine translation: an automatic and human study of the translation of nouns between Spanish and English

Authors

  • Míriam Espinosa Giménez
  • Mikel L. Forcada, Universitat d'Alacant

Abstract

When machine translation processes sentences in isolation, the information needed to translate a word is often missing or cannot be extracted. We show that the diversity of the Spanish equivalents of English nouns, as distributed in the training set, is reduced in machine-translated output. A survey of translators reveals that this reduction is often due to a lack of source context.
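As an illustration of what a reduction in the diversity of equivalents can mean in practice, the sketch below compares the Shannon entropy of two distributions of Spanish equivalents for a single English noun. This is only one common way of quantifying diversity; the abstract does not state which measure the authors use. The function name and the word counts are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import log2


def shannon_entropy(equivalents):
    """Return the Shannon entropy (in bits) of a list of observed equivalents."""
    counts = Counter(equivalents)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Hypothetical counts of Spanish equivalents of the English noun "time".
# These numbers are invented for illustration only.
reference_equivalents = ["tiempo"] * 4 + ["vez"] * 2 + ["hora"] * 2
mt_equivalents = ["tiempo"] * 7 + ["vez"] * 1

h_reference = shannon_entropy(reference_equivalents)
h_mt = shannon_entropy(mt_equivalents)

print(f"reference entropy: {h_reference:.3f} bits")
print(f"MT output entropy: {h_mt:.3f} bits")
```

A lower entropy for the machine-translated side means the output concentrates on fewer equivalents, which is the kind of distributional narrowing the abstract describes.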

Keywords

machine translation, Spanish–English, lexical impoverishment, lexical equivalent, distribution distortion


Published

2024-12-31
