Enhancing consecutive interpreting with ASR: Sight-Terp as a computer-assisted interpreting tool
Abstract
This experimental study investigates the potential impact of using automatic speech recognition (ASR) and speech translation (ST) in consecutive interpreting (CI) through a computer-assisted interpreting (CAI) tool. The CAI tool used is Sight-Terp, an ASR-supported tool designed and developed by the first author of this study. Sight-Terp offers multiple features, including ASR, real-time machine translation, entity highlighting, and automatically enumerated segmentation. The study adopts a within-subjects design, evaluating participants' performance with and without Sight-Terp on a tablet. Twelve participants were recruited and asked to interpret four English speeches into Turkish in long consecutive mode: two using Sight-Terp and two with pen and paper. Data analysis is based on accuracy and fluency parameters. To capture the variation in accuracy between the two conditions, the accuracy metrics were based on the mean number of correctly interpreted semantic units (units of meaning) following Seleskovitch (1989). Fluency was quantified by tracking the frequency of disfluency markers in each session, including false starts, unnecessary pauses, fillers, whole-word repetitions, broken words, and incomplete sentences. The results show that integrating ASR into two consecutive interpreting tasks improved the accuracy of the participants' renditions. However, it also increased the frequency of disfluencies and lengthened the duration of their renditions compared with the tasks performed without Sight-Terp. The findings also point to potential areas for improvement and modifications that could further optimise the tool's utility. Future empirical studies with Sight-Terp may offer more insight into the viability of ASR in the interpreting process and into the cognitive aspects of human-machine interaction in consecutive interpreting.
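The two measures described in the abstract can be illustrated with a minimal sketch. It assumes each session has already been hand-annotated with a count of correctly rendered meaning units and with counts per disfluency type; the field names, the condition labels, and every number below are hypothetical and are not data or code from the study.

```python
from collections import defaultdict
from statistics import mean

# The six disfluency marker types named in the abstract.
DISFLUENCY_TYPES = (
    "false_start",
    "unnecessary_pause",
    "filler",
    "whole_word_repetition",
    "broken_word",
    "incomplete_sentence",
)


def summarize(sessions):
    """Return, per condition, the mean number of correctly rendered
    meaning units and the mean total disfluency count per session."""
    by_condition = defaultdict(list)
    for session in sessions:
        by_condition[session["condition"]].append(session)

    summary = {}
    for condition, rows in by_condition.items():
        summary[condition] = {
            "mean_correct_units": mean(r["correct_units"] for r in rows),
            "mean_disfluencies": mean(
                sum(r["disfluencies"].get(t, 0) for t in DISFLUENCY_TYPES)
                for r in rows
            ),
        }
    return summary


# Hypothetical, made-up sessions for illustration only (NOT study data).
example_sessions = [
    {"participant": "P01", "condition": "sight_terp", "correct_units": 40,
     "disfluencies": {"filler": 5, "false_start": 2}},
    {"participant": "P01", "condition": "pen_and_paper", "correct_units": 34,
     "disfluencies": {"filler": 3}},
    {"participant": "P02", "condition": "sight_terp", "correct_units": 38,
     "disfluencies": {"unnecessary_pause": 4, "filler": 1}},
    {"participant": "P02", "condition": "pen_and_paper", "correct_units": 36,
     "disfluencies": {"false_start": 1, "filler": 2}},
]

if __name__ == "__main__":
    for condition, stats in summarize(example_sessions).items():
        print(condition, stats)
```

Because the design is within-subjects, a fuller analysis would compare each participant's two conditions pairwise rather than only the condition means; the sketch stops at descriptive means.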
Keywords
computer-assisted interpreting, automatic speech recognition, interpreting technology, consecutive interpreting, note-taking, tablet interpreting
References
Andres, Dörte (2002). Konsekutivdolmetschen und Notation. Frankfurt am Main: Peter Lang. <https://openscience.ub.uni-mainz.de/handle/20.500.12030/1317>. [Accessed: 20241219].
Bérard, Alexandre; Pietquin, Olivier; Servan, Christophe; Besacier, Laurent (2016). Listen and translate: A proof of concept for end-to-end speech-to-text translation. arXiv preprint, arXiv:1612.01744. <https://arxiv.org/abs/1612.01744>. [Accessed: 20241219].
Chiu, Chung-Cheng; Sainath, Tara N.; Wu, Yonghui; Prabhavalkar, Rohit; Nguyen, Patrick; Chen, Zhifeng; Kannan, Anjuli; Weiss, Ron J.; Rao, Kanishka; Gonina, Ekaterina; Jaitly, Navdeep; Li, Bo; Chorowski, Jan; Bacchiani, Michiel (2018). State-of-the-art speech recognition with sequence-to-sequence models. In: Yvon, François; Hansen, Viggo (eds.). Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Calgary, AB, Canada: IEEE, pp. 4774–4778. <https://doi.org/10.1109/ICASSP.2018.8462105>. [Accessed: 20241219].
Coleman, Meri; Liau, T. L. (1975). A computer readability formula designed for machine scoring. Journal of Applied Psychology, v. 60, n. 2, pp. 283–284. <https://doi.org/10.1037/h0076540>. [Accessed: 20241219].
Cui, Leyang; Wu, Yu; Liu, Jian; Yang, Sen; Zhang, Yue (2021). Template-based named entity recognition using BART. arXiv preprint, arXiv:2106.01760. <https://arxiv.org/abs/2106.01760>. [Accessed: 20241219].
Defrancq, Bart; Fantinuoli, Claudio (2021). Automatic speech recognition in the booth: Assessment of system performance, interpreters’ performances and interactions in the context of numbers. Target. International Journal of Translation Studies, v. 33, n. 1, pp. 73–102. <https://benjamins.com/catalog/target.19166.def>. [Accessed: 20241219].
Dillinger, Mike (1994). Comprehension during interpreting: What do interpreters know that bilinguals don’t? In: Lambert, Sylvie; Moser-Mercer, Barbara (eds.). Bridging the Gap: Empirical Research in Simultaneous Interpretation. Amsterdam: John Benjamins, pp. 155–190. <https://doi.org/10.1075/btl.3.14dil>. [Accessed: 20241219].
Fantinuoli, Claudio (2017a). Speech recognition in the interpreter workstation. In: Proceedings of the 39th Conference Translating and the Computer. London, UK: Editions Tradulex, pp. 25–34. <https://www.staff.uni-mainz.de/fantinuo/download/publications/Speech%20Recognition%20in%20the%20Interpreter%20Workstation.pdf>. [Accessed: 20241219].
Fantinuoli, Claudio (2017b). Computer-assisted preparation in conference interpreting. Translation & Interpreting: The International Journal of Translation and Interpreting Research, v. 9, n. 2, pp. 24–37. <https://doi.org/10.12807/ti.109202.2017.a02>. [Accessed: 20241219].
Fantinuoli, Claudio (2018). Interpreting and technology: The upcoming technological turn. In: Fantinuoli, Claudio (ed.). Interpreting and Technology. Berlin: Language Science Press, pp. 1–12. <https://doi.org/10.5281/zenodo.1493289>. [Accessed: 20241219].
Flesch, Rudolf (1948). A new readability yardstick. Journal of Applied Psychology, v. 32, n. 3, pp. 221–233. <https://doi.org/10.1037/h0057532>. [Accessed: 20241219].
Gile, Daniel (2009). Basic Concepts and Models for Interpreter and Translator Training. Amsterdam: John Benjamins Publishing Company. <https://doi.org/10.1075/btl.8>. [Accessed: 20241219].
Gillies, Andrew (2017). Note-taking for Consecutive Interpreting: A Short Course. 2nd ed. London: Routledge. <https://doi.org/10.4324/9781315648996>. [Accessed: 20241219].
Gunning, Robert (1952). The Technique of Clear Writing. New York: McGraw-Hill.
Hansen-Schirra, Silvia (2012). Nutzbarkeit von Sprachtechnologien für die Translation. trans-kom, v. 5, n. 2, pp. 211–226. <https://www.trans-kom.eu/ihv_05_02_2012.html>. [Accessed: 20241219].
Herbert, Jean (1952). Manuel de l’interprète: Comment on devient interprète de conférences. Genève: Librairie de l’Université Genève.
Keraghel, Imed; Morbieu, Stanislas; Nadif, Mohamed (2024). A survey on recent advances in named entity recognition. arXiv preprint, arXiv:2401.10825. <https://arxiv.org/abs/2401.10825>. [Accessed: 20241219].
Kincaid, J. Peter; Fishburne, Robert P. Jr.; Rogers, Richard L.; Chissom, Brad S. (1975). Derivation of New Readability Formulas (Automated Readability Index, Fog Count, and Flesch Reading Ease Formula) for Navy Enlisted Personnel. Chief of Naval Technical Training. <https://doi.org/10.21236/ADA006655>. [Accessed: 20241219].
Korpal, Paweł; Stachowiak-Szymczak, Katarzyna (2019). Combined problem triggers in simultaneous interpreting: Exploring the effect of delivery rate on processing and rendering numbers. Perspectives, v. 28, n. 1, pp. 126–143. <https://doi.org/10.1080/0907676X.2019.1628285>. [Accessed: 20241219].
Lickley, Robin J. (2015). Fluency and Disfluency. In: Redford, Melissa A. (ed.). The Handbook of Speech Production. Hoboken, NJ: John Wiley & Sons, pp. 445–474. <https://doi.org/10.1002/9781118584156.ch20>. [Accessed: 20241219].
Lucas Rafael Stefanel Gris; Diogo Fernandes; Frederico Santos de Oliveira; Anderson da Silva Soares; Telma Woerle de Lima Soares; Arlindo Rodrigues Galvão (2024). Automatic Speech-to-Speech Translation of Educational Videos Using SeamlessM4T and Its Use for Future VR Applications. In: Proceedings of the 2024 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW). Orlando, FL, USA, pp. 163–166. <https://doi.org/10.1109/VRW62533.2024.00033>. [Accessed: 20241219].
McLaughlin, G. Harry (1969). SMOG grading: A new readability formula. Journal of Reading, v. 12, n. 8, pp. 639–646. <http://www.jstor.org/stable/40011226>. [Accessed: 20241219].
Müller, Markus; Nguyen, Thai Son; Niehues, Jan; Cho, Eunah; Krüger, Bastian; Ha, Thanh-Le; Kilgour, Kevin; Sperber, Matthias; Mediani, Mohammed; Stüker, Sebastian; Waibel, Alex (2016). Lecture Translator – Speech translation framework for simultaneous lecture translation. In: DeNero, John; Finlayson, Mark; Reddy, Sravana (eds.). Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations. San Diego, California: Association for Computational Linguistics, pp. 82–86. <https://doi.org/10.18653/v1/N16-3017>. [Accessed: 20241219].
Nozaki, Jumon; Kawahara, Tatsuya; Ishizuka, Kenkichi; Hashimoto, Taiichi (2022). End-to-End Speech-to-Punctuated-Text Recognition. arXiv preprint, arXiv:2207.03169. <https://arxiv.org/abs/2207.03169>. [Accessed: 20241219].
Orlando, Marc (2014). A study on the amenability of digital pen technology in a hybrid mode of interpreting: Consec-simul with notes. Translation and Interpreting, v. 6, n. 2, pp. 39–54. <https://doi.org/10.12807/ti.106202.2014.a03>.
Pisani, Elisabetta; Fantinuoli, Claudio (2021). Measuring the Impact of Automatic Speech Recognition on Number Rendition in Simultaneous Interpreting. In: Wang, Caiwen; Zheng, Binghan (eds.). Empirical Studies of Translation and Interpreting: The Post-Structuralist Approach. 1st ed. London: Routledge, pp. 181–197. <https://doi.org/10.4324/9781003017400-14>. [Accessed: 20241219].
Prandi, Bianca (2023). Computer-Assisted Simultaneous Interpreting: A Cognitive-Experimental Study on Terminology. Berlin: Language Science Press. <https://langsci-press.org/catalog/book/348>. [Accessed: 20241219].
Rodríguez González, Elena; Saeed, Muhammad Asad; Korybski, Tomasz; Davitti, Elena; Braun, Sabine (2023). Assessing the Impact of Automatic Speech Recognition on Remote Simultaneous Interpreting Performance Using the NTR Model. In: Corpas Pastor, Gloria; Hidalgo-Ternero, Carlos Manuel (eds.). Proceedings of the International Workshop on Interpreting Technologies - SAY IT AGAIN 2023, Málaga, Spain, 2–3 November 2023, pp. 177–186.
Rodriguez, Susana; Gretter, Roberto; Matassoni, Marco; Alonso, Alvaro; Corcho, Oscar; Rico, Mariano; Falavigna, Daniele (2021). SmarTerp: A CAI System to Support Simultaneous Interpreters in Real-Time. In: Mitkov, Ruslan; Sosoni, Vilelmini; Giguère, Julie Christine; Murgolo, Elena; Deysel, Elizabeth (eds.). Proceedings of the Translation and Interpreting Technology Online Conference (TRITON 2021). Held Online: INCOMA Ltd., pp. 102–109. <https://aclanthology.org/2021.triton-1.12>. [Accessed: 20241219].
Rozan, Jean-François (1956). La prise de notes en interprétation consécutive. Genève: Librairie de l’Université.
Saboo, Ashutosh; Baumann, Timo (2019). Integration of Dubbing Constraints into Machine Translation. In: Bojar, Ondřej; Chatterjee, Rajen; Federmann, Christian; Fishel, Mark; Graham, Yvette; Haddow, Barry; Huck, Matthias; Jimeno Yepes, Antonio; Koehn, Philipp; Martins, André; Monz, Christof; Negri, Matteo; Névéol, Aurélie; Neves, Mariana; Post, Matt; Turchi, Marco; Verspoor, Karin (eds.). Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers). Florence, Italy: Association for Computational Linguistics, pp. 94–101. <https://aclanthology.org/W19-5210>. [Accessed: 20241219].
Seleskovitch, Danica; Lederer, Marianne (1989). Pédagogie raisonnée de l’interprétation. Paris: Didier Érudition/OPOCE.
Smith, E. A.; Senter, R. J. (1967). Automated Readability Index. AMRL-TR-66-220. Wright-Patterson Air Force Base, Ohio: Aerospace Medical Research Laboratories, Aerospace Medical Division, Air Force Systems Command.
Tjong Kim Sang, Erik F.; De Meulder, Fien (2003). Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In: Daelemans, Walter; Osborne, Miles (eds.). Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003. Edmonton, Canada: Association for Computational Linguistics, pp. 142–147. <https://doi.org/10.3115/1119176.1119195>. [Accessed: 20241219].
Tommola, Jorma; Helevä, Marketta (1998). Language Direction and Source Text Complexity Effects on Trainee Performance in Simultaneous Interpreting. In: Bowker, Lynne; Cronin, Michael; Kenny, Dorothy; Pearson, Jennifer (eds.). Unity in Diversity: Current Trends in Translation Studies. Manchester: St. Jerome Publishing, pp. 177–186.
Van Cauwenberghe, Goran (2020). La reconnaissance automatique de la parole en interprétation simultanée: étude expérimentale de l’impact d’un soutien visuel automatisé sur la restitution de terminologie spécialisée. [Master’s thesis], Ghent University. Ghent. <https://lib.ugent.be/catalog/rug01:002862551>. [Accessed: 20241219].
Wang, Xinyu; Wang, Caiwen (2019). Can Computer-Assisted Interpreting Tools Assist Interpreting? Transletters: International Journal of Translation and Interpreting, n. 3, pp. 109–139. <https://journals.uco.es/tl/article/view/11575>. [Accessed: 20241219].
Weiss, Ron J.; Chorowski, Jan; Jaitly, Navdeep; Wu, Yonghui; Chen, Zhifeng (2017). Sequence-to-Sequence Models Can Directly Translate Foreign Speech. arXiv preprint, arXiv:1703.08581. <https://arxiv.org/abs/1703.08581>.
Xiong, Wayne; Wu, Lingfeng; Alleva, Frank; Droppo, Jeffrey; Huang, Xuedong; Stolcke, Andreas (2018). The Microsoft 2017 Conversational Speech Recognition System. In: Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Calgary, AB, Canada: IEEE, pp. 5934–5938. <https://doi.org/10.1109/ICASSP.2018.8461870>. [Accessed: 20241219].
Zhang, Yu; Park, Daniel S.; Han, Wei; Qin, James; Gulati, Anmol; Shor, Joel; Jansen, Aren; Xu, Yuanzhong; Huang, Yanping; Wang, Shibo; Zhou, Zongwei; Li, Bo; Ma, Min; Chan, William; Yu, Jiahui; Wang, Yongqiang; Cao, Liangliang; Sim, Khe Chai; Ramabhadran, Bhuvana; Sainath, Tara N.; Beaufays, Françoise; Chen, Zhifeng; Le, Quoc V.; Chiu, Chung-Cheng; Pang, Ruoming; Wu, Yonghui (2022). BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition. IEEE Journal of Selected Topics in Signal Processing, v. 16, n. 6, pp. 1519–1532. <https://doi.org/10.1109/JSTSP.2022.3182537>. [Accessed: 20241219].
Copyright (c) 2024 Cihan Ünlü, Aymil Doğan

This work is licensed under a Creative Commons Attribution 4.0 International License.