InLéctor: automatic creation of bilingual e-books

Authors

  • Antoni Oliver González, Universitat Oberta de Catalunya

Abstract

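This paper presents a system for the automatic creation of bilingual e-books from parallel texts. The system produces e-books in which each source-language sentence is linked to its corresponding target-language sentence. Users can read the original text and, by clicking on a sentence, view its translation. From there they can either continue reading the translation or return to the original by clicking on any sentence. Every sentence of the source-language book is aligned automatically with the target-language translation. The system does not use machine translation; it relies on published translations of the original work. We have built several bilingual e-books from classic novels and their public-domain translations, but the system can equally be used to align any original work and translation for which the user holds the rights. The system is aimed at readers with an intermediate to advanced level of the language who want to read works in the original. The paper also describes a procedure for building bilingual dictionaries from freely available lexical resources. Both resources, the e-books and the bilingual dictionaries, can be of great help to readers who want to read works in their original language.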
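
The sentence linking described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function name `link_aligned_sentences`, the file names `source.html` and `target.html`, and the sample sentence pairs are all hypothetical, and the sketch assumes that sentence alignment has already been done and yields one-to-one pairs.

```python
from html import escape


def link_aligned_sentences(pairs, src_file="source.html", tgt_file="target.html"):
    """Return (source_html, target_html) for a list of aligned sentence pairs.

    Each sentence becomes an anchor whose id is shared with its counterpart,
    so clicking a sentence jumps to the aligned sentence in the other file.
    File names are illustrative assumptions, not taken from the paper.
    """
    src_parts, tgt_parts = [], []
    for i, (src, tgt) in enumerate(pairs, start=1):
        anchor = f"s{i}"  # same id on both sides of the alignment
        src_parts.append(
            f'<a id="{anchor}" href="{tgt_file}#{anchor}">{escape(src)}</a>'
        )
        tgt_parts.append(
            f'<a id="{anchor}" href="{src_file}#{anchor}">{escape(tgt)}</a>'
        )
    return "\n".join(src_parts), "\n".join(tgt_parts)


# Hypothetical, hand-aligned sample data (English source, Spanish target).
pairs = [
    ("It was a cold night.", "Era una noche fría."),
    ("The street was empty.", "La calle estaba vacía."),
]

source_html, target_html = link_aligned_sentences(pairs)
print(source_html)
print(target_html)
```

Sharing one anchor id per aligned pair keeps navigation symmetric: the reader can switch language at any sentence and switch back from any other sentence, which matches the reading flow the abstract describes.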

Keywords

e-books; parallel texts; reading support

Author Biography

Antoni Oliver González, Universitat Oberta de Catalunya

Professor in the Arts and Humanities Studies at the Universitat Oberta de Catalunya and director of the master's degree in Specialised Translation.

Published

2017-12-28
