InLéctor: automatic creation of bilingual e-books to promote reading in a foreign language
Abstract
This paper presents a system for the automatic creation of parallel bilingual electronic books. The system produces e-books in which each source sentence is linked to the corresponding target sentence. Users can read the original and, by clicking on a given sentence, see the corresponding sentence in the target language. They can then continue reading the translation and return to the original version by clicking on a target-language sentence. The source-language book is automatically aligned at the sentence level with its target-language translation. The system does not use machine translation; instead, it shows the actual translation of the book. We have created several bilingual e-books from classic novels and their translations in the public domain, but the same system can be used for any book, provided the rights to both the original and the translation are held. The system is aimed at readers with a mid-to-high level in the language who wish to read in the original. We also present the process of creating bilingual dictionaries from free lexical resources. Both resources, the bilingual e-book and the bilingual dictionary, can be of great help to readers who wish to read books in the original version.
Keywords
e-books, parallel texts, reading aid
Copyright (c) 2017 Antoni Oliver González

This work is licensed under a Creative Commons Attribution 4.0 International License.