Investigating the use of readability metrics to detect differences in written productions of learners: a corpus-based study



This paper examines readability metrics as indices of learners' linguistic features in a written corpus of Spanish learners of L2 English. Seventeen readability measures are presented and computed for 200 samples of argumentative essays extracted from the NOCE corpus (Díaz-Negrillo, 2007). Support Vector Machines (SVM) are then used to identify which metrics best detect differences between texts written by students enrolled in the first year and those in the second year of an English major. Metrics based on sentence length, number of sentences, and number of polysyllabic words are reported to be the most accurate for this classification.
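To make the kind of surface features mentioned above concrete, here is a minimal Python sketch that extracts the number of sentences, the mean sentence length, and the number of polysyllabic words from a text, and then combines them into the SMOG grade. SMOG is used only as a familiar example of a readability formula built on such counts; the abstract does not list the seventeen measures, so its presence among them is an assumption. The function names, the regex-based sentence splitter, and the vowel-group syllable counter are all hypothetical simplifications, not the tools used in the study.

```python
import math
import re


def count_syllables(word):
    """Rough syllable estimate: count runs of consecutive vowels (assumption)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def readability_features(text):
    """Return surface counts of the kind readability formulas build on."""
    # Naive sentence split on terminal punctuation; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    # "Polysyllabic" is taken here to mean three or more syllables.
    polysyllabic = [w for w in words if count_syllables(w) >= 3]
    n_sentences = len(sentences)
    return {
        "n_sentences": n_sentences,
        "n_words": len(words),
        "mean_sentence_length": len(words) / n_sentences if n_sentences else 0.0,
        "n_polysyllabic": len(polysyllabic),
    }


def smog_index(text):
    """SMOG grade: 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291."""
    features = readability_features(text)
    if features["n_sentences"] == 0:
        return 0.0
    ratio = features["n_polysyllabic"] * 30 / features["n_sentences"]
    return 1.0430 * math.sqrt(ratio) + 3.1291


sample = "The cat sat. The education system is complicated."
features = readability_features(sample)
score = smog_index(sample)
print(features)
print(round(score, 2))
```

In a setup like the one described, each essay would yield one such feature vector, and the vectors, labelled by year of study, could then be passed to an SVM classifier. The exact features, preprocessing, and SVM configuration used in the paper are not stated in the abstract.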



readability, learner corpora, SVM, written essays

Author Biography

Paula Lissón, Université Paris Diderot (USPC)

Graduate student at Paris Diderot University (USPC), department of English Linguistics.



