Using ChatGPT as an AWE tool: quality, precision, and accuracy of the feedback

Authors

Arsema Pérez Hernández, Universidad de La Rioja

Abstract

The emergence of generative pretrained transformer (GPT) large language models (LLMs) such as ChatGPT has prompted speculation about their potential to serve as reliable automated writing evaluation (AWE) tools and to provide corrective feedback to second language (L2) writers. Because the tool is so new, research on this topic is scarce, and its appropriateness as an AWE tool needs to be assessed before it is implemented in real learning settings. To help fill this research gap, the current study employs both quantitative and qualitative methods to evaluate the quality, precision, and accuracy of the feedback that a customized GPT, functioning as an AWE tool, generated for 30 compositions in English. The results indicate that, while the overall accuracy rate is high and the tool can provide feedback on both form and content, it occasionally gives erroneous feedback or fabricates mistakes. The educational implications of these findings are discussed.

Keywords

L2 writing skills, ChatGPT, AWE, Error detection, Corrective feedback

Author Biography

Arsema Pérez Hernández, Universidad de La Rioja

Arsema Pérez Hernández is a PhD student at the University of La Rioja, where she holds an FPU contract. Her research focuses on individual differences in second and third language acquisition. She has presented her work at several international conferences and published in prestigious journals.

Published

2025-09-19

How to Cite

Pérez Hernández, A. (2025). Using ChatGPT as an AWE tool: quality, precision, and accuracy of the feedback. Bellaterra Journal of Teaching & Learning Language & Literature, 18(3), e1338. https://doi.org/10.5565/rev/jtl3.1338
