Learner Corpora and Artificial Intelligence: Towards Error Annotation of a Corpus of Italian EFL Students' Interactions with Chatbots

Paradisi, Arianna (2025) Learner Corpora and Artificial Intelligence: Towards Error Annotation of a Corpus of Italian EFL Students' Interactions with Chatbots. [Master's thesis], Università di Bologna, Degree programme in Specialized translation [LM-DM270] - Forlì

Full-text documents available:

PDF document (Thesis)
Available under licence: Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)
Download (3MB)

Abstract

This thesis is part of the UNITE (Universally inclusive technologies to practice English) project, which aims to create and analyse a learner corpus based on interactions between Italian students of English as a Foreign Language (EFL) and chatbots. The thesis presents two case studies: one on the error annotation of a sample of texts from the corpus, and one on the possibility of using ChatGPT to automate the error annotation process. The first case study involved annotating errors in students’ conversational turns from 23 texts using the Louvain Error Tagging Manual Version 2.0, and led to a refinement of the error taxonomy to fit the conversational nature of the UNITE corpus. Among other results, the distribution of errors annotated with the refined tagset showed that the corpus presents several features commonly associated with digitally mediated communication, with orthographic and morphological errors being the most frequent types of linguistic error. The second case study was a proof-of-concept experiment in which a custom GPT powered by the ChatGPT-4o model was created and used to error-annotate four texts from the manually annotated sample. A comparison of the GPT’s output with the human annotations showed that the chatbot reached an acceptable level of accuracy. This suggests that, with due caution, it may be used as a preliminary tool for error annotation, provided its output is then carefully revised and post-edited.

Document type

Master's thesis (Laurea magistrale)

Thesis author

Paradisi, Arianna

Thesis supervisor

Ferraresi, Adriano

Thesis co-supervisor

Milicevic Petrovic, Maja

School

Languages and Literatures, Translation and Interpreting

Degree programme

Specialized translation [LM-DM270] - Forlì

Track

CURRICULUM SPECIALIZED TRANSLATION

Degree programme regulations

DM270

Keywords

artificial intelligence, large language models, chatbots, language learning, dialogue-based Computer-Assisted Language Learning, learner corpora, corpus annotation, error annotation

Thesis defence date

18 March 2025

URI

https://amslaurea.unibo.it/id/eprint/34527
