Example Sentence Suggestion for Learners of Japanese as a Second Language Using Pretrained Language Models

Benedetti, Enrico (2024) Example Sentence Suggestion for Learners of Japanese as a Second Language Using Pretrained Language Models. [Master's degree thesis (Laurea magistrale)], Università di Bologna, Degree programme in Artificial intelligence [LM-DM270], Restricted-access document.
Full-text documents available:
PDF document (Thesis)
Full text not accessible until 22 June 2024.
Available under license: Creative Commons Attribution - NonCommercial - ShareAlike 4.0 (CC BY-NC-SA 4.0)


Abstract

In this thesis, we tackle the challenge of suggesting diverse example sentences to learners of Japanese, tailored to their proficiency level. To address the lack of work applying Pretrained Language Models (PLMs) to this specific task, and to expand in new directions, we develop and compare different paradigms. First, we employ PLMs as quality-scoring components of a retrieval system that draws on a newly curated corpus of Japanese sentences from varied sources. Second, we use PLMs directly as sentence generators through zero-shot learning. We then evaluate the quality of the suggested sentences on multiple aspects, such as difficulty, diversity, and naturalness, with a panel of raters consisting of learners of Japanese, native speakers, and GPT-4. The experimental results suggest that there is inherent disagreement among participants on ratings of sentence quality, except for difficulty ratings. Despite this variability, all evaluators preferred the retrieval approach, especially for beginner and advanced target difficulty. This suggests that PLMs have the potential to make sentence suggestion systems more adaptable to learners throughout their journey.

Document type
Master's degree thesis (Laurea magistrale)
Thesis author
Benedetti, Enrico
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulation
DM270
Keywords
Natural Language Processing, Pretrained Language Models, Second Language Learning, Japanese, Example Sentences, Retrieval, Generation
Thesis defence date
19 March 2024
URI
