Finetuning commercial Large Language Models with LoRA for enhanced Italian language understanding

Hartsuiker, Jens Matthias (2023) Finetuning commercial Large Language Models with LoRA for enhanced Italian language understanding. [Master's thesis], Università di Bologna, Degree Programme in Artificial intelligence [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under license: Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)

Abstract

In this thesis we take the first steps toward creating a well-functioning LLM for the Italian language. We finetune two open-source, commercially licensed LLMs, MPT and LLaMA 2, on Stambecco, an Italian instruction dataset. Although the models do not perform as well as we initially aimed for, our findings are broadly applicable, and we believe this work justifies the creation of an LLM pretrained on a majority of Italian data.

Document type
Degree thesis (Master's degree)
Thesis author
Hartsuiker, Jens Matthias
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Natural Language Processing, Large Language Models, Transformers, Low Rank Adaptation, PwC, Finetuning, Deep Learning
Thesis defence date
16 December 2023
URI
