Hartsuiker, Jens Matthias
(2023)
Finetuning commercial Large Language Models with LoRA for enhanced Italian language understanding.
[Master's thesis], Università di Bologna, Degree Programme in Artificial Intelligence [LM-DM270]
Abstract
In this thesis we take the first steps toward building a well-functioning LLM for the Italian language. We fine-tune two open-source, commercially licensed LLMs, MPT and LLaMA 2, on an Italian instruction dataset, Stambecco. Although the models do not perform as well as initially hoped, several of our findings are broadly applicable, and we believe this work justifies pretraining an LLM on a majority-Italian corpus.
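The abstract describes LoRA fine-tuning of LLaMA 2 and MPT on the Stambecco instruction dataset. As a rough illustration of what such a setup looks like, here is a minimal sketch using the Hugging Face transformers and peft libraries; the model id, dataset path, field names, and hyperparameters are assumptions chosen for illustration, not details taken from the thesis.

# Minimal LoRA fine-tuning sketch (assumed transformers + peft stack;
# identifiers and hyperparameters below are illustrative, not from the thesis).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base = "meta-llama/Llama-2-7b-hf"  # hypothetical choice; the thesis also uses MPT
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Low-Rank Adaptation: freeze the base weights and train small rank-r update
# matrices injected into the attention projections.
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of all parameters

# Italian instruction data; file path and field names are assumed,
# adapt them to the actual Stambecco schema.
data = load_dataset("json", data_files="stambecco.json", split="train")

def to_text(ex):
    # Flatten an instruction/response pair into a single training string.
    return {"text": f"### Istruzione:\n{ex['instruction']}\n\n### Risposta:\n{ex['output']}"}

data = data.map(to_text)
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama2-lora-it",
                           per_device_train_batch_size=4,
                           num_train_epochs=3, learning_rate=2e-4, fp16=True),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("llama2-lora-it")  # saves only the small LoRA adapter weights

Because only the low-rank adapter matrices are trained, the saved artifact is a few megabytes rather than a full model checkpoint, which is what makes this approach practical on commodity GPUs.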
Document type
Degree thesis
(Master's degree)
Thesis author
Hartsuiker, Jens Matthias
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Artificial Intelligence [LM-DM270]
Degree programme regulations
DM270
Keywords
Natural Language Processing, Large Language Models, Transformers, Low Rank Adaptation, PwC, Finetuning, Deep Learning
Thesis defence date
16 December 2023
URI