Pandini, Simone
(2024)
Matrix factorization techniques for Large Language Models.
[Master's degree thesis], Università di Bologna, Degree Programme in
Mathematics [LM-DM270]
Abstract
In recent years, the development of Large Language Models (LLMs) has revolutionized the field of natural language processing (NLP), enabling significant advances in several contexts, such as text translation, code generation and question answering. To address all these tasks, LLMs have become increasingly complex and resource-intensive, since they require extensive training on huge amounts of data, mostly in English, that need to be preprocessed. Since the best-known LLMs are meant to be general purpose, a fine-tuning procedure is usually needed to tailor the models to specific domains. However, updating all parameters would be computationally expensive and would require significant memory resources. This has led to the exploration of parameter-efficient fine-tuning (PEFT) methods, which modify only a small subset of parameters while keeping the majority fixed. This approach not only reduces the computational effort but also minimizes the risk of catastrophic forgetting, particularly when working with limited task-specific data. Additionally, compared to training from scratch, fine-tuning can be achieved with fewer labeled instances and less computing resources by exploiting the knowledge already present in these huge models. This improves the effectiveness of deploying LLMs in practical applications while also democratizing access to cutting-edge AI capabilities. The most important PEFT methods rely on different types of matrix factorization, such as low-rank, sparse or Singular Value Decomposition, to decompose the trainable fine-tuning matrix. These factorizations contain far fewer parameters than full fine-tuning, significantly decreasing training time and computational resources without affecting the overall model's performance.
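As a rough illustration of the low-rank idea mentioned in the abstract, the following is a minimal sketch of a LoRA-style adapter in PyTorch: the pretrained weight matrix is frozen and only two small factors B and A are trained, so the number of trainable parameters scales with the rank r rather than with the full weight dimensions. The class name and the rank and alpha values are illustrative assumptions, not the implementation used in the thesis.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Hypothetical low-rank adapter sketch: the frozen weight W is updated as
    W + B @ A, where B and A together hold far fewer parameters than a full
    d_out x d_in update matrix."""

    def __init__(self, d_in: int, d_out: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)                  # pretrained weights stay fixed
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)   # trainable factor, rank x d_in
        self.B = nn.Parameter(torch.zeros(d_out, rank))         # trainable factor, d_out x rank
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # a full update would need d_out * d_in parameters;
        # the factorization B @ A needs only rank * (d_in + d_out)
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(d_in=4096, d_out=4096, rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable:,} / total: {total:,}")  # ~65K trainable vs ~16.8M total
```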
Document type
Degree thesis
(Master's degree)
Thesis author
Pandini, Simone
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Curriculum
CURRICULUM ADVANCED MATHEMATICS FOR APPLICATIONS
Degree programme regulations
DM270
Keywords
Large Language Models, Transformers, PEFT, HPC
Thesis defence date
20 December 2024
URI