Fantazzini, Stefano
(2024)
Efficient Knowledge Distillation for Green NLP Models: Bridging the Gap with Large Language Models.
[Master's degree thesis], Università di Bologna, Degree Programme in
Artificial Intelligence [LM-DM270]
Abstract
The rise of Large Language Models (LLMs) has catalyzed a massive shift in the AI landscape, reshaping how businesses operate. This thesis sits at the intersection of cutting-edge AI research and practical implementation: its main purpose is to develop a small AI model that retains the capabilities of colossal LLMs such as GPT-3, specifically in the context of text summarization.
The core of this research goes beyond achieving high performance; it aims to construct a model that is inherently more sustainable.
By employing advanced methods like Semi-Automated Labeling and Step-by-Step Chain-of-Thought approaches, our study seeks to extensively investigate the capabilities of LLM knowledge distillation.
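To make the distillation idea referenced above concrete, here is a minimal, self-contained sketch of the classic soft-target distillation loss (the temperature-scaled KL objective of Hinton et al., 2015). This is an illustration of the general technique only, not the thesis's actual training code; the function names and toy logits are invented for the example.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax: a higher T flattens the distribution,
    # exposing the teacher's "dark knowledge" about non-top classes.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between teacher and student soft targets.

    The student is trained to match the teacher's temperature-softened
    output distribution; the T**2 factor keeps gradient magnitudes
    comparable across temperatures.
    """
    p = softmax(teacher_logits, T)  # teacher soft targets
    q = softmax(student_logits, T)  # student soft predictions
    return (T ** 2) * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student whose logits match the teacher's incurs zero loss;
# a mismatched student is penalized.
teacher = [3.0, 1.0, 0.2]
aligned_loss = distillation_loss(teacher, teacher)        # ~0.0
mismatch_loss = distillation_loss([0.2, 1.0, 3.0], teacher)  # > 0
```

In practice this loss term is combined with an ordinary supervised loss on the student's hard labels; the Chain-of-Thought variants discussed in the thesis additionally distill the teacher's intermediate reasoning text rather than only its output distribution.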
Document type
Degree thesis
(Master's degree)
Thesis author
Fantazzini, Stefano
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulation
DM270
Keywords
NLP, LLM, Knowledge Distillation, Chain-of-Thought, CoT, Fine-Tuning, Semi-Automated Labeling, ChatGPT, GPT
Thesis defense date
19 March 2024
URI