Efficient Knowledge Distillation for Green NLP Models: Bridging the Gap with Large Language Models

Fantazzini, Stefano (2024) Efficient Knowledge Distillation for Green NLP Models: Bridging the Gap with Large Language Models. [Master's thesis (Laurea magistrale)], Università di Bologna, Degree Programme in Artificial Intelligence [LM-DM270]
Full-text document available: PDF (Thesis), 4 MB
Available under license: Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 (CC BY-NC-ND 4.0)

Abstract

The rise of Large Language Models (LLMs) has catalyzed a massive shift in the AI landscape, reshaping how businesses operate. This thesis sits at the intersection of cutting-edge AI research and practical implementation: its main purpose is to develop a small AI model that retains the capabilities of colossal LLMs such as GPT-3, specifically for text summarization. The core of this research goes beyond achieving high performance; it aims to build a model that is also inherently more sustainable. By employing methods such as Semi-Automated Labeling and step-by-step Chain-of-Thought distillation, the study investigates in depth what LLM knowledge distillation can achieve.
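The distillation pipeline outlined in the abstract can be pictured as a two-stage process: the teacher LLM first produces summaries and step-by-step rationales for unlabeled documents (the Semi-Automated Labeling step), and a compact student model is then fine-tuned on both outputs. Below is a minimal sketch of that idea, not the thesis' actual implementation: the student checkpoint (google/flan-t5-small), the teacher_data examples, the task prefixes, and the loss weighting are all illustrative assumptions.

    # Minimal sketch of step-by-step knowledge distillation for summarization.
    # Assumptions (not from the thesis): the teacher LLM has already produced,
    # for each source document, a summary and a chain-of-thought rationale,
    # stored in `teacher_data`. The student is a small seq2seq model trained
    # multi-task on both targets.
    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    teacher_data = [
        {
            "document": "Large language models are expensive to deploy ...",
            "summary": "LLMs are costly; distillation yields smaller models.",
            "rationale": "The text contrasts LLM serving cost with the benefits "
                         "of distilling their knowledge into compact students.",
        },
        # ... more teacher-labeled examples (Semi-Automated Labeling output)
    ]

    model_name = "google/flan-t5-small"   # illustrative student choice
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    student = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    optimizer = torch.optim.AdamW(student.parameters(), lr=3e-4)
    rationale_weight = 0.5                # assumed auxiliary-loss weighting

    def seq2seq_loss(prefix, source, target):
        """Student loss for one (task-prefixed input, target) pair."""
        enc = tokenizer(prefix + source, return_tensors="pt",
                        truncation=True, max_length=512)
        labels = tokenizer(target, return_tensors="pt",
                           truncation=True, max_length=128).input_ids
        return student(input_ids=enc.input_ids,
                       attention_mask=enc.attention_mask,
                       labels=labels).loss

    student.train()
    for epoch in range(3):
        for ex in teacher_data:
            # Task 1: reproduce the teacher's summary (main distillation target).
            loss_sum = seq2seq_loss("summarize: ", ex["document"], ex["summary"])
            # Task 2: reproduce the teacher's rationale (step-by-step signal).
            loss_rat = seq2seq_loss("explain: ", ex["document"], ex["rationale"])
            loss = loss_sum + rationale_weight * loss_rat
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

Training the student on the rationale as an auxiliary task is what distinguishes step-by-step distillation from plain fine-tuning on teacher summaries; at inference time only the summarization prefix is used, so the deployed student stays small and fast.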

Document type: Master's thesis (Tesi di laurea, Laurea magistrale)
Thesis author: Fantazzini, Stefano
Thesis supervisor:
Thesis co-supervisor:
School:
Degree programme:
Degree regulation (Ordinamento CdS): DM270
Keywords: NLP, LLM, Knowledge Distillation, Chain-of-Thought, CoT, Fine-Tuning, Semi-Automated Labeling, ChatGPT, GPT
Thesis defence date: 19 March 2024
URI:
