Fantazzini, Stefano
(2024)
Efficient Knowledge Distillation for Green NLP Models: Bridging the Gap with Large Language Models.
[Master's degree thesis], Università di Bologna, Degree Programme in
Artificial Intelligence [LM-DM270]
Abstract
The rise of Large Language Models (LLMs) has catalyzed a massive shift in the AI landscape, reshaping how businesses operate. This thesis sits at the intersection of cutting-edge AI research and practical implementation: its main purpose is to develop a small AI model that retains the capabilities of colossal LLMs such as GPT-3, specifically in the context of text summarization.
The core of this research goes beyond achieving high performance; it aims to construct a model that is inherently more sustainable.
By employing advanced methods like Semi-Automated Labeling and Step-by-Step Chain-of-Thought approaches, our study seeks to extensively investigate the capabilities of LLM knowledge distillation.
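To make the distillation idea referenced above concrete, here is a minimal, self-contained sketch of the classic soft-target distillation loss (the temperature-scaled KL objective of Hinton et al., 2015). This is an illustration of the general technique only, not the thesis's actual training code; the function names and toy logits are invented for the example.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax: a higher T flattens the distribution,
    # exposing the teacher's "dark knowledge" about non-top classes.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between teacher and student soft targets.

    The student is trained to match the teacher's temperature-softened
    output distribution; the T**2 factor keeps gradient magnitudes
    comparable across temperatures.
    """
    p = softmax(teacher_logits, T)  # teacher soft targets
    q = softmax(student_logits, T)  # student soft predictions
    return (T ** 2) * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student whose logits match the teacher's incurs zero loss;
# a mismatched student is penalized.
teacher = [3.0, 1.0, 0.2]
aligned_loss = distillation_loss(teacher, teacher)        # ~0.0
mismatch_loss = distillation_loss([0.2, 1.0, 3.0], teacher)  # > 0
```

In practice this loss term is combined with an ordinary supervised loss on the student's hard labels; the Chain-of-Thought variants discussed in the thesis additionally distill the teacher's intermediate reasoning text rather than only its output distribution.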
Document type
Degree thesis
(Master's degree)
Thesis author
Fantazzini, Stefano
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulation
DM270
Keywords
NLP, LLM, Knowledge Distillation, Chain-of-Thought, CoT, Fine-Tuning, Semi-Automated Labeling, ChatGPT, GPT
Thesis defense date
19 March 2024
URI