Efficient Knowledge Distillation for Green NLP Models: Bridging the Gap with Large Language Models

Fantazzini, Stefano (2024) Efficient Knowledge Distillation for Green NLP Models: Bridging the Gap with Large Language Models. Università di Bologna, Corso di Studio in Artificial intelligence [LM-DM270]
The rise of Large Language Models (LLMs) has catalyzed a massive shift in the AI landscape, reshaping how businesses operate. This thesis sits at the intersection of cutting-edge AI research and practical implementation. Its main purpose is to develop a small AI model that retains the capabilities of very large LLMs such as GPT-3, specifically in the context of text summarization. The research goes beyond achieving high performance: it also aims to build a model that is inherently more sustainable. By employing methods such as Semi-Automated Labeling and Step-by-Step Chain-of-Thought, the study seeks to investigate in depth the capabilities of LLM knowledge distillation.

Fantazzini, Stefano
NLP, LLM, Knowledge Distillation, Chain-of-Thought, CoT, Fine-Tuning, Semi-Automated Labeling, ChatGPT, GPT
19 March 2024

