Retrieving and incorporating external knowledge into compressed Large Language Models

Marasi, Simone (2023) Retrieving and incorporating external knowledge into compressed Large Language Models. [Master's degree thesis], Università di Bologna, Degree Programme in Artificial Intelligence [LM-DM270], full text not available
The full text is not available by the author's choice. (Contact the author)

Abstract

Large Language Models (LLMs) have recently become a crucial resource thanks to their ability to generate human-like text in almost any language, but they raise ethical and practical concerns, including data privacy, the accuracy of model predictions, and the ecological footprint of large-scale model training. Techniques such as Retrieval Augmented Generation (RAG) and finetuning address these challenges by enabling pretrained language models to acquire specific knowledge on a given topic from a proprietary knowledge base. This is valuable in industrial settings where data confidentiality and the absence of hallucinations in responses are crucial. In this thesis, we applied efficient quantisation and compression techniques that significantly reduce the memory required to load or finetune LLMs without substantially compromising performance. This reduction translates into substantial gains in resource efficiency, in both time and hardware requirements: with these compression techniques, we finetuned 13-billion-parameter models such as LLaMA-2 even on a GPU with just 15 GB of memory. We also integrated RAG techniques, a noteworthy stride forward because they enrich models directly with external knowledge through vector databases. These databases can efficiently embed heterogeneous sources, in the form of texts and knowledge graphs, exploiting their intrinsic structural strengths. By leveraging RAG and incorporating external knowledge, we demonstrated substantial improvements in retrieval quality, yielding more contextually relevant and accurate responses, and compared different retrieval techniques involving knowledge graphs. These advancements democratize generative AI, making it accessible even to small companies, and reduce the energy consumption and carbon footprint associated with training large-scale AI models.
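The vector-database retrieval step at the core of RAG can be sketched as follows. This is a minimal illustrative sketch, not the thesis's actual pipeline: it uses a toy bag-of-words embedding in place of a trained embedding model, and the document texts, class names, and queries are all assumptions made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    """Split into lowercase tokens, stripping basic punctuation."""
    return [t.strip(".,?!").lower() for t in text.split()]

def embed(text, vocab):
    """Toy embedding: L2-normalised bag-of-words vector over a fixed
    vocabulary (a real RAG pipeline would use a trained embedding model)."""
    counts = Counter(t for t in tokenize(text) if t in vocab)
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    """Cosine similarity of two already-normalised vectors."""
    return sum(x * y for x, y in zip(a, b))

class VectorStore:
    """Minimal in-memory stand-in for a vector database: embeds documents
    and returns the top-k most similar to a query."""
    def __init__(self):
        self.texts = []

    def add(self, text):
        self.texts.append(text)

    def search(self, query, k=2):
        # Fixed vocabulary built from the stored documents.
        vocab = sorted({t for text in self.texts for t in tokenize(text)})
        q = embed(query, vocab)
        ranked = sorted(self.texts,
                        key=lambda text: cosine(q, embed(text, vocab)),
                        reverse=True)
        return ranked[:k]

# Build a tiny illustrative knowledge base and retrieve context for a query.
store = VectorStore()
store.add("Quantisation reduces the memory footprint of model weights.")
store.add("LLaMA-2 is a family of large language models released by Meta.")
store.add("Bologna is a city in northern Italy.")

question = "How does quantisation affect memory use?"
context = store.search(question, k=1)[0]
# The retrieved passage is prepended to the question before generation.
prompt = f"Context: {context}\nQuestion: {question}"
```

In a production setting, the same retrieve-then-prompt flow applies, but with learned embeddings and an approximate-nearest-neighbour index so that search stays fast over large text and knowledge-graph corpora.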

Document type
Thesis (Master's degree)
Thesis author
Marasi, Simone
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Retrieval Augmented Generation, Large Language Models, Model finetuning, Vector databases, Knowledge Graphs
Thesis defence date
21 October 2023
URI

Other metadata

