Retrieving and incorporating external knowledge into compressed Large Language Models

Marasi, Simone (2023) Retrieving and incorporating external knowledge into compressed Large Language Models. [Master's degree thesis], Università di Bologna, Degree Programme in Artificial Intelligence [LM-DM270], full text not available
The full text is not available by the author's choice. (Contact the author)

Abstract

Large Language Models (LLMs) have recently become a crucial resource thanks to their ability to generate human-like text in almost any language, but they raise ethical and practical concerns, including data privacy, the accuracy of model predictions, and the ecological footprint of large-scale model training. Techniques such as Retrieval Augmented Generation (RAG) and finetuning address these challenges by enabling pretrained language models to acquire specific knowledge on a given topic from a proprietary knowledge base. This is valuable in industrial settings where data confidentiality and the absence of hallucinations in responses are crucial. In this thesis, we applied efficient quantisation and compression techniques that significantly reduce the memory required to load or finetune LLMs without substantially compromising performance. This reduction translates into substantial gains in resource efficiency, in both time and hardware requirements: with these compression techniques, we finetuned 13-billion-parameter models such as LLaMA-2 even on a GPU with just 15 GB of memory. We also integrated RAG techniques, a noteworthy stride forward because they enrich models directly with external knowledge through vector databases. These databases can efficiently embed heterogeneous sources, in the form of texts and knowledge graphs, exploiting their intrinsic structural strengths. By leveraging RAG and incorporating external knowledge, we demonstrated substantial improvements in retrieval quality, yielding more contextually relevant and accurate responses, and compared different retrieval techniques involving knowledge graphs. These advancements democratize generative AI, making it accessible even to small companies, and reduce the energy consumption and carbon footprint associated with training large-scale AI models.
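The vector-database retrieval step at the core of RAG can be sketched as follows. This is a minimal illustrative sketch, not the thesis's actual pipeline: it uses a toy bag-of-words embedding in place of a trained embedding model, and the document texts, class names, and queries are all assumptions made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    """Split into lowercase tokens, stripping basic punctuation."""
    return [t.strip(".,?!").lower() for t in text.split()]

def embed(text, vocab):
    """Toy embedding: L2-normalised bag-of-words vector over a fixed
    vocabulary (a real RAG pipeline would use a trained embedding model)."""
    counts = Counter(t for t in tokenize(text) if t in vocab)
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    """Cosine similarity of two already-normalised vectors."""
    return sum(x * y for x, y in zip(a, b))

class VectorStore:
    """Minimal in-memory stand-in for a vector database: embeds documents
    and returns the top-k most similar to a query."""
    def __init__(self):
        self.texts = []

    def add(self, text):
        self.texts.append(text)

    def search(self, query, k=2):
        # Fixed vocabulary built from the stored documents.
        vocab = sorted({t for text in self.texts for t in tokenize(text)})
        q = embed(query, vocab)
        ranked = sorted(self.texts,
                        key=lambda text: cosine(q, embed(text, vocab)),
                        reverse=True)
        return ranked[:k]

# Build a tiny illustrative knowledge base and retrieve context for a query.
store = VectorStore()
store.add("Quantisation reduces the memory footprint of model weights.")
store.add("LLaMA-2 is a family of large language models released by Meta.")
store.add("Bologna is a city in northern Italy.")

question = "How does quantisation affect memory use?"
context = store.search(question, k=1)[0]
# The retrieved passage is prepended to the question before generation.
prompt = f"Context: {context}\nQuestion: {question}"
```

In a production setting, the same retrieve-then-prompt flow applies, but with learned embeddings and an approximate-nearest-neighbour index so that search stays fast over large text and knowledge-graph corpora.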

Document type
Thesis (Master's degree)
Thesis author
Marasi, Simone
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Retrieval Augmented Generation, Large Language Models, Model finetuning, Vector databases, Knowledge Graphs
Thesis defence date
21 October 2023
URI

Other metadata

