Italian Retrieval-Augmented Generative Question Answering System for Legal Domains

Busatta, Gianluca (2022) Italian Retrieval-Augmented Generative Question Answering System for Legal Domains. [Master's thesis (Laurea magistrale)], Università di Bologna, Degree programme in Artificial Intelligence [LM-DM270], full-text document not available
The full text is not available, by the author's choice.


A typical scenario involves a user searching for information and obtaining a list of documents from an information retrieval system. The retrieved documents may be more or less relevant, and the information sought may be contained in several of them, which leaves the user with the task of searching through different documents. In this thesis, an Italian question answering system for legal domains has been developed with a Retrieval-Augmented Generation (RAG) approach that aims to satisfy the user's information need directly. The model is composed of a retriever and a generator, both based on the Transformer architecture. It was trained first in a self-supervised way on the library of the Gruppo Maggioli company, and then in a supervised way on a novel Italian question answering dataset built for this purpose. Once the user has provided an input, the model automatically retrieves possibly relevant documents from the knowledge base and uses them to condition the generation of an appropriate answer.
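
The retrieve-then-generate flow described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the bag-of-words `embed` function stands in for the Transformer-based retriever, the `generate` callable stands in for the Transformer-based generator, and the three sentences in `KNOWLEDGE_BASE` are invented placeholders rather than text from the Gruppo Maggioli library or statements of law. All names, including the `question: ... context: ...` input format, are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy knowledge base (invented sentences, not real legal text).
KNOWLEDGE_BASE = [
    "Il contratto di locazione ha una durata minima di quattro anni.",
    "Il ricorso deve essere presentato entro sessanta giorni dalla notifica.",
    "La patente di guida viene rilasciata dopo un esame teorico e pratico.",
]


def embed(text):
    """Stand-in for a Transformer encoder: a bag-of-words count vector."""
    cleaned = text.lower().replace(".", " ").replace("?", " ")
    return Counter(cleaned.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    query = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(query, embed(d)), reverse=True)
    return ranked[:k]


def build_generator_input(question, passages):
    """Concatenate the question and retrieved passages (assumed input format)."""
    return f"question: {question} context: {' '.join(passages)}"


def answer(question, documents, generate, k=2):
    """Retrieve passages, then condition the generator on them."""
    passages = retrieve(question, documents, k)
    return generate(build_generator_input(question, passages))
```

In this sketch `generate` is any function that maps the conditioning string to an answer. Passing an identity function such as `lambda s: s` makes it easy to inspect exactly what the generator would be conditioned on.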

Document type
Master's thesis (Laurea magistrale)
Thesis author
Busatta, Gianluca
Thesis supervisor
Thesis co-supervisor
Degree programme
Degree programme regulations (Ordinamento CdS)
Keywords
Generative question answering, retrieval augmented generation, information retrieval, transformer, legal domains
Thesis defence date
22 March 2022
