Presepi, Alex
(2023)
To Generate or to Retrieve: On the Effectiveness of Artificial Contexts for Biomedical Question Answering.
[Bachelor's thesis (Laurea)], Università di Bologna, degree programme in
Ingegneria e scienze informatiche [L-DM270] - Cesena. Restricted-access document.
Abstract
Large Language Models (LLMs) are increasingly used to solve a wide range of tasks, in particular knowledge-intensive ones such as open-domain question answering (ODQA). Most state-of-the-art solutions rely on retrieve-then-read pipelines, which first retrieve relevant documents from external sources and then augment the language model's context with them to generate a response. This methodology is based on embeddings indexed in vector databases and has several limitations. Generate-then-read (GenRead) replaces the retrieval component with LLM generators; this approach has recently outperformed both retrieve-then-read solutions and pure generation without augmented context in tasks such as general-domain ODQA, fact-checking, and dialogue systems. In the biomedical field these contributions are especially significant, owing to the specialized terminology, the abundance of entities, the rapid advancement of scientific knowledge, the intolerance of hallucinations, and the need for evidence to verify generated inferences. This thesis explores the generate-then-read paradigm in the biomedical domain using open-source LLMs with a limited number of parameters. Ensembling solutions are implemented for document generation, using different LLMs trained on different datasets. A pipeline is proposed in which k questions related to the context of the query are generated first, followed by the generation of answers by an LLM. Different strategies for the document-reading and answering phase are also examined and compared, using chatbots, Fusion-in-Decoder (FiD), and encoder-only models. Using MedMCQA as the multiple-choice question dataset, medllama2 as the direct document generator, and LLaMA 2 as the chatbot that formulates the answer, an accuracy of 39.1% is achieved, compared with 36.8% for medllama2 alone and 35.2% for LLaMA 2 alone.
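The generate-then-read flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation: the two model calls are stand-in callables (in the actual work they would be medllama2 as the document generator and LLaMA 2 as the answering chatbot), and the majority vote over k generated documents is one plausible ensembling choice.

```python
# Hedged sketch of a generate-then-read (GenRead) pipeline for
# multiple-choice QA. The LLM calls are mocked with toy callables.
from collections import Counter


def generate_documents(question, k, generator):
    """Generate k artificial context documents instead of retrieving them."""
    return [generator(f"Write a background document for: {question}", seed=i)
            for i in range(k)]


def read_and_answer(question, options, documents, reader):
    """Answer the question once per generated document, then take a
    majority vote over the k candidate answers."""
    votes = [reader(question, options, doc) for doc in documents]
    return Counter(votes).most_common(1)[0][0]


# --- toy stand-ins for the real models (assumption: any callable works) ---
def toy_generator(prompt, seed=0):
    return f"[doc {seed}] context about: {prompt}"


def toy_reader(question, options, document):
    # A real reader would condition on the document; here we always pick "B".
    return "B"


question = "Which vitamin deficiency causes scurvy?"
docs = generate_documents(question, k=3, generator=toy_generator)
answer = read_and_answer(question, ["A", "B", "C", "D"], docs, toy_reader)
```

Swapping `toy_generator` and `toy_reader` for real model wrappers is all that is needed to turn the sketch into a working pipeline; the control flow (generate k contexts, read each, aggregate) stays the same.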
Document type
Bachelor's thesis (Laurea)
Thesis author
Presepi, Alex
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations (Ordinamento CdS)
DM270
Keywords
Natural Language Processing, Large Language Models, Augmented Generation, Open-Domain Question Answering, Biomedical Domain
Thesis defence date
5 October 2023
URI