Cassano, Lorenzo
(2024)
VerifAi: Towards an Open-Source Scientific Generative Question-Answering System with Referenced and Verifiable Answers.
[Master's degree thesis], Università di Bologna, Degree Programme in Artificial Intelligence [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under license: Except where the author has granted broader permissions, the thesis may be freely consulted, and a copy may be saved and printed strictly for personal purposes of study, research, and teaching; any direct or indirect commercial use is expressly forbidden. All other rights to the material are reserved.
Download (639kB)
Abstract
This research investigates the effectiveness of transformer-based models in mitigating hallucinations within the biomedical domain, a crucial area in natural language processing (NLP). Hallucinations occur when language models generate unsupported or divergent information. Despite their capabilities, large language models (LLMs) are prone to such errors, which affects critical sectors like biomedicine. The study has two main objectives: exploring methods such as Retrieval-Augmented Generation (RAG) and Retrieval-Augmented Fine-Tuning (RAFT) to reduce hallucinations, and developing techniques for detecting the hallucinations that persist. Additionally, the research introduces a biomedical RAG system that enhances response reliability by combining fine-tuned LLMs with PubMed abstracts. This system outperforms both the PubMed search engine and GPT-4 Turbo at referencing relevant abstracts. The study also presents a Verification Engine for an open-source scientific QA system, built on models fine-tuned on the SciFact dataset; the DeBERTa model achieved an F1 score of 88%, outperforming the other models on the HealthVer dataset. These findings advance NLP techniques, particularly in biomedicine, by improving the accuracy and reliability of transformer-based models.
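The RAG system described above retrieves PubMed abstracts relevant to a question and constrains the generator to cite only those abstracts. Below is a minimal sketch of that retrieve-then-ground pattern, assuming a sentence-transformers bi-encoder and an in-memory list of abstracts; the encoder checkpoint, function names, and prompt wording are illustrative assumptions, not the thesis's actual implementation (which uses a fine-tuned Mistral model as the generator).

```python
# Minimal RAG sketch: retrieve PubMed abstracts by embedding similarity,
# then build a prompt that forces the LLM to cite only those abstracts.
# Assumptions: the encoder checkpoint and all names here are illustrative;
# a generic generator call stands in for the thesis's fine-tuned Mistral.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative encoder choice

def retrieve(question: str, abstracts: list[str], k: int = 3) -> list[str]:
    """Return the k abstracts most similar to the question."""
    q_emb = encoder.encode(question, convert_to_tensor=True)
    a_emb = encoder.encode(abstracts, convert_to_tensor=True)
    scores = util.cos_sim(q_emb, a_emb)[0]          # one score per abstract
    top = scores.topk(k=min(k, len(abstracts)))
    return [abstracts[i] for i in top.indices.tolist()]

def build_prompt(question: str, contexts: list[str]) -> str:
    """Compose a grounded prompt: the model may cite only [1], [2], ..."""
    refs = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(contexts))
    return (
        "Answer the question using ONLY the abstracts below, citing them "
        f"as [n].\n\nAbstracts:\n{refs}\n\nQuestion: {question}\nAnswer:"
    )
```

The prompt construction is the key design choice here: enumerating the retrieved abstracts as [1], [2], ... and forbidding outside knowledge means the generator's citations can later be checked against the retrieved set.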
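The Verification Engine can be framed as NLI-style sequence-pair classification: given a generated claim and a referenced abstract, predict whether the abstract supports, contradicts, or provides no evidence for the claim. A minimal sketch follows, assuming a Hugging Face sequence-classification checkpoint; the model name below is a placeholder for the thesis's DeBERTa model fine-tuned on SciFact, and the label order is an assumption.

```python
# Verification-engine sketch: score a claim against retrieved evidence with
# an NLI-style classifier, as in the thesis's DeBERTa model fine-tuned on
# SciFact. The checkpoint name is a placeholder (its classification head is
# untrained here), not the model actually evaluated in the thesis.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "microsoft/deberta-v3-base"  # placeholder; a SciFact fine-tune in practice
LABELS = ["SUPPORT", "NO_EVIDENCE", "CONTRADICT"]  # assumed label order

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=3)

def verify(claim: str, evidence: str) -> str:
    """Classify whether the evidence supports, contradicts, or ignores the claim."""
    inputs = tokenizer(claim, evidence, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(dim=-1))]

print(verify("Vitamin D supplementation prevents influenza.",
             "In this trial, vitamin D showed no significant effect on influenza incidence."))
```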
Document type
Degree thesis
(Master's degree)
Thesis author
Cassano, Lorenzo
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Artificial Intelligence [LM-DM270]
Degree programme regulations
DM270
Keywords
NLP, LLM, Hallucinations, Mistral, RAG, PubMed, QA
Thesis defence date
8 October 2024
URI