
Subgraph Retrieval for Biomedical Open-Domain Question Answering: Unlocking the Knowledge Graph Embedding Power

Ramzan, Faisal (2023) Subgraph Retrieval for Biomedical Open-Domain Question Answering: Unlocking the Knowledge Graph Embedding Power. [Master's thesis], Università di Bologna, Degree Programme in Artificial intelligence [LM-DM270]


Full-text documents available:

PDF document (Thesis)
Available under licence: Unless the author has granted broader permissions, the thesis may be freely consulted, and a single copy may be saved and printed for strictly personal purposes of study, research and teaching, with any directly or indirectly commercial use expressly prohibited. All other rights to the material are reserved
Download (5MB)

Abstract

Structured knowledge graphs (KGs) are increasingly popular, while language models (LMs), even with billions of parameters, do not reliably capture the semantic meaning of a given context. Retrieving an entire KG is challenging because of its size and memory requirements. Moreover, reasoning over the whole KG to infer the answer to a question is slow and degrades the reasoning phase, which can lead to incorrect answers. Pre-trained LMs have broad knowledge coverage but do not perform well on structured reasoning, such as handling negation and flipped conditions. We aim to retrieve only the relevant subgraph from the large KG. Existing subgraph retrieval solutions primarily focus on discriminative k-hop approaches or SPARQL queries on massive KGs. However, they require time-consuming operations that are unsustainable in real-world contexts like biomedicine, where the entities and the known relationships among them are massive in number. They frequently rely on named entity linking (NEL) tools that can fail to recognize and map entities and cannot generalize to similar or higher-order concepts. Instead, approximate search on dense representations of KGs and text can significantly boost the effectiveness and efficiency of subgraph construction, thanks to enhanced generalization capabilities that overcome NEL limits and to the possibility of indexing embeddings to speed up top-K retrieval operations. In our work, we analyzed the existing methods of subgraph construction. However, they are limited by the size and quality of the retrieved subgraph, which affect the reasoning process for extracting an answer to the question. Therefore, we propose a subgraph retrieval method that finds the most relevant entities through linked paths (path queries) to the topic entities. The goal is to find the sequence of relations, and their connected entities, linked to the topic entities by measuring the similarity between them in a dense embedding space.


Document type

Degree thesis (Master's degree)

Thesis author

Ramzan, Faisal

Thesis supervisor

Sartori, Claudio

Thesis co-supervisor

Moro, Gianluca ; Frisoni, Giacomo

School

Engineering and Architecture

Degree programme