Ramzan, Faisal
(2023)
Subgraph Retrieval for Biomedical Open-Domain Question Answering: Unlocking the Knowledge Graph Embedding Power.
[Master's degree thesis], Università di Bologna, Degree programme in
Artificial intelligence [LM-DM270]
Full-text documents available:
PDF document (Thesis)
Available under licence: Unless the author grants broader permissions, the thesis may be freely consulted, and one copy may be saved and printed for strictly personal purposes of study, research, and teaching; any direct or indirect commercial use is expressly prohibited. All other rights to the material are reserved.
Abstract
Structured knowledge graphs (KGs) are increasingly popular, because language models (LMs), even with billions of parameters, do not always capture the semantic meaning of a given context. Retrieving an entire KG, however, is challenging because of its size and memory footprint. Moreover, reasoning over the whole KG to infer the answer to a question is slow and degrades the reasoning phase, which can lead to incorrect answers.
Pre-trained LMs have broad knowledge coverage but perform poorly on structured reasoning, such as handling negation and flipped conditions. We aim to retrieve only the relevant subgraph from a large KG. Existing subgraph retrieval solutions primarily rely on discriminative k-hop approaches or SPARQL queries over massive KGs. However, they require time-consuming operations that are unsustainable in real-world domains like biomedicine, where the number of entities and known relationships among them is massive. They also frequently depend on named entity linking (NEL) tools that fail to recognize and map entities and cannot generalize to similar or higher-order concepts. In contrast, approximate search over dense representations of KGs and text can significantly improve both the effectiveness and the efficiency of subgraph construction: it generalizes better, which overcomes the limits of NEL, and embeddings can be indexed to speed up top-K retrieval. In this work, we analyze the existing methods of subgraph construction. These methods fall short in efficiency because of the size and quality of the retrieved subgraph, which affect the reasoning process that extracts the answer to the question. We therefore propose a Subgraph Retrieval method that finds the most relevant entities by following linked paths (path queries) from the topic entities. The goal is to find the sequence of relations, and the entities they connect, linked to the topic entities by measuring their similarity in a dense embedding space.
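The path-based retrieval idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `retrieve_subgraph`, the toy triples, and the hand-written three-dimensional vectors are all hypothetical, and exhaustive cosine scoring stands in for the approximate search over an embedding index that the abstract mentions. The sketch assumes that each relation and the question already have dense vectors, and it expands paths from a topic entity while keeping only the top-K highest-scoring paths at each hop.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_subgraph(triples, relation_vecs, question_vec,
                      topic_entities, max_hops=2, top_k=2):
    """Beam search over relation paths that start at the topic entities.

    Each step is scored by the similarity between the question vector and
    the relation vector; a path's score is the sum of its step scores.
    Returns the set of triples on the paths kept at any hop.
    """
    outgoing = defaultdict(list)
    for head, rel, tail in triples:
        outgoing[head].append((rel, tail))

    # Each beam item: (path score, frontier entity, triples on the path).
    beam = [(0.0, entity, ()) for entity in topic_entities]
    subgraph = set()
    for _ in range(max_hops):
        candidates = []
        for score, entity, path in beam:
            for rel, tail in outgoing[entity]:
                step = cosine(question_vec, relation_vecs[rel])
                candidates.append(
                    (score + step, tail, path + ((entity, rel, tail),))
                )
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:top_k]  # keep only the top-K paths
        for _, _, path in beam:
            subgraph.update(path)
    return subgraph


# Toy data (hypothetical): hand-made vectors stand in for learned embeddings.
triples = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "interacts_with", "warfarin"),
    ("aspirin", "manufactured_by", "bayer"),
    ("warfarin", "treats", "thrombosis"),
    ("bayer", "located_in", "germany"),
]
relation_vecs = {
    "treats": (1.0, 0.0, 0.0),
    "interacts_with": (0.8, 0.6, 0.0),
    "manufactured_by": (0.0, 0.0, 1.0),
    "located_in": (0.0, 0.1, 1.0),
}
question_vec = (0.9, 0.4, 0.0)  # assumed encoding of a drug-interaction question

subgraph = retrieve_subgraph(triples, relation_vecs, question_vec, ["aspirin"])
for triple in sorted(subgraph):
    print(triple)
```

With this toy data, the triples about the manufacturer are pruned because their relation vectors are nearly orthogonal to the question vector, so the returned subgraph stays small. In a realistic setting the per-step scoring would be replaced by a top-K lookup in an index over learned embeddings.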
Document type
Degree thesis
(Master's degree)
Thesis author
Ramzan, Faisal
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Natural Language Processing, Open-domain Question Answering, Knowledge Graphs, Subgraph Retrieval, Multi-hop Reasoning
Thesis defence date
23 March 2023
URI