From Knowledge Extraction to Semantic Reasoning: A Comparative Analysis of LLMs

Pigliapoco, Francesco (2025) From Knowledge Extraction to Semantic Reasoning: A Comparative Analysis of LLMs. [Laurea magistrale], Università di Bologna, Degree Programme in Artificial Intelligence [LM-DM270]. Full-text document not available.
The full text is not available at the author's request. (Contact the author)

Abstract

The modern era is characterized by massive amounts of unstructured data that are difficult to manage and process. Knowledge Graphs organize this information by representing it as nodes and relationships, and an automatic knowledge extraction mechanism is the ideal way to populate them, transforming unstructured data into structured knowledge. Large Language Models (LLMs) are effective at extracting triples, but they still show limitations in semantic grounding, hallucinations, and the handling of textual ambiguity. This thesis conducts a comparative analysis of two recent LLMs from the LLaMA series, evaluating their performance on knowledge extraction and semantic reasoning tasks. The study focuses on the ability to transform unstructured text into knowledge triples and to perform entity linking against the DBpedia ontology. For the extraction task, the models were fine-tuned on the WebNLG dataset. For the reasoning task, a semantic benchmark was designed in a zero-shot setting: a multiple-choice entity linking task in which the models must map textual entities to their correct DBpedia URIs, drawn from a specific local reference knowledge graph, or, in the absence of an exact match, select the option indicating that no correct answer is available. Results show that both models achieve high performance in triple extraction, confirming the effectiveness of smaller models for this task. The entity linking benchmark, however, highlighted different behaviors between the models: LLaMA 3.1 8B achieved greater accuracy in choosing the correct entity, while LLaMA 3.2 3B demonstrated more reliable reasoning when no correct option was available. This work introduces a key consideration for LLM selection: response accuracy does not in itself guarantee reliable reasoning. The results emphasize that, while LLMs can achieve high accuracy in knowledge extraction, careful attention must be paid to the reliability of their reasoning.
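To make the benchmark design more concrete, the sketch below shows how a zero-shot multiple-choice entity linking prompt of the kind described above could be assembled: a mention in its sentence context, a set of candidate DBpedia URIs taken from a local reference knowledge graph, and an explicit option indicating that none of the candidates is correct. Since the full text is not available, the prompt wording, option labels, function names, and example candidates here are illustrative assumptions, not the author's exact setup.

# Minimal sketch of a zero-shot multiple-choice entity-linking prompt, as
# described in the abstract. All names and wording are assumptions; the
# thesis's exact prompt format is not published.

from typing import List

NONE_OPTION = "None of the listed URIs is correct"  # hypothetical label

def build_entity_linking_prompt(mention: str, context: str,
                                candidate_uris: List[str]) -> str:
    """Format one multiple-choice question: pick a DBpedia URI or 'none'."""
    options = candidate_uris + [NONE_OPTION]
    lines = [
        "Link the entity mention to its DBpedia URI.",
        f"Sentence: {context}",
        f"Mention: {mention}",
        "Options:",
    ]
    # Letter the options A, B, C, ... so the model can answer with one letter.
    for i, option in enumerate(options):
        lines.append(f"{chr(ord('A') + i)}. {option}")
    lines.append("Answer with the letter of the correct option only.")
    return "\n".join(lines)

if __name__ == "__main__":
    # Illustrative candidates; in the thesis they come from a local reference
    # knowledge graph extracted from DBpedia.
    prompt = build_entity_linking_prompt(
        mention="Bologna",
        context="The University of Bologna is the oldest university in continuous operation.",
        candidate_uris=[
            "http://dbpedia.org/resource/Bologna",
            "http://dbpedia.org/resource/Bologna_F.C._1909",
        ],
    )
    print(prompt)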

Document type: Thesis (Laurea magistrale)
Thesis author: Pigliapoco, Francesco
Thesis supervisor:
Thesis co-supervisor:
School:
Degree programme: Artificial Intelligence [LM-DM270]
Degree programme regulations: DM270
Keywords: Large Language Models (LLMs), Knowledge Extraction, Semantic Reasoning, Entity Linking, LLaMA, Fine-tuning, NLP
Thesis defence date: 7 October 2025
URI:
