From text to knowledge: Large Language Models-based methods for knowledge extraction

Pappacoda, Gianmarco (2023) From text to knowledge: Large Language Models-based methods for knowledge extraction. [Master's degree thesis], Università di Bologna, Degree programme in Artificial Intelligence [LM-DM270], full-text document not available
The full text is not available by the author's choice.

Abstract

Most human knowledge is stored in text, an unstructured form that machines can hardly access. Historically, most effort has gone into extracting information through Information Retrieval and Information Extraction methods. While these methods allow very precise queries over sets of documents, most of the knowledge stored in texts remains untapped. The process of extracting knowledge from text is known as Knowledge Extraction. In this dissertation we explore the field of Knowledge Extraction through the lens of computational linguistics by leveraging Language Models. The primary objective is to understand whether (Large) Language Models can effectively extract knowledge from text, and how Language Models interact with knowledge representation methods such as Knowledge Graphs. First, we provide a comprehensive review of the literature surrounding the field and briefly discuss related tasks. We also discuss Knowledge Bases and their representations, with a particular focus on the challenges they pose. In the second part we describe our approach to Knowledge Extraction using Large Language Models, our experimental setup, and the methods used to evaluate the models on this task. We then present the results of our work, showing that fine-tuned LLMs can be effectively used for extracting knowledge from text. We provide a baseline for further experimentation and compare the models on popular benchmarks. Lastly, we briefly discuss an important side objective of this dissertation: whether interaction with more refined forms of knowledge enhances Language Models' reasoning capabilities.

Document type
Degree thesis (Master's degree)
Thesis author
Pappacoda, Gianmarco
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulation
DM270
Keywords
Large Language Models, Knowledge Graph, Knowledge Extraction, Text-to-Graph, Natural Language Processing, Natural Language Understanding
Thesis defense date
21 October 2023
URI
