From text to knowledge: Large Language Models-based methods for knowledge extraction

Pappacoda, Gianmarco (2023) From text to knowledge: Large Language Models-based methods for knowledge extraction. [Master's degree thesis], Università di Bologna, Degree Programme in Artificial intelligence [LM-DM270], Full-text document not available

The full text is not available, by the author's choice.

Abstract

Most human knowledge is stored in text, in an unstructured form that is hard for machines to access. Historically, most effort has gone into extracting information through Information Retrieval and Information Extraction methods. While these methods allow very precise queries over sets of documents, most of the knowledge stored in texts remains untapped. The process of extracting knowledge from text is known as Knowledge Extraction. In this dissertation we explore the field of Knowledge Extraction through the lens of computational linguistics by leveraging Language Models. The primary objective is to understand whether (Large) Language Models can effectively extract knowledge from text, and how Language Models interact with knowledge representation methods such as Knowledge Graphs. First, we provide a comprehensive review of the literature surrounding the field and briefly discuss related tasks. We also discuss Knowledge Bases and their representations, with a particular focus on the challenges they pose. In the second part we describe our approach to Knowledge Extraction using Large Language Models, our experimental setup, and the methods used to evaluate the models on this task. We then present the results of our work, showing that fine-tuned LLMs can be effectively used for extracting knowledge from text. We provide a baseline for further experimentation and compare the models on popular benchmarks. Lastly, we briefly discuss an important side objective of this dissertation: whether interaction with more refined forms of knowledge enhances Language Models' reasoning capabilities.

Document type

Degree thesis (Master's degree)

Thesis author

Pappacoda, Gianmarco

Thesis supervisor

Torroni, Paolo

Thesis co-supervisor

Ruggeri, Federico

School

Engineering and Architecture

Degree programme