End-to-End Extraction and Injection of Graphs: A Self-Supervised Neuro-Symbolic Method for Explainable Large Language Models

Zecca, Andrea (2024) End-to-End Extraction and Injection of Graphs: A Self-Supervised Neuro-Symbolic Method for Explainable Large Language Models. [Master's thesis (Laurea magistrale)], Università di Bologna, Degree Programme in Artificial Intelligence [LM-DM270], restricted-access document.
Full-text documents available:
PDF document (Thesis)
Full text not accessible until 28 February 2026.
Available under license: Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 (CC BY-NC-ND 4.0)

Abstract

The increasing adoption of Large Language Models (LLMs) across diverse domains highlights the need to enhance their explainability, scalability, and adaptability to domain-specific challenges. This thesis introduces Graph In The Middle (GITM), a self-supervised neuro-symbolic framework for the end-to-end extraction and injection of graphs into LLMs. GITM leverages graph-based representations to improve the interpretability and reasoning capabilities of LLMs, particularly for Multiple Choice Question Answering (MCQA) tasks. The framework comprises three core components: the Graph Extractor, which generates adjacency matrices for relational modeling; the Graph Encoder, which transforms graph structures into embeddings; and the Answer Generator, which integrates graph-derived information into the LLM. SparseMAP and LP-SparseMAP provide efficient, interpretable structured predictions while preserving end-to-end differentiability. Experimental evaluation on the CommonsenseQA dataset, enriched with ConceptNet-derived graphs, demonstrates the efficacy of GITM. Using Llama3.2 (1B and 3B), we achieved an average accuracy of 63% with frozen LLMs (+29.58 points over the baseline) and 72.86% with trainable LLMs (+19.90 points over the baseline). These results highlight the critical role of graph-based embeddings in enhancing reasoning and explainability. This research bridges neural and symbolic reasoning, offering a robust methodology for improving LLMs while addressing hallucinations, interpretability, and domain-specific challenges. It advances explainable AI, with implications for transparency and trustworthiness in critical applications.
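
To make the pipeline description above concrete, the following is a minimal sketch of how the three components might be wired together, assuming a PyTorch implementation. The class names (GraphExtractor, GraphEncoder, AnswerGenerator) follow the abstract, but every signature, dimension, and the sigmoid stand-in for SparseMAP/LP-SparseMAP are illustrative assumptions, not the thesis code.

# Hypothetical sketch of the GITM pipeline described in the abstract.
# Class names follow the abstract; dimensions, signatures, and the use of
# PyTorch are assumptions, not the thesis implementation.
import torch
import torch.nn as nn

class GraphExtractor(nn.Module):
    """Scores candidate edges between concept nodes, yielding a soft adjacency matrix."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.src_proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, node_feats: torch.Tensor) -> torch.Tensor:
        # node_feats: (num_nodes, hidden_dim) concept embeddings.
        scores = self.src_proj(node_feats) @ node_feats.t()  # (num_nodes, num_nodes)
        # The thesis uses SparseMAP / LP-SparseMAP for sparse, structured,
        # end-to-end differentiable edge selection; a sigmoid stands in here.
        return torch.sigmoid(scores)

class GraphEncoder(nn.Module):
    """Encodes the extracted graph into embeddings via one round of message passing."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.proj(adj @ node_feats))

class AnswerGenerator(nn.Module):
    """Maps graph embeddings into the LLM embedding space (e.g. as soft prompts)."""
    def __init__(self, hidden_dim: int, llm_dim: int):
        super().__init__()
        self.adapter = nn.Linear(hidden_dim, llm_dim)

    def forward(self, graph_emb: torch.Tensor) -> torch.Tensor:
        # Returned vectors would be prepended to the (frozen or trainable)
        # LLM's input token embeddings before answering the MCQA prompt.
        return self.adapter(graph_emb)

if __name__ == "__main__":
    nodes = torch.randn(8, 64)                        # 8 concept nodes, 64-dim features
    adj = GraphExtractor(64)(nodes)                   # soft adjacency matrix (8, 8)
    graph_emb = GraphEncoder(64)(nodes, adj)          # graph-aware node embeddings (8, 64)
    prompts = AnswerGenerator(64, 2048)(graph_emb)    # soft prompts in LLM space (8, 2048)
    print(prompts.shape)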

Document type: Thesis (Laurea magistrale)
Thesis author: Zecca, Andrea
Thesis supervisor:
Thesis co-supervisor:
School:
Degree programme:
Degree programme regulations (Ordinamento CdS): DM270
Keywords: Large Language Models, End-to-end Pipeline, Graph Extraction, Sparsity, Knowledge-Enhanced Question Answering
Thesis defence date: 5 December 2024
URI:
