Zangrillo, Francesco
(2024)
First-Order Logic Formulation and Injection into Large Language Models for Driven Question Answering.
[Master's thesis (Laurea magistrale)], Università di Bologna, Degree Programme in
Artificial Intelligence [LM-DM270].
The full text is not available at the author's request.
Abstract
This project enhances question answering (QA) with Large Language Models (LLMs) by integrating First-Order Logic (FOL) through an iterative refinement process that translates natural language directly into FOL. Unlike prior approaches such as SymbCoT and LogicLLaMA, which show limitations in planning and reasoning coherence on complex tasks, this approach embeds FOL predicates and logical clauses into both the contexts and the questions from the outset of model training. The method builds a FOL-augmented dataset from the MS MARCO benchmark and fine-tunes the model on it to strengthen its logical reasoning. The architecture performs iterative FOL translation directly on the inputs, maintaining logical consistency throughout the reasoning process, and combines a symbolic solver (Prover9) with a LogicLLaMA-based semantic refinement step to ensure that the generated logical structures are syntactically and semantically correct. Evaluation shows significant improvements in both exact match and ROUGE scores across inference settings: the FOL fine-tuned model reaches an exact match score of 0.226 in zero-shot inference, a gain of +16 points over the base model, while ROUGE-1 improves by +17 points, ROUGE-2 by +10 points, and ROUGE-L and ROUGE-Lsum by +19 points each. These findings indicate that integrating FOL from the early stages of training enables LLMs to handle complex reasoning tasks with greater accuracy and coherence. The method reduces inconsistencies in logical conclusions while providing clearer, more transparent justifications for model decisions, setting a new standard for logic-enhanced question-answering systems.
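The abstract describes an iterative translate-check-refine loop: the fine-tuned LLM proposes a FOL translation, a symbolic tool validates it, and LogicLLaMA repairs rejected candidates. Since the full text is unavailable, the sketch below is only an illustration of that loop under stated assumptions: `call_fol_translator` and `call_logicllama_refiner` are hypothetical stubs standing in for the two models, and NLTK's FOL parser stands in for the Prover9 syntax check described in the abstract.

```python
from nltk.sem.logic import LogicParser, LogicalExpressionException

parser = LogicParser()

def call_fol_translator(text: str) -> str:
    # Placeholder for the fine-tuned NL->FOL model; a real system would
    # prompt the LLM here. Returns a fixed toy formula so the sketch runs.
    return r"all x.(Student(x) -> Studies(x))"

def call_logicllama_refiner(text: str, bad_fol: str) -> str:
    # Placeholder for the LogicLLaMA-based syntactic/semantic repair step.
    return bad_fol

def is_syntactically_valid(fol: str) -> bool:
    """True if the candidate parses as a well-formed FOL expression."""
    try:
        parser.parse(fol)
        return True
    except LogicalExpressionException:
        return False

def translate_with_refinement(text: str, max_rounds: int = 3) -> str:
    """Translate `text` to FOL, looping until the candidate is well formed."""
    candidate = call_fol_translator(text)
    for _ in range(max_rounds):
        if is_syntactically_valid(candidate):
            break
        candidate = call_logicllama_refiner(text, candidate)
    return candidate

print(translate_with_refinement("Every student studies."))
```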
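The reported gains are measured with exact match and ROUGE (ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum). As an illustration of how such scores are typically computed, here is a minimal sketch using the Hugging Face `evaluate` library; the toy `predictions` and `references` lists are placeholders, not data from the thesis, which evaluates on a FOL-augmented MS MARCO split.

```python
import evaluate  # pip install evaluate rouge_score

# Load the two metrics used in the evaluation.
exact_match = evaluate.load("exact_match")
rouge = evaluate.load("rouge")

# Toy placeholders; the thesis uses model outputs on its test set.
predictions = ["paris is the capital of france"]
references = ["paris is the capital city of france"]

em = exact_match.compute(predictions=predictions, references=references)
rg = rouge.compute(predictions=predictions, references=references)

print(f"exact_match: {em['exact_match']:.3f}")
for key in ("rouge1", "rouge2", "rougeL", "rougeLsum"):
    print(f"{key}: {rg[key]:.3f}")
```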
Document type
Degree thesis (Laurea magistrale / Master's degree)
Thesis author
Zangrillo, Francesco
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Artificial Intelligence [LM-DM270]
Degree programme regulations
DM270
Keywords
Large Language Models, Natural Language Processing, Knowledge Enhanced Question Answering, Logical Reasoning, First-Order Logic
Thesis defence date
8 October 2024
URI