Monaldini, Nicolo
(2024)
Large Action Models: End-to-End Retrieval-Enhanced Learning for Generating Function Calls from Instruction Manuals.
[Laurea], Università di Bologna, Degree Programme in
Ingegneria e scienze informatiche [L-DM270] - Cesena, restricted-access document.
Abstract
Large action models are agents that leverage the reasoning abilities of large language models (LLMs) to make decisions in real-life scenarios. Existing approaches often fine-tune LLMs for specific instruction-function mappings, which limits generalizability and leads to eventual obsolescence. LLMs that support function calling natively require a list of tools to be passed to the model, usually in JSON format, so their context length limits the number of supported tools. Additionally, using this functionality with pay-as-you-go closed-source models can result in high inference costs. Moreover, even cutting-edge models can hallucinate or make errors in tool selection when presented with many options. This work evaluates how retrieval-augmented generation can improve generalization in tool selection and function-calling tasks. Specifically, we treat the LLM as a frozen black box and augment it with a tunable retriever module, trained to find the documentation chunks that maximize the LLM's accuracy in tool selection. For a given query, the retriever preselects relevant function headers, mitigating the context length restrictions and hallucinations of the LLM it is attached to. We conducted extensive evaluations with various models, datasets, and loss functions. For functions already seen during training, using RoBERTa-base, we observed significant performance improvements: Rank@1 increased from 22.5% to 85.0%, Rank@2 from 27.5% to 100.0%, and Rank@3 from 30.0% to 100.0%. For functions not seen during training, Rank@1 improved from 40.0% to 75.0%, Rank@2 from 50.0% to 97.5%, and Rank@3 from 50.0% to 97.5%.
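To make the preselection step and the Rank@k figures more concrete, the following is a minimal sketch and not the thesis implementation. It assumes that Rank@k means the fraction of queries whose correct function header appears among the top k retrieved candidates, which the abstract does not state explicitly. A toy bag-of-words similarity stands in for the trained retriever, and the function headers, queries, and helper names (`embed`, `preselect`, `rank_at_k`) are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a trained encoder."""
    for ch in "(),_":
        text = text.replace(ch, " ")
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def preselect(query, headers, k):
    """Return the k function headers most similar to the query."""
    q = embed(query)
    ranked = sorted(headers, key=lambda h: cosine(q, embed(h)), reverse=True)
    return ranked[:k]


def rank_at_k(examples, headers, k):
    """Fraction of queries whose gold header is within the top-k results."""
    hits = sum(gold in preselect(query, headers, k) for query, gold in examples)
    return hits / len(examples)


# Hypothetical tool catalogue and labelled queries, for illustration only.
HEADERS = [
    "set_alarm(time, label)",
    "send_email(recipient, subject, body)",
    "get_weather(city, date)",
    "play_music(artist, track)",
]
EXAMPLES = [
    ("send an email to the recipient with this subject",
     "send_email(recipient, subject, body)"),
    ("what is the weather in this city", "get_weather(city, date)"),
    ("set an alarm with the label gym", "set_alarm(time, label)"),
]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"Rank@{k}: {rank_at_k(EXAMPLES, HEADERS, k):.3f}")
```

In a setup like the one the abstract describes, only the preselected headers would be passed to the frozen LLM, so the prompt size depends on k rather than on the size of the full tool catalogue.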
Document type
Degree thesis
(Laurea)
Thesis author
Monaldini, Nicolo
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Large Action Models, Agents, End-to-End Retrieval-Augmented Generation, Function Calling, Natural Language Processing
Thesis defense date
18 July 2024
URI