Abstract
Large Language Models (LLMs), especially in the form of generative chatbots, have recently attracted significant attention across many contexts. Chatbots are designed to be simple and intuitive, offering natural conversational interfaces that match how users already communicate. They can handle a wide range of tasks, from everyday inquiries to complex professional applications. However, current chatbots are largely limited to answering queries about general, publicly available information. This limitation reduces their effectiveness in specialized fields that demand more granular and contextualized information.
This thesis proposes a prototype conversational agent designed specifically for the needs of the University of Bologna. The primary objective is to enable the chatbot to respond comprehensively to all types of questions a student might ask about the university context. This goal is pursued by integrating authentic information, so that the chatbot can respond precisely and accurately to student inquiries. To this end, the project implements a Retrieval-Augmented Generation (RAG) pipeline that draws on a custom dataset of University of Bologna-specific information, scraped and cleaned from the official university website. This approach combines the natural language understanding and response generation capabilities of LLMs with targeted academic knowledge. Preliminary results indicate an improvement of approximately 25-30% in the chatbot's ability to answer questions relevant to the university setting, compared to direct generation using the same language model. This allows the chatbot to provide detailed answers on topics such as course structures and Erasmus programs.
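The retrieve-then-generate idea behind such a pipeline can be illustrated with a minimal sketch. This is not the thesis's implementation: the sample passages, the function names (`retrieve`, `build_prompt`) and the bag-of-words cosine scoring are all assumptions made for illustration, and a real system would more likely use learned embeddings and pass the prompt to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for passages scraped from the university website.
DOCUMENTS = [
    "Erasmus applications are submitted online through the student portal.",
    "The course structure lists mandatory and elective learning activities.",
    "Tuition fees are paid in instalments during the academic year.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return up to k passages with non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Assemble the augmented prompt that would be sent to the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


question = "How do I submit Erasmus applications?"
prompt = build_prompt(question, retrieve(question, DOCUMENTS))
```

Direct generation, the baseline mentioned in the abstract, would correspond to sending `question` to the model without the retrieved context.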
Document type
Degree thesis
(Laurea)
Thesis author
Amadori, Nicolas
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations
DM270
Keywords
Large Language Models, Web Scraping, Data Extraction, Generative Chatbot, Retrieval-Augmented Generation
Thesis defense date
28 November 2024
URI