Neural Self-Supervised Information Retrieval: An Efficient and Effective Solution in Large Document Corpora

Marino, Samuele (2023) Neural Self-Supervised Information Retrieval: An Efficient and Effective Solution in Large Document Corpora. [Master's thesis (Laurea magistrale)], Università di Bologna, Degree Programme in Artificial Intelligence [LM-DM270], full-text document not available
The full text is not available, by the author's choice.

Abstract

This thesis examines the impact of Transformer models, such as BERT and GPT, on search engines. These models improve natural language processing by capturing intricate linguistic relationships and context, which leads to more precise and context-aware search results. The study recognizes the challenges posed by computational resources and unlabeled data and proposes a novel approach: a self-supervised information retrieval system that uses unlabeled data from GRUPPO MAGGIOLI. The research also aims to improve multilingual Transformer models, particularly for Italian, thereby contributing to the field of Italian information retrieval. It further highlights the need for specialized vector databases to efficiently handle the large volume of data generated by Transformer models. The vector database plays a crucial role in ensuring fast and accurate indexing, especially when managing extensive datasets such as those at GRUPPO MAGGIOLI. Overall, the thesis aims to provide insight into the adaptability, challenges, and potential impact of incorporating Transformer models into search engines.

Document type
Degree thesis (Laurea magistrale)
Thesis author
Marino, Samuele
Thesis supervisor
Thesis co-supervisor
School
Degree programme
Degree programme regulations (Ordinamento CdS)
DM270
Keywords
Natural Language Processing, Semantic Similarity Search, Information Retrieval, Vector Database, Index
Thesis defence date
16 December 2023
URI
