Query rewriting in information retrieval: automatic context extraction from local user documents to improve query results

Lipani, Aldo (2012) Query rewriting in information retrieval: automatic context extraction from local user documents to improve query results. [Laurea magistrale], Università di Bologna, Corso di Studio in Ingegneria informatica [LM-DM270], Documento ad accesso riservato.
Documenti full-text disponibili:
[img] Documento PDF
Full-text non accessibile

Download (1MB) | Contatta l'autore

Abstract

The central objective of research in Information Retrieval (IR) is to discover new techniques to retrieve relevant information in order to satisfy an Information Need. The Information Need is satisfied when relevant information can be provided to the user. In IR, relevance is a fundamental concept which has changed over time, from popular to personal, i.e., what was considered relevant before was information for the whole population, but what is considered relevant now is specific information for each user. Hence, there is a need to connect the behavior of the system to the condition of a particular person and his social context; thereby an interdisciplinary sector called Human-Centered Computing was born. For the modern search engine, the information extracted for the individual user is crucial. According to the Personalized Search (PS), two different techniques are necessary to personalize a search: contextualization (interconnected conditions that occur in an activity), and individualization (characteristics that distinguish an individual). This movement of focus to the individual's need undermines the rigid linearity of the classical model overtaken the ``berry picking'' model which explains that the terms change thanks to the informational feedback received from the search activity introducing the concept of evolution of search terms. The development of Information Foraging theory, which observed the correlations between animal foraging and human information foraging, also contributed to this transformation through attempts to optimize the cost-benefit ratio. This thesis arose from the need to satisfy human individuality when searching for information, and it develops a synergistic collaboration between the frontiers of technological innovation and the recent advances in IR. The search method developed exploits what is relevant for the user by changing radically the way in which an Information Need is expressed, because now it is expressed through the generation of the query and its own context. As a matter of fact the method was born under the pretense to improve the quality of search by rewriting the query based on the contexts automatically generated from a local knowledge base. Furthermore, the idea of optimizing each IR system has led to develop it as a middleware of interaction between the user and the IR system. Thereby the system has just two possible actions: rewriting the query, and reordering the result. Equivalent actions to the approach was described from the PS that generally exploits information derived from analysis of user behavior, while the proposed approach exploits knowledge provided by the user. The thesis went further to generate a novel method for an assessment procedure, according to the "Cranfield paradigm", in order to evaluate this type of IR systems. The results achieved are interesting considering both the effectiveness achieved and the innovative approach undertaken together with the several applications inspired using a local knowledge base.

Abstract
Tipologia del documento
Tesi di laurea (Laurea magistrale)
Autore della tesi
Lipani, Aldo
Relatore della tesi
Correlatore della tesi
Scuola
Corso di studio
Ordinamento Cds
DM270
Parole chiave
Information Retrieval Personalized Search Relevance Feedback
Data di discussione della Tesi
19 Dicembre 2012
URI

Altri metadati

Gestione del documento: Visualizza il documento

^