Open Access
Where do good query terms come from?
Author(s) - Muresan Gheorghe, Roussinov Dmitri
Publication year - 2006
Publication title - Proceedings of the American Society for Information Science and Technology
Language(s) - English
Resource type - Journals
eISSN - 1550-8390
pISSN - 0044-7870
DOI - 10.1002/meet.14504301197
Subject(s) - computer science , query expansion , representation (politics) , information retrieval , relevance (law) , intuition , query optimization , set (abstract data type) , term (time) , web query classification , data mining , machine learning , web search query , search engine , philosophy , physics , epistemology , quantum mechanics , politics , political science , law , programming language
Abstract This paper describes a framework for investigating the quality of different query expansion approaches, and applies it in the HARD TREC experimental setting. The intuition behind our approach is that each topic has an optimal term‐based representation, i.e. a set of terms that best describe it, and that the effectiveness of any other representation is correlated with the overlap that it has with the optimal representation. Indeed, we find that, for a wide range of candidate topic representations, obtained through various query‐expansion approaches, there is a high correlation between standard effectiveness measures (R‐P, P@10, MAP) and term overlap with what is estimated to be the optimal representation. An important conclusion of comparing different query expansion approaches is that machines are better than humans at doing statistical calculations and at estimating which query terms are more likely to discriminate documents relevant to a given topic. This explains why, in the HARD track of TREC 2005, the overall conclusion was that interaction with the searcher and elicitation of additional information could not outperform automatic procedures for query improvement. However, the best results are obtained from hybrid approaches, in which human relevance judgments are used by algorithms for deriving term representations. This result suggests that the best approach to improving retrieval performance is probably to focus on implicit relevance feedback and novel interaction models based on ostension or mediation, which have shown great potential.
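The central idea of the abstract, that a representation's effectiveness tracks its term overlap with an estimated optimal representation, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not say how overlap is computed, so `term_overlap` uses the fraction of optimal terms covered; Pearson correlation stands in for whatever correlation statistic the authors used; and the term sets and scores are invented placeholders, not results from the paper.

```python
from math import sqrt


def term_overlap(candidate, optimal):
    """Fraction of the optimal terms that also appear in the candidate.

    Assumption: the paper's exact overlap measure is not given in the
    abstract; this coverage ratio is one plausible choice.
    """
    candidate, optimal = set(candidate), set(optimal)
    if not optimal:
        return 0.0
    return len(candidate & optimal) / len(optimal)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical optimal representation for a single topic.
optimal_terms = {"query", "expansion", "relevance", "feedback"}

# Hypothetical candidate representations, each paired with an invented
# effectiveness score (a placeholder for a measure such as MAP).
candidates = {
    "title_only": (["query", "terms"], 0.10),
    "human_expanded": (["query", "expansion", "search"], 0.18),
    "hybrid": (["query", "expansion", "relevance", "feedback", "terms"], 0.31),
}

overlaps = [term_overlap(terms, optimal_terms) for terms, _ in candidates.values()]
scores = [score for _, score in candidates.values()]
r = pearson(overlaps, scores)

for name, overlap in zip(candidates, overlaps):
    print(f"{name}: overlap={overlap:.2f}")
print(f"correlation between overlap and effectiveness: {r:.3f}")
```

In a real replication, the scores would come from an evaluation tool run over each candidate query, and the correlation would be computed across many topics and representations rather than three toy examples.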
