Early user-system interaction for database selection in massive domain-specific online environments
Author(s) - Jack G. Conrad, Joanne R. S. Claussen
Publication year - 2003
Publication title - ACM Transactions on Office Information Systems
Language(s) - English
Resource type - Journals
eISSN - 1558-1152
pISSN - 0734-2047
DOI - 10.1145/635484.635488
Subject(s) - computer science, selection (genetic algorithm), information retrieval, dialog box, categorization, process (computing), task (project management), precision and recall, domain (mathematical analysis), database, data mining, world wide web, machine learning, artificial intelligence, management, economics, operating system, mathematical analysis, mathematics
The continued growth of very large data environments such as Westlaw and Dialog, in addition to the World Wide Web, increases the importance of effective and efficient database selection and searching. Current research focuses largely on completely autonomous and automatic selection, searching, and results merging in distributed environments. This fully automatic approach has significant deficiencies, including reliance upon thresholds below which databases with relevant documents are not searched (compromised recall). It also merges documents, often from disparate data sources that users may have discarded before their source selection task proceeded (diluted precision). We examine the impact that early user interaction can have on the process of database selection. After analyzing thousands of real user queries, we show that precision can be significantly increased when queries are categorized by the users themselves, then handled effectively by the system. Such query categorization strategies may eliminate limitations of fully automated query processing approaches. Our system harnesses the WIN search engine, a sibling to INQUERY, run against one or more authority sources when search is required. We compare our approach to one that does not recognize or utilize distinct features associated with user queries. We show that by avoiding a one-size-fits-all approach that restricts the role users can play in information discovery, database selection effectiveness can be appreciably improved.