AN INTERACTIVE SEARCH ASSISTANT ARCHITECTURE BASED ON INTRINSIC QUERY STREAM CHARACTERISTICS
Author(s) - BarouniEbrahimi M., Ghorbani Ali A.
Publication year - 2008
Publication title - Computational Intelligence
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.353
H-Index - 52
eISSN - 1467-8640
pISSN - 0824-7935
DOI - 10.1111/j.1467-8640.2008.00326.x
Subject(s) - computer science , web search query , information retrieval , web query classification , ranking (information retrieval) , search engine , sargable , phrase , phrase search , metric (unit) , query expansion , data stream mining , data stream , query language , data mining , sequence (biology) , artificial intelligence , telecommunications , operations management , genetics , biology , economics
Search engine query log mining has evolved over time into something closer to data stream mining, because queries arrive as an endless, continuous sequence known as a query stream. In this paper, we propose an online frequent sequence discovery (OFSD) algorithm that extracts frequent phrases from query streams, based on a new frequency rate metric suited to query stream mining. OFSD is an online, single-pass, real-time frequent sequence miner appropriate for data streams. The frequent phrases extracted by the OFSD algorithm are used to guide novice Web search engine users to complete their search queries more efficiently. YourEye, our online phrase recommender, is then introduced. The advantages of YourEye compared with Google Suggest, a service powered by Google for phrase suggestion, are also described. Various characteristics of two specific Web search engine query logs are analyzed, and the query logs are then used to evaluate YourEye. The experimental results confirm the significant benefit of monitoring frequent phrases within the queries instead of treating whole queries as non-separable items: the number of monitored elements decreases substantially, which results in smaller memory consumption as well as better performance. Re-ranking the retrieved pages based on past users' clicks for each frequent phrase extracted by OFSD is also introduced. The preliminary results show the advantages of the proposed method compared to the similar work reported in Smyth et al.
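To make the idea of monitoring phrases rather than whole queries concrete, the following is a minimal illustrative sketch in Python. It is not the OFSD algorithm from the paper: the class name `PhraseStreamCounter`, the parameters `max_len` and `min_rate`, and the definition of the rate as "number of queries containing the phrase divided by number of queries seen" are all assumptions made for illustration, since the abstract does not define the frequency rate metric. The sketch processes each query once, in arrival order, and suggests frequent phrases that start with what the user has typed.

```python
from collections import Counter


class PhraseStreamCounter:
    """Hypothetical single-pass counter of word n-grams in a query stream.

    Illustration only; this is not the paper's OFSD algorithm.
    """

    def __init__(self, max_len=3, min_rate=0.5):
        self.max_len = max_len    # longest phrase, in words (assumed parameter)
        self.min_rate = min_rate  # assumed threshold on count / queries seen
        self.counts = Counter()
        self.queries_seen = 0

    def observe(self, query):
        """Update counts with one query; a phrase counts once per query."""
        words = query.lower().split()
        phrases = set()
        for n in range(1, self.max_len + 1):
            for i in range(len(words) - n + 1):
                phrases.add(" ".join(words[i:i + n]))
        self.counts.update(phrases)
        self.queries_seen += 1

    def frequent(self):
        """Return {phrase: rate} for phrases whose rate meets the threshold."""
        if self.queries_seen == 0:
            return {}
        return {
            p: c / self.queries_seen
            for p, c in self.counts.items()
            if c / self.queries_seen >= self.min_rate
        }

    def suggest(self, prefix):
        """Frequent phrases starting with prefix, highest rate first."""
        prefix = prefix.lower()
        hits = [(p, r) for p, r in self.frequent().items() if p.startswith(prefix)]
        return [p for p, _ in sorted(hits, key=lambda x: (-x[1], x[0]))]


counter = PhraseStreamCounter(max_len=2, min_rate=0.5)
for q in ["data stream mining", "data stream", "stream processing", "query log"]:
    counter.observe(q)
print(counter.suggest("da"))
```

This sketch keeps every phrase it has ever seen, so it does not reproduce the memory savings the abstract reports; a real stream miner would also need a policy for discarding phrases that stay infrequent.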