Predictive caching and prefetching of query results in search engines
Author(s) - Ronny Lempel, Shlomo Moran
Publication year - 2003
Publication title - CiteSeerX (The Pennsylvania State University)
Language(s) - English
Resource type - Conference proceedings
ISBN - 1-58113-680-3
DOI - 10.1145/775152.775156
Subject(s) - computer science, cache, search engine, trace (psycholinguistics), probabilistic logic, false sharing, scheme (mathematics), web search query, cpu cache, query expansion, database, response time, cache algorithms, information retrieval, parallel computing, operating system, artificial intelligence, mathematical analysis, philosophy, linguistics, mathematics
We study the caching of query result pages in Web search engines. Popular search engines receive millions of queries per day, and efficient policies for caching query results may enable them to lower their response time and reduce their hardware requirements. We present PDC (probability driven cache), a novel scheme tailored for caching search results that is based on a probabilistic model of search engine users. We then use a trace of over seven million queries submitted to the search engine AltaVista to evaluate PDC, as well as traditional LRU- and SLRU-based caching schemes. The trace-driven simulations show that PDC outperforms the other policies. We also examine the prefetching of search results, and demonstrate that prefetching can increase cache hit ratios by 50% for large caches, and can double the hit ratios of small caches. When integrating prefetching into PDC, we attain hit ratios of over 0.53.
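The kind of trace-driven evaluation the abstract describes can be illustrated with a minimal sketch. It is not the paper's PDC scheme or its simulator: it is a plain LRU cache of result pages, keyed by (query, page number), with an optional prefetch of follow-up pages on a miss. The class name `LRUResultCache`, the function `hit_ratio`, the `prefetch` parameter, and the toy `trace` are all hypothetical names and data chosen for illustration.

```python
from collections import OrderedDict


class LRUResultCache:
    """Illustrative LRU cache of result pages, keyed by (query, page_number)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # least recently used entry comes first

    def _insert(self, key):
        """Mark key as most recently used, evicting the LRU entry if over capacity."""
        if key in self.entries:
            self.entries.move_to_end(key)
            return
        self.entries[key] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)

    def request(self, query, page, prefetch=0):
        """Serve one request; return True on a cache hit.

        On a miss, also cache the next `prefetch` pages of the same query
        (an assumed prefetching policy, not the one evaluated in the paper).
        """
        key = (query, page)
        hit = key in self.entries
        if not hit:
            # Insert prefetched pages first so the requested page ends up
            # as the most recently used entry.
            for extra in range(prefetch, 0, -1):
                self._insert((query, page + extra))
        self._insert(key)
        return hit


def hit_ratio(trace, capacity, prefetch=0):
    """Replay a trace of (query, page) requests and return hits / requests."""
    cache = LRUResultCache(capacity)
    hits = sum(cache.request(query, page, prefetch) for query, page in trace)
    return hits / len(trace)


# Toy trace (made up for illustration): users often ask for page 2 after page 1.
trace = [("a", 1), ("a", 2), ("b", 1), ("a", 1), ("b", 2), ("c", 1), ("a", 2)]

if __name__ == "__main__":
    print("no prefetch:", hit_ratio(trace, capacity=4))
    print("prefetch 1 :", hit_ratio(trace, capacity=4, prefetch=1))
```

On a trace where users tend to request consecutive result pages, prefetching trades cache space for earlier availability of follow-up pages; whether that helps depends on cache size and the request pattern, which is what a replay over a real query log would measure.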