z-logo
open-access-imgOpen Access
Optimal algorithms for selecting top-k combinations of attributes: theory and applications
Author(s) -
Chunbin Lin,
Jiaheng Lu,
Zhewei Wei,
Jianguo Wang,
Xiaokui Xiao
Publication year - 2017
Publication title -
the vldb journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.653
H-Index - 90
eISSN - 0949-877X
pISSN - 1066-8888
DOI - 10.1007/s00778-017-0485-2
Subject(s) - computer science , scalability , extension (predicate logic) , algorithm , online aggregation , information retrieval , data mining , theoretical computer science , sargable , search engine , database , web search query , programming language
Traditional top- algorithms, e.g., TA and NRA, have been successfully applied in many areas such as information retrieval, data mining and databases. They are designed to discover objects, e.g., top- restaurants, with highest overall scores aggregated from different attributes, e.g., price and location. However, new emerging applications like query recommendation require providing the best combinations of , instead of objects. The straightforward extension based on the existing top- algorithms is prohibitively expensive to answer top- combinations because they need to enumerate all the possible combinations, which is exponential to the number of attributes. In this article, we formalize a novel type of top- query, called top-, , which aims to find top- combinations of attributes based on the overall scores of the top- objects within each combination, where is the number of objects forming a combination. We propose a family of efficient top-,  algorithms with different data access methods, i.e., sorted accesses and random accesses and different query certainties, i.e., exact query processing and approximate query processing. Theoretically, we prove that our algorithms are instance optimal and analyze the bound of the depth of accesses. We further develop optimizations for efficient query evaluation to reduce the computational and the memory costs and the number of accesses. We provide a case study on the real applications of top-,  queries for an online biomedical search engine. Finally, we perform comprehensive experiments to demonstrate the scalability and efficiency of top-,  algorithms on multiple real-life datasets.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom