z-logo
open-access-imgOpen Access
Ranking queries on uncertain data
Author(s) -
Hua Ming,
Jian Pei,
Xuemin Lin
Publication year - 2010
Publication title -
the vldb journal
Language(s) - English
Resource type - Book series
SCImago Journal Rank - 0.653
H-Index - 90
eISSN - 0949-877X
pISSN - 1066-8888
ISBN - 1441993797
DOI - 10.1007/s00778-010-0196-4
Subject(s) - ranking (information retrieval) , uncertain data , rank (graph theory) , computer science , set (abstract data type) , probabilistic logic , data mining , ranking svm , information retrieval , mathematics , combinatorics , artificial intelligence , programming language
Uncertain data is inherent in a few important applications. It is far from trivial to extend ranking queries (also known as top-k queries), a popular type of queries on certain data, to uncertain data. In this paper, we cast ranking queries on uncertain data using three parameters: rank threshold k, probability threshold p, and answer set size threshold l. Systematically, we identify four types of ranking queries on uncertain data. First, a probability threshold top-k query computes the uncertain records taking a probability of at least p to be in the top-k list. Second, a top-(k, l) query returns the top-l uncertain records whose probabilities of being ranked among top-k are the largest. Third, the p-rank of an uncertain record is the smallest number k such that the record takes a probability of at least p to be ranked in the top-k list. A rank threshold top-k query retrieves the records whose p-ranks are at most k. Last, a top-(p, l) query returns the top-l uncertain records with the smallest p-ranks. To answer such ranking queries, we present an efficient exact algorithm, a fast sampling algorithm, and a Poisson approximation-based algorithm. To answer top-(k, l) queries and top-(p, l) queries, we propose PRist+, a compact index. An efficient index construction algorithm and efficacious query answering methods are developed for PRist+. An empirical study using real and synthetic data sets verifies the effectiveness of the probabilistic ranking queries and the efficiency of our methods.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom