z-logo
open-access-imgOpen Access
Optimizing relational algebra operations using generic equivalence discriminators and lazy products
Author(s) -
Fritz Henglein
Publication year - 2010
Publication title -
citeseer x (the pennsylvania state university)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/1706356.1706372
Subject(s) - joins , relational algebra , computer science , relational database , equivalence (formal languages) , conjunctive query , relational calculus , codd's theorem , relational model , theoretical computer science , product (mathematics) , filter (signal processing) , algebra over a field , programming language , information retrieval , mathematics , discrete mathematics , geometry , pure mathematics , computer vision
We show how to efficiently evaluate generic map-filter-product queries, generalizations of select-project-join (SPJ) queries in relational algebra, based on a combination of two novel techniques: generic discrimination-based joins and lazy (formal) products. Discrimination-based joins are based on the notion of (equivalence) discriminator. A discriminator partitions a list of values according to a user-specified equivalence relation on keys the values are associated with. Equivalence relations can be specified in an expressive embedded language for denoting equivalence relations. We show that discriminators can be constructed generically (by structural recursion on equivalence expressions), purely functionally, and efficiently (worst-case linear time). The array-based basic multiset discrimination algorithm of Cai and Paige (1995) provides a base discriminator that is both asymptotically and practically efficient. In contrast to hashing, discrimination is fully abstract (only depends on which equivalences hold on its inputs), and in contrast to comparison-based sorting, it does not require an ordering relation on its inputs. In particular, it is applicable to references (pointers). Furthermore, it has better asymptotic computational complexity than both sorting and hashing. We represent cross-products and unions lazily (symbolically) as formal products of the argument sets (relations). This allows the selection operation to recognize on the fly whenever it is applied to a cross-product and invoke an efficient equijoin implementation. In particular, queries can still be formulated naively, using filter, map and product without an explicit join operation, yet garner the advantages of efficient join-algorithms during evaluation. The techniques subsume many of the optimization techniques based on relational algebra equalities, without need for a query preprocessing phase. They require no indexes and behave purely functionally. They can be considered a form of symbolic execution of set expressions that automate and encapsulate dynamic program transformation of such expressions and lead to asymptotic performance improvements over naive execution in many cases.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom