Dynamic query scheduling in parallel data warehouses
Author(s) - Märtens Holger, Rahm Erhard, Stöhr Thomas
Publication year - 2003
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.786
Subject(s) - computer science, data warehouse, bitmap, parallel database, scheduling (production processes), load balancing (electrical power), workload, star schema, skew, architecture, database, distributed computing, online analytical processing, materialized view, parallel computing, database design, view, operating system, grid, art, telecommunications, operations management, geometry, mathematics, database schema, economics, visual arts, computer vision
Abstract - Parallel processing is a key to high performance in very large data warehouse applications that execute complex analytical queries on huge amounts of data. Although parallel database systems (PDBSs) have been studied extensively in the past decades, the specifics of load balancing in parallel data warehouses have not been addressed in detail. In this study, we investigate how the load balancing potential of a Shared Disk (SD) architecture can be utilized for data warehouse applications. We propose an integrated scheduling strategy that simultaneously considers both processors and disks, regarding not only the total workload on each resource but also the distribution of load over time. We evaluate the performance of the new method in a comprehensive simulation study and compare it to several other approaches. The analysis incorporates skew aspects and considers typical data warehouse features such as star schemas with large fact tables and bitmap indices. Copyright © 2003 John Wiley & Sons, Ltd.