Open Access
Scientific data analysis on data-parallel platforms.
Author(s) - Craig D. Ulmer, Gregory W. Bayer, Yung Ryn Choe, Diana C. Roe
Publication year - 2010
Language(s) - English
Resource type - Reports
DOI - 10.2172/1011199
Subject(s) - terabyte, computer science, data warehouse, data science, process (computing), informatics, database, function (biology), data mining, operating system, engineering, evolutionary biology, electrical engineering, biology
As scientific computing users migrate to petaflop platforms that promise to generate multi-terabyte datasets, there is a growing need in the community to embed sophisticated analysis algorithms in the computing platforms' storage systems. Data Warehouse Appliances (DWAs) are attractive for this work because they can store and process massive datasets efficiently. While DWAs have been used effectively in data-mining and informatics applications, they remain largely unproven in scientific workloads. In this paper we present our experiences in adapting two mesh analysis algorithms to function on five different DWA architectures: two Netezza database appliances, an XtremeData dbX database, a LexisNexis DAS, and multiple Hadoop MapReduce clusters. The main contribution of this work is insight into the differences between these DWAs from a user's perspective. In addition, we present performance measurements for ten DWA systems to help understand the impact of different architectural trade-offs in these systems.
