An improved join‐free snowflake schema for ETL and OLAP of data warehouse
Author(s) - Jianmin Wang, Wenbin Zhao, Tongrang Fan, Shilong Yang, Hongwei Lv
Publication year - 2020
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.5519
Subject(s) - data warehouse, online analytical processing, computer science, nosql, database, materialized view, sql, code (set theory), dimensional modeling, data mining, view, database design, scalability, set (abstract data type), programming language
Summary - The emergence of big data is leading more and more enterprises to shift their data management strategy from simple data storage to OLAP query analysis; meanwhile, NoSQL‐based data warehouses are receiving more attention than traditional SQL‐based databases. By improving the join‐free snowflake schema (JFSS) model for ETL, this paper proposes the uniform distribution code (UDC), model identification code (MIC), standard dimension code (SDC), and attribute dimensional code (ADC); defines the data storage format of ; and identifies the extraction, transformation, and loading strategies of the data warehouse. Several experiments are carried out on the Hadoop database (HBase) to analyze single‐record and range‐record queries as typical OLAP workloads. The results show that the proposed scheme incurs lower overhead than a traditional SQL‐based database while facilitating the scope and flexibility of data warehouse services.