
Towards a Low Cost ETL System
Author(s) -
Vasco Santos,
Rui Silva,
Orlando Belo
Publication year - 2014
Publication title -
International Journal of Database Management Systems
Language(s) - English
Resource type - Journals
eISSN - 0975-5985
pISSN - 0975-5705
DOI - 10.5121/ijdms.2014.6205
Subject(s) - computer science
Data warehouses store integrated and consistent data in a subject-oriented repository dedicated to supporting business intelligence processes. However, keeping these repositories up to date usually involves complex and time-consuming processes, commonly known as Extract-Transform-Load (ETL) tasks. These data-intensive tasks normally execute within a limited time window, and their computational requirements tend to grow over time as more data is handled. Therefore, we believe that a grid environment is well suited to serve as the backbone of the technical infrastructure, with the clear financial advantage of using desktop computers that the organization has already acquired. This article proposes a different approach to distributing ETL processes in a grid environment, taking into account not only the processing performance of its nodes but also the available bandwidth, in order to estimate grid availability in the near future and thereby optimize workflow distribution.
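
As a rough illustration of the idea stated in the abstract, the sketch below ranks grid nodes by an estimated completion time that combines transfer time (data size over bandwidth) with processing time (record count over processing rate). The `Node` fields, the `estimate_seconds` formula, and the sample figures are hypothetical assumptions chosen for illustration; they are not the method described in the article.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical description of one desktop machine in the grid."""
    name: str
    records_per_second: float  # assumed processing rate
    bandwidth_mbps: float      # assumed available bandwidth, megabits per second


def estimate_seconds(node: Node, data_mb: float, records: int) -> float:
    """Estimate time to ship the data to the node and process it there."""
    transfer = data_mb * 8 / node.bandwidth_mbps   # megabytes -> megabits
    processing = records / node.records_per_second
    return transfer + processing


def pick_node(nodes: list[Node], data_mb: float, records: int) -> Node:
    """Choose the node with the lowest estimated completion time."""
    return min(nodes, key=lambda n: estimate_seconds(n, data_mb, records))


# Illustrative numbers only.
nodes = [
    Node("fast-cpu-slow-link", records_per_second=50_000.0, bandwidth_mbps=10.0),
    Node("slow-cpu-fast-link", records_per_second=20_000.0, bandwidth_mbps=100.0),
]
best = pick_node(nodes, data_mb=100.0, records=1_000_000)
print(best.name)
```

With these assumed figures the node with the slower processor but the faster link is selected, which shows why a scheduler that looks only at processing performance can make a poor placement for data-intensive ETL tasks.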