ligDB - Online Query Processing Without (almost) any Storage
Author(s) - Evica Milchevski, Sebastian Michel
Publication year - 2015
Language(s) - English
DOI - 10.5441/002/edbt.2015.69
In the big-data era, data arrives at such a high pace and volume that data exploration and querying are feasible only if data loading and indexing happen reasonably quickly, if at all. Recent research on handling large scientific data suggests skipping database indexing and even data loading altogether, and instead processing raw data as it is handed in by scientists, manually or by semi-automated means, in multiple iterative steps if needed. In this paper, we describe the anatomy and research challenges of a system coined ligDB that operates purely on incomplete database tables, JSON documents, or sets of SPO triplets that are filled over time. No data is stored per se; the only data stored stems from previously posed queries over the stream of arriving data, and it is kept as long as it is used by forthcoming queries and evicted otherwise. A key point is that the velocity dimension of "big data" allows queries to be processed as they are posted, with higher-level queries processed on historic query results (views) and live data. Data that is not touched by any posted query is immediately discarded.
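The storage model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class and method names (`QueryDrivenStore`, `post_query`, `ingest`, `read`) are hypothetical, queries are reduced to simple predicates over dictionary records, and the eviction rule (drop a view once it has gone unread for more than `max_idle` arriving records) is an assumption chosen only to make "kept as long as it is used" concrete.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class View:
    """Result of one posted query (hypothetical structure)."""
    predicate: Callable[[Record], bool]
    rows: List[Record] = field(default_factory=list)
    last_used: int = 0  # logical time of the last read


class QueryDrivenStore:
    """Keeps only data touched by posted queries; everything else is dropped."""

    def __init__(self, max_idle: int) -> None:
        self.max_idle = max_idle  # assumed eviction threshold, in arriving records
        self.clock = 0            # logical time: number of records seen so far
        self.views: Dict[str, View] = {}
        self.discarded = 0

    def post_query(self, name: str, predicate: Callable[[Record], bool]) -> None:
        # A query only sees data arriving after it was posted.
        self.views[name] = View(predicate, last_used=self.clock)

    def ingest(self, record: Record) -> None:
        self.clock += 1
        touched = False
        for view in self.views.values():
            if view.predicate(record):
                view.rows.append(record)
                touched = True
        if not touched:
            self.discarded += 1  # no posted query touches it: not stored anywhere
        self._evict()

    def read(self, name: str) -> List[Record]:
        # Reading a view counts as use and postpones its eviction.
        view = self.views[name]
        view.last_used = self.clock
        return list(view.rows)

    def _evict(self) -> None:
        stale = [n for n, v in self.views.items()
                 if self.clock - v.last_used > self.max_idle]
        for name in stale:
            del self.views[name]


store = QueryDrivenStore(max_idle=3)
store.post_query("hot", lambda r: r["temp"] > 30)
for temp in (25, 35, 40):
    store.ingest({"temp": temp})

# A higher-level query evaluated over a historic query result (the "hot" view).
very_hot = [r for r in store.read("hot") if r["temp"] > 38]
```

In this sketch the record with `temp` 25 is never stored because no posted query matches it, and the `hot` view disappears if it is not read again within the assumed idle window.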
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom