Prepared scan
Author(s) - Francisco Neves, Ricardo Vilaça, José Pereira, Rui Oliveira
Publication year - 2017
Publication title - Portuguese National Funding Agency for Science, Research and Technology (RCAAP Project by FCT)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/3019612.3019863
Subject(s) - nosql, computer science, exploit, bandwidth (computing), schema (genetic algorithms), flexibility (engineering), benchmark (surveying), database, relational database, data mining, distributed computing, big data, computer network, information retrieval, statistics, computer security, mathematics, geodesy, geography
The ability of NoSQL systems to scale better than traditional relational databases motivates many applications to migrate their data to NoSQL systems, even when they do not aim to exploit the schema flexibility these systems provide. However, this flexibility makes accessing structured data costly, consuming considerable network bandwidth and CPU time. In this paper, we analyse this cost in Apache HBase and propose a new scan operation, named Prepared Scan, that optimizes access to regularly structured data by taking advantage of a schema known to the application. Using an industry-standard benchmark, we show that Prepared Scan improves throughput by up to 29% and decreases network bandwidth consumption by up to 20%.
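To make the cost of schema flexibility concrete, the following is a minimal sketch and not the paper's implementation. It compares a self-describing cell encoding, loosely modelled on HBase's KeyValue layout (every cell repeats the row key, column family, qualifier, timestamp and type), with a hypothetical schema-aware encoding that sends the row key once and then only the values, in an order both sides agree on. The function names, the sample row and the schema-aware wire format are assumptions made for illustration, and the schema-aware variant drops timestamps for simplicity.

```python
import struct

PUT_TYPE = 4  # type code used here for a "put" cell (illustrative)


def encode_cell(row: bytes, family: bytes, qualifier: bytes,
                timestamp: int, value: bytes) -> bytes:
    """Encode one self-describing cell: lengths, full key, then value."""
    key = (
        struct.pack(">H", len(row)) + row            # 2-byte row length + row key
        + struct.pack(">B", len(family)) + family    # 1-byte family length + family
        + qualifier                                  # column qualifier
        + struct.pack(">QB", timestamp, PUT_TYPE)    # 8-byte timestamp + 1-byte type
    )
    return struct.pack(">II", len(key), len(value)) + key + value


def encode_row_flexible(row: bytes, family: bytes,
                        columns: dict, timestamp: int) -> bytes:
    """Encode a row as independent cells; the key metadata repeats per cell."""
    return b"".join(
        encode_cell(row, family, qualifier, timestamp, value)
        for qualifier, value in columns.items()
    )


def encode_row_with_schema(row: bytes, schema: list, columns: dict) -> bytes:
    """Hypothetical encoding: row key once, then values in schema order."""
    out = struct.pack(">H", len(row)) + row
    for qualifier in schema:
        value = columns[qualifier]
        out += struct.pack(">I", len(value)) + value  # 4-byte length + value
    return out


def decode_row_with_schema(payload: bytes, schema: list):
    """Rebuild the row on the client using the schema it already knows."""
    (row_len,) = struct.unpack_from(">H", payload, 0)
    offset = 2
    row = payload[offset:offset + row_len]
    offset += row_len
    columns = {}
    for qualifier in schema:
        (value_len,) = struct.unpack_from(">I", payload, offset)
        offset += 4
        columns[qualifier] = payload[offset:offset + value_len]
        offset += value_len
    return row, columns


if __name__ == "__main__":
    schema = [b"name", b"city", b"balance"]
    columns = {b"name": b"Alice", b"city": b"Braga", b"balance": b"1200"}
    flexible = encode_row_flexible(b"customer#0001", b"cf", columns, 1)
    compact = encode_row_with_schema(b"customer#0001", schema, columns)
    print(len(flexible), len(compact))
```

Because the client already holds the schema, it can reconstruct the qualifiers itself, so the saving grows with the number of columns per row and with the length of the row key and qualifiers.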