Open Access
A New Adaptive Coding Selection Method for Distributed Storage Systems
Author(s) - Bing Wei, Li-Min Xiao, Wei Wei, Yao Song, Bing-Yu Zhou
Publication year - 2018
Publication title - IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2018.2801265
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Erasure codes, such as Reed–Solomon (RS) codes and local reconstruction codes (LRCs), are increasingly adopted in distributed storage systems because they offer lower redundancy than data replication. While these codes save significant storage space, they can incur large I/O overhead and network traffic when reconstructing unavailable data. Most existing storage systems use replication for hot data and an erasure code for warm and cold data, thereby achieving a good tradeoff between storage overhead and recovery performance. However, these systems do not take the access characteristics of the data into account and tend to use only a single erasure code, which limits how far storage overhead and recovery cost can be reduced. In this paper, we propose a new adaptive coding selection method that instead uses multiple LRCs for warm data. The LRCs are selected based on the access characteristics of the data. Each time a file is accessed, we assume that each of the involved data blocks is unavailable in turn and calculate the I/O cost of recovering it under each candidate LRC. These I/O costs are summed for each LRC, and the LRC with the minimal total I/O cost is selected for the warm data. For cold data, we use an RS code that is optimized for storage overhead to reduce the storage burden. Our method is implemented on top of the Hadoop distributed file system. Evaluations show that it reduces the storage overhead by up to 5% and the reconstruction traffic by up to 22%.
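The selection step for warm data can be illustrated with a minimal sketch. It is not the authors' implementation: the `LRC` class, the `group_sizes` field, the candidate codes, and the access trace are all hypothetical, and the cost model (recovering a data block reads as many blocks as its local group holds, that is, the other data blocks of the group plus one local parity) is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence


@dataclass(frozen=True)
class LRC:
    """Hypothetical candidate LRC, described only by its local group sizes."""
    name: str
    group_sizes: Sequence[int]  # number of data blocks in each local group

    def recovery_cost(self, block_index: int) -> int:
        """Blocks read to rebuild one data block from its local group.

        Assumed cost model: the other data blocks of the group plus one
        local parity block, which equals the group size.
        """
        offset = block_index % sum(self.group_sizes)  # position within the stripe
        for size in self.group_sizes:
            if offset < size:
                return size
            offset -= size
        raise AssertionError("unreachable: offset always falls inside the stripe")


def total_io_cost(code: LRC, accesses: Iterable[Sequence[int]]) -> int:
    """Sum recovery costs, treating every accessed block as unavailable in turn."""
    return sum(code.recovery_cost(b) for blocks in accesses for b in blocks)


def select_lrc(candidates: Sequence[LRC], accesses: List[Sequence[int]]) -> LRC:
    """Return the candidate with the smallest summed I/O cost."""
    return min(candidates, key=lambda code: total_io_cost(code, accesses))


# Illustrative candidates: both encode 12 data blocks per stripe.
candidates = [
    LRC("uniform", (6, 6)),
    LRC("skewed", (3, 9)),
]
# Each access lists the data-block indices touched by one file read (made-up trace).
accesses = [[0, 1, 2], [0, 1], [10]]
best = select_lrc(candidates, accesses)
```

With this trace, most reads touch the first three blocks, so the candidate with a small first group accumulates the lower total cost. A real system would also have to account for the storage overhead of each candidate, which this sketch ignores.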
