Open Access
A novel approach to data deduplication over the engineering-oriented cloud systems
Author(s) - Zhe Sun, Jun Shen, Jianming Yong
Publication year - 2013
Publication title - Integrated Computer-Aided Engineering
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.801
H-Index - 42
eISSN - 1875-8835
pISSN - 1069-2509
DOI - 10.3233/ica-120418
Subject(s) - data deduplication , cloud computing , computer science , database , distributed computing , software engineering , data science , operating system
This paper presents a duplication-less storage system for engineering-oriented cloud computing platforms. Our deduplication storage system, which manages data and duplicates across the cloud, consists of two major components: a front-end deduplication application and a back-end mass storage system. The Hadoop Distributed File System (HDFS) is a common distributed file system on the cloud and is used together with the Hadoop database HBase. We use HDFS to build the mass storage system and HBase to build a fast indexing system. Combined with the deduplication application, these components yield a scalable, parallel deduplicated cloud storage system. We further use VMware to generate a simulated cloud environment. The simulation results demonstrate that our deduplication storage system is sufficiently accurate and efficient for distributed and cooperative data-intensive engineering applications.
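The division of labour described in the abstract (a front-end that detects duplicates, an index for fast lookup, and a bulk store for the data itself) can be illustrated with a minimal in-memory sketch. This is not the authors' implementation: the abstract does not say how duplicates are detected, so whole-file SHA-256 fingerprinting is an assumption here, and the class `DedupStore`, its methods, and the plain dictionaries standing in for HBase and HDFS are all hypothetical names chosen for illustration.

```python
import hashlib


class DedupStore:
    """Toy deduplicating store; dictionaries stand in for HBase and HDFS."""

    def __init__(self):
        self.index = {}  # fingerprint -> reference count (role of the HBase index)
        self.blobs = {}  # fingerprint -> content bytes (role of the HDFS store)
        self.files = {}  # logical file name -> fingerprint

    def put(self, name, data):
        """Store data under name; return True only if new bytes were written."""
        # Assumption: whole-file SHA-256 is used as the duplicate fingerprint.
        fp = hashlib.sha256(data).hexdigest()
        if self.files.get(name) == fp:
            return False  # same name and same content: nothing to do
        if name in self.files:
            self._release(self.files[name])  # name is being overwritten
        stored = fp not in self.index
        if stored:
            self.blobs[fp] = data  # first copy: write to the bulk store
            self.index[fp] = 0
        self.index[fp] += 1  # one more logical file points at this content
        self.files[name] = fp
        return stored

    def get(self, name):
        """Return the content stored under a logical file name."""
        return self.blobs[self.files[name]]

    def _release(self, fp):
        """Drop one reference; delete the content when no file uses it."""
        self.index[fp] -= 1
        if self.index[fp] == 0:
            del self.index[fp]
            del self.blobs[fp]


store = DedupStore()
first = store.put("a.dwg", b"model")   # new content, written once
second = store.put("b.dwg", b"model")  # duplicate, only the index changes
```

In this sketch the second `put` adds a reference in the index without writing the bytes again, which is the saving a deduplicated store aims for. A real deployment would replace the dictionaries with HBase and HDFS client calls, which are deliberately left out here.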
