Random-forest-based failure prediction for hard disk drives
Author(s) -
Shen Jing,
Jian Wan,
SeJung Lim,
Lifeng Yu
Publication year - 2018
Publication title -
International Journal of Distributed Sensor Networks
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.324
H-Index - 53
eISSN - 1550-1477
pISSN - 1550-1329
DOI - 10.1177/1550147718806480
Subject(s) - computer science, reliability (semiconductor), random forest, workload, mechanism (biology), scale (ratio), data mining, reliability engineering, machine learning, power (physics), philosophy, physics, epistemology, quantum mechanics, engineering, operating system
Failure prediction for hard disk drives is a common and effective way to improve the reliability of storage systems. In a large-scale data center, drives of various brands and models serve diverse applications with different input/output workload patterns, and the failures of each drive type differ in non-negligible ways, which makes failure prediction very challenging. Although much effort has been devoted to this problem, prediction accuracy still needs to be improved. In this article, we propose a failure prediction method for hard disk drives based on a part-voting random forest, which differentiates the prediction of failures in a coarse-grained manner. We conduct groups of validation experiments on two real-world datasets, which contain the SMART data of 64,193 drives. The experimental results show that our proposed method achieves better prediction accuracy than state-of-the-art methods.
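The abstract does not spell out how part-voting works, so the following is only a rough sketch under a stated assumption: that a drive is flagged as failing when at least a chosen fraction of the trees vote "failure", rather than requiring a simple majority. The trees here are one-feature decision stumps trained on bootstrap samples, and the two feature columns, the toy data, and all function names are hypothetical illustrations, not the authors' implementation.

```python
import random

def train_stump(rows, labels, rng):
    """Fit a one-feature rule 'value >= threshold means failing' on a bootstrap sample."""
    n = len(rows)
    idx = [rng.randrange(n) for _ in range(n)]    # bootstrap sample (with replacement)
    feature = rng.randrange(len(rows[0]))         # random feature for this stump
    best = None
    for i in idx:
        threshold = rows[i][feature]
        # Training errors of the candidate rule on the bootstrap sample.
        errors = sum((rows[j][feature] >= threshold) != labels[j] for j in idx)
        if best is None or errors < best[0]:
            best = (errors, threshold)
    return feature, best[1]

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train an ensemble of stumps; the seed makes the run reproducible."""
    rng = random.Random(seed)
    return [train_stump(rows, labels, rng) for _ in range(n_trees)]

def failure_vote_fraction(forest, row):
    """Fraction of stumps that vote 'failing' for one drive."""
    votes = sum(row[feature] >= threshold for feature, threshold in forest)
    return votes / len(forest)

def predict_failure(forest, row, min_fraction=0.3):
    """Assumed voting rule: flag the drive if enough (not necessarily most) stumps vote failure."""
    return failure_vote_fraction(forest, row) >= min_fraction

# Hypothetical SMART-like features: [reallocated sectors, pending sectors].
healthy = [[1, 2], [2, 1], [3, 2], [1, 3], [2, 2], [3, 1]]
failing = [[40, 9], [55, 12], [38, 8], [60, 10]]
rows = healthy + failing
labels = [False] * len(healthy) + [True] * len(failing)
forest = train_forest(rows, labels)

# A hand-built forest shows how the vote threshold changes the decision:
# only one of three stumps votes failure for the drive [20, 3].
hand_forest = [(0, 10), (0, 50), (1, 5)]
print(predict_failure(hand_forest, [20, 3], min_fraction=0.3))  # True
print(predict_failure(hand_forest, [20, 3], min_fraction=0.5))  # False
```

Lowering `min_fraction` trades more false alarms for fewer missed failures, which is one plausible reason to depart from majority voting when failed drives are rare in the training data.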