z-logo
Premium
Runtime failure rate targeting for energy‐efficient reliability in chip microprocessors
Author(s) -
Miller Timothy,
Surapaneni Nagarjuna,
Teodorescu Radu
Publication year - 2012
Publication title -
concurrency and computation: practice and experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.2898
Subject(s) - computer science , embedded system , energy consumption , microprocessor , redundancy (engineering) , multi core processor , microarchitecture , chip , bottleneck , reliability engineering , parallel computing , engineering , telecommunications , electrical engineering , operating system
SUMMARY Technology scaling is having an increasingly detrimental effect on microprocessor reliability, with increased variability and higher susceptibility to errors. At the same time, as integration of chip multiprocessors increases, power consumption is becoming a significant bottleneck. To ensure continued performance, growth of microprocessors requires development of powerful and energy‐efficient solutions to reliability challenges. This paper presents a reliable multicore architecture that provides targeted error protection by adapting to the characteristics of individual cores and workloads, with the goal of providing reliability with minimum energy. The user can specify an acceptable reliability target for each chip, core, or application. The system then adjusts a range of parameters, including replication and supply voltage, to meet that reliability goal. In this multicore architecture, each core consists of a pair of pipelines that can run independently (running separate threads) or in concert (running the same thread and verifying results). Redundancy is enabled selectively, at functional unit granularity. The architecture also employs timing speculation for mitigation of variation‐induced timing errors and to reduce the power overhead of error protection. On‐line control based on machine learning dynamically adjusts multiple parameters to minimize energy consumption. Evaluation shows that dynamic adaptation of voltage and redundancy can reduce the energy delay product of a chip multiprocessor by 30 − 60% compared with static dual modular redundancy. Copyright © 2012 John Wiley & Sons, Ltd.

This content is not available in your region!

Continue researching here.

Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom