Correcting soft errors online in fast Fourier transform
Author(s) - Xin Liang, Jieyang Chen, Dingwen Tao, Sihuan Li, Panruo Wu, Hongbo Li, Kaiming Ouyang, Yuanlai Liu, Fengguang Song, Zizhong Chen
Publication year - 2017
Publication title - Purdue University Indianapolis (Indiana University)
Language(s) - English
Resource type - Conference proceedings
ISBN - 978-1-4503-5114-0
DOI - 10.1145/3126908.3126915
Subject(s) - fast fourier transform , computer science , overhead (engineering) , computation , scheme (mathematics) , fault tolerance , soft error , computer engineering , implementation , fault (geology) , parallel computing , algorithm , distributed computing , electronic engineering , mathematics , engineering , programming language , geology , mathematical analysis , seismology , operating system
While many algorithm-based fault tolerance (ABFT) schemes have been proposed to detect soft errors offline in the fast Fourier transform (FFT) after computation finishes, none of the existing ABFT schemes detect soft errors online before the computation finishes. This paper presents an online ABFT scheme for FFT so that soft errors can be detected online and the corrupted computation can be terminated in a much more timely manner. We also extend our scheme to tolerate both arithmetic errors and memory errors, develop strategies to reduce its fault tolerance overhead and improve its numerical stability and fault coverage, and finally incorporate it into the widely used FFTW library, one of today's fastest FFT software implementations. Experimental results demonstrate that: (1) the proposed online ABFT scheme introduces much lower overhead than the existing offline ABFT schemes; (2) it detects errors in a much more timely manner; and (3) it also has higher numerical stability and better fault coverage.
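The difference between offline and online detection can be illustrated with a small sketch. This is not the scheme from the paper, whose details the abstract does not give. It assumes a textbook recursive radix-2 FFT (input length a power of two) and uses the simplest DFT checksum identity, sum_k X[k] = N * x[0], to verify every sub-transform as soon as it finishes, so a corrupted run can stop before the full transform completes. The names `fft_checked`, `checksum_ok`, `SoftErrorDetected`, and the `fault` injection hook are hypothetical and exist only for illustration.

```python
import cmath


class SoftErrorDetected(Exception):
    """Raised when a sub-transform fails its checksum test (hypothetical name)."""


def checksum_ok(x, X, tol=1e-9):
    # DFT identity: the sum of all outputs equals N times the first input.
    n = len(x)
    scale = max(1.0, sum(abs(v) for v in x))  # scale tolerance to input size
    return abs(sum(X) - n * x[0]) <= tol * n * scale


def fft_checked(x, fault=None, depth=0):
    """Recursive radix-2 FFT that verifies each sub-transform on completion.

    `fault` is an optional, purely illustrative hook called as
    fault(depth, outputs) so that a test can corrupt intermediate results.
    Assumes len(x) is a power of two.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_checked(x[0::2], fault, depth + 1)
    odd = fft_checked(x[1::2], fault, depth + 1)
    X = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        X[k] = even[k] + t
        X[k + n // 2] = even[k] - t
    if fault is not None:
        fault(depth, X)  # simulated soft error, if any
    if not checksum_ok(x, X):
        # Detected "online": the enclosing, larger transforms never run.
        raise SoftErrorDetected(
            f"checksum mismatch in sub-FFT of size {n} at depth {depth}"
        )
    return X
```

An offline variant would apply `checksum_ok` only once, to the final output. Checking each sub-transform instead lets the computation terminate at the first corrupted stage. This single unweighted checksum only catches errors that change the sum of a sub-transform's outputs; practical ABFT schemes use weighted checksums chosen for better coverage and numerical stability.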