Achieving Portable High Performance for Iterative Solvers on Accelerators
Author(s) -
Rupp Karl,
Tillet Philippe,
Jüngel Ansgar,
Grasser Tibor
Publication year - 2014
Publication title -
pamm
Language(s) - English
Resource type - Journals
ISSN - 1617-7061
DOI - 10.1002/pamm.201410462
Subject(s) - computer science , parallel computing , kernel (algebra) , conjugate gradient method , graphics , residual , vendor , implementation , computational science , minification , computer engineering , algorithm , computer graphics (images) , mathematics , programming language , combinatorics , marketing , business
We propose performance enhancements for implementations of the conjugate gradient method and the generalized minimum residual method on accelerators such as graphics processing units. By minimizing memory transfers from global memory via pipelining and by reducing the number of compute kernels through kernel fusion, performance improves by up to a factor of two compared with standard implementations based on vendor‐tuned routines. (© 2014 Wiley‐VCH Verlag GmbH & Co. KGaA, Weinheim)
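The two ideas named in the abstract, pipelining and kernel fusion, can be illustrated with a minimal pure-Python sketch of the conjugate gradient method. It is an illustration under stated assumptions and not the authors' implementation: it uses the Chronopoulos-Gear rearrangement of CG as a stand-in for the pipelined formulation, and the names `fused_vector_update`, `fused_matvec_dots` and `cg_fused` are hypothetical. Each function body is a single loop over the data, which mimics what one fused compute kernel would do on an accelerator.

```python
import math

def fused_vector_update(x, r, p, s, w, alpha, beta):
    """One pass over the vectors: p, s, x and r are updated together,
    as a single fused kernel would do for each element."""
    for i in range(len(x)):
        p[i] = r[i] + beta * p[i]
        s[i] = w[i] + beta * s[i]   # s tracks A*p without a second mat-vec
        x[i] += alpha * p[i]
        r[i] -= alpha * s[i]

def fused_matvec_dots(A, r, w):
    """One pass computing w = A*r together with (r, r) and (w, r)."""
    gamma = delta = 0.0
    for i, row in enumerate(A):
        wi = sum(a * rj for a, rj in zip(row, r))
        w[i] = wi
        gamma += r[i] * r[i]
        delta += wi * r[i]
    return gamma, delta

def cg_fused(A, b, tol=1e-10, max_iter=1000):
    """Chronopoulos-Gear CG for a symmetric positive definite matrix A
    given as a dense list of rows (an assumption made for brevity)."""
    n = len(b)
    x = [0.0] * n
    r = [float(v) for v in b]       # r = b - A*x with x = 0
    p = [0.0] * n
    s = [0.0] * n
    w = [0.0] * n
    b_norm = math.sqrt(sum(v * v for v in b))

    gamma, delta = fused_matvec_dots(A, r, w)
    if math.sqrt(gamma) <= tol * b_norm:
        return x, 0
    alpha, beta = gamma / delta, 0.0

    for it in range(1, max_iter + 1):
        fused_vector_update(x, r, p, s, w, alpha, beta)
        gamma_new, delta = fused_matvec_dots(A, r, w)
        if math.sqrt(gamma_new) <= tol * b_norm:
            return x, it
        # Scalar recurrences replace the separate inner product (p, A*p).
        beta = gamma_new / gamma
        alpha = gamma_new / (delta - beta * gamma_new / alpha)
        gamma = gamma_new
    return x, max_iter

# Usage: 1D Laplacian test system (chosen here purely for illustration).
n = 20
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
      for j in range(n)] for i in range(n)]
x, iters = cg_fused(A, [1.0] * n)
```

In this arrangement every iteration consists of two passes over the data instead of a sequence of separate vector updates, inner products and a matrix-vector product, each of which would otherwise read its operands from global memory again. On a GPU each of the two functions would plausibly map to one kernel launch, although the reduction for the inner products needs extra care there that this sketch does not show.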