Exploiting the capabilities of modern GPUs for dense matrix computations
Author(s) -
Barrachina Sergio,
Castillo Maribel,
Igual Francisco D.,
Mayo Rafael,
Quintana-Ortí Enrique S.,
Quintana-Ortí Gregorio
Publication year - 2009
Publication title -
Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.1472
Subject(s) - computer science , double precision floating point format , parallel computing , computation , padding , graphics , cuda , single precision floating point format , computational science , linear algebra , linear system , general purpose computing on graphics processing units , arbitrary precision arithmetic , algorithm , computer graphics (images) , mathematics , mathematical analysis , computer security , geometry
We present several algorithms to compute the solution of a linear system of equations on a graphics processor (GPU), as well as general techniques to improve their performance, such as padding and hybrid GPU‐CPU computation. We compare single and double precision performance of a modern GPU with unified architecture, and show how iterative refinement with mixed precision can be used to regain full accuracy in the solution of linear systems, exploiting the potential of the processor for single precision arithmetic. Experimental results on a GTX280 using CUBLAS 2.0, the implementation of BLAS for NVIDIA® GPUs with unified architecture, illustrate the performance of the different algorithms and techniques proposed. Copyright © 2009 John Wiley & Sons, Ltd.
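
The mixed precision iterative refinement mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' GPU implementation: it is a plain Python illustration that assumes single precision can be emulated by rounding every intermediate value to float32, and all function names (`f32`, `lu_factor_f32`, `lu_solve_f32`, `residual`, `refine`) are hypothetical. The expensive factorization and the triangular solves run in (emulated) single precision, while the residual and the solution update are kept in double precision.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(A):
    """LU factorization with partial pivoting; every result is rounded to float32."""
    n = len(A)
    LU = [[f32(v) for v in row] for row in A]
    piv = list(range(n))  # piv[i] = original index of the row now at position i
    for k in range(n):
        # Partial pivoting: choose the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if LU[p][k] == 0.0:
            raise ValueError("matrix is singular in single precision")
        LU[k], LU[p] = LU[p], LU[k]
        piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            LU[i][k] = f32(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = f32(LU[i][j] - f32(LU[i][k] * LU[k][j]))
    return LU, piv


def lu_solve_f32(LU, piv, b):
    """Forward and back substitution with the float32 factors."""
    n = len(LU)
    y = [f32(b[piv[i]]) for i in range(n)]
    for i in range(n):  # unit lower triangular solve
        for j in range(i):
            y[i] = f32(y[i] - f32(LU[i][j] * y[j]))
    for i in reversed(range(n)):  # upper triangular solve
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(LU[i][j] * y[j]))
        y[i] = f32(y[i] / LU[i][i])
    return y


def residual(A, x, b):
    """r = b - A x, computed in double precision."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x)) for row, bi in zip(A, b)]


def refine(A, b, max_iter=20, tol=1e-14):
    """Solve A x = b: factor once in float32, then refine in double precision."""
    LU, piv = lu_factor_f32(A)
    x = lu_solve_f32(LU, piv, b)  # initial single precision solution
    for _ in range(max_iter):
        r = residual(A, x, b)
        if max(abs(v) for v in r) <= tol * max(abs(v) for v in b):
            break
        d = lu_solve_f32(LU, piv, r)  # correction from the cheap factors
        x = [xi + di for xi, di in zip(x, d)]  # update in double precision
    return x
```

The factorization is computed only once; each refinement step reuses the single precision factors and adds just a double precision residual and update. The scheme can be expected to converge only when the matrix is not too ill-conditioned relative to single precision, and the stopping tolerance used here is an arbitrary choice for the sketch.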
