Generating optimal CUDA sparse matrix–vector product implementations for evolving GPU hardware
Author(s) - El Zein, Ahmed H.; Rendell, Alistair P.
Publication year - 2012
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.1732
Subject(s) - cuda , computer science , programmer , implementation , parallel computing , graphics , sparse matrix , process (computing) , general purpose computing on graphics processing units , computer architecture , computer graphics (images) , embedded system , programming language , physics , quantum mechanics , gaussian
SUMMARY The CUDA model for graphics processing units (GPUs) presents the programmer with a plethora of different programming options. These include different memory types, different memory access methods and different data types. Identifying which options to use and when is a non‐trivial exercise. This paper explores the effect of these different options on the performance of a routine that evaluates sparse matrix–vector products (SpMV) across three different generations of NVIDIA GPU hardware. A process for analysing performance and selecting the subset of implementations that perform best is proposed. The potential for mapping sparse matrix attributes to optimal CUDA SpMV implementations is discussed. Copyright © 2011 John Wiley & Sons, Ltd.
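To make the design space concrete, the sketch below shows a minimal CSR-format SpMV kernel in CUDA with one thread per row, using single precision and plain global-memory accesses. It is an illustrative baseline only, not the paper's implementation; the paper's variants differ precisely in the choices this sketch fixes (memory type, access method, data type), and all names here are hypothetical.

```cuda
// Minimal scalar CSR SpMV sketch: one thread computes one row of y = A*x.
// Assumptions (not from the paper): CSR storage, float data, global memory only.
__global__ void spmv_csr_scalar(int num_rows,
                                const int *row_ptr,   // row start offsets, length num_rows+1
                                const int *col_idx,   // column index of each nonzero
                                const float *values,  // nonzero values
                                const float *x,       // dense input vector
                                float *y)             // dense output vector
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < num_rows) {
        float sum = 0.0f;
        // Dot product of one sparse row with the dense vector x.
        for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j)
            sum += values[j] * x[col_idx[j]];
        y[row] = sum;
    }
}

// Example launch: spmv_csr_scalar<<<(num_rows + 255) / 256, 256>>>(...);
```

Alternatives explored in work of this kind include caching x through texture or constant memory, assigning a warp rather than a single thread to each row for coalesced access, and switching between single and double precision; each choice interacts differently with successive GPU generations.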