PSkel: A stencil programming framework for CPU‐GPU systems | Zendy

Pereira Alyson D. | Zendy; Ramos Luiz | Zendy; Góes Luís F. W. | Zendy

AI Assistant Blog Pricing

Home ZAIA Blog

Premium

PSkel: A stencil programming framework for CPU‐GPU systems

Author(s) -

Pereira Alyson D.,

Ramos Luiz,

Góes Luís F. W.

Publication year - 2015

Publication title -

concurrency and computation: practice and experience

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.309

H-Index - 67

eISSN - 1532-0634

pISSN - 1532-0626

DOI - 10.1002/cpe.3479

Subject(s) - computer science , cuda , parallel computing , stencil , programmer , general purpose computing on graphics processing units , programming paradigm , central processing unit , graphics , operating system , computational science , programming language

Summary The use of Graphics Processing Units (GPUs) for high‐performance computing has gained growing momentum in recent years. Unfortunately, GPU‐programming platforms like Compute Unified Device Architecture (CUDA) are complex, user unfriendly, and increase the complexity of developing high‐performance parallel applications. In addition, runtime systems that execute those applications often fail to fully utilize the parallelism of modern CPU‐GPU systems. Typically, parallel kernels run entirely on the most powerful device available, leaving other devices idle. These observations sparked research in two directions: (1) high‐level approaches to software development for GPUs, which strike a balance between performance and ease of programming; and (2) task partitioning to fully utilize the available devices. In this paper, we propose a framework, called PSkel, that provides a single high‐level abstraction for stencil programming on heterogeneous CPU‐GPU systems, while allowing the programmer to partition and assign data and computation to both CPU and GPU. Our current implementation uses parallel skeletons to transparently leverage Intel Threading Building Blocks (Intel Corporation, Santa Clara, CA, USA) and NVIDIA CUDA (Nvidia Corporation, Santa Clara, CA, USA). In our experiments, we observed that parallel applications with task partitioning can improve average performance by up to 76% and 28% compared with CPU‐only and GPU‐only parallel applications, respectively. Copyright © 2015 John Wiley & Sons, Ltd.

This content is not available in your region!

Continue researching here.

Having issues? You can contact us here

Accelerating Research