High-productivity Framework for Large-scale GPU/CPU Stencil Applications

Takashi Shimokawabe; Takayuki Aoki; Naoyuki Onodera

Open Access

Author(s) - Takashi Shimokawabe, Takayuki Aoki, Naoyuki Onodera

Publication year - 2016

Publication title - Procedia Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.334

H-Index - 76

ISSN - 1877-0509

DOI - 10.1016/j.procs.2016.05.499

Subject(s) - stencil, computer science, cuda, parallel computing, compiler, code (set theory), computation, central processing unit, multi core processor, computational science, operating system, algorithm, programming language, set (abstract data type)

A high-productivity framework for multi-GPU and multi-CPU computation of stencil applications is proposed. The framework is implemented in C++ and CUDA. It automatically translates user-written stencil functions, each of which updates a single grid point, and generates both GPU and CPU code. Programmers write user code in C++ only and can execute the translated code, with optimizations, on either multiple multicore CPUs or multiple GPUs. On multiple GPUs, the user code runs with an auto-tuning mechanism and an overlapping method that hides communication cost behind computation. On multiple CPUs, it runs with OpenMP. A compressible flow code on the GPU that exploits the optimizations provided by the framework runs 2.7 times faster than the non-optimized version.
