Portability with efficiency of the advection of BRAMS between multi‐core and many‐core architectures

Silva Junior Manoel Baptista; Panetta Jairo; Stephany Stephan

Author(s) - Silva Junior Manoel Baptista, Panetta Jairo, Stephany Stephan

Publication year - 2016

Publication title - Concurrency and Computation: Practice and Experience

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.309

H-Index - 67

eISSN - 1532-0634

pISSN - 1532-0626

DOI - 10.1002/cpe.3959

Subject(s) - computer science, software portability, multi core processor, graphics, parallel computing, interface (matter), programming paradigm, grid, posix threads, architecture, parallel programming model, supercomputer, distributed computing, operating system, programming language, thread (computing), art, geometry, mathematics, bubble, maximum bubble pressure method, visual arts

Summary

The continuous growth of spatial resolution and forecasting period in current atmospheric models demands increasing processing power, which is supplied by supercomputers with hundreds or thousands of nodes. Currently, most of these models are executed operationally on supercomputers composed of nodes with tens of cores (multi‐core architecture). Newer supercomputer generations have nodes with multi‐core processors coupled to processing accelerators, typically graphics cards with hundreds of cores (many‐core architecture). Rewriting model codes to use both architectures efficiently, that is, to execute with or without graphics cards, is a challenge because these models have hundreds of thousands of lines. The OpenMP programming interface, proposed decades ago, is a de facto standard that exploits multi‐core architectures efficiently. A newer programming interface, OpenACC, has been proposed for many‐core architectures. The two interfaces are similar because both are based on parallelization directives for the concurrent execution of threads. This work shows the feasibility of writing a single portable code that embeds both interfaces and achieves acceptable efficiency when executed on nodes with either multi‐core or many‐core architecture. The code chosen as a case study is the advection of scalars, a part of the dynamics of the Brazilian Regional Atmospheric Modeling System (BRAMS), a regional atmospheric model. The dynamics of a model is harder to parallelize because of data dependencies between adjacent grid points. Single‐node executions of the advection of scalars for different grid sizes using OpenMP or OpenACC yielded similar speed‐ups, showing the feasibility of the proposed approach.

Copyright © 2016 John Wiley & Sons, Ltd.
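The idea of a single source file carrying both directive sets can be illustrated with a small sketch. The following C function is not taken from BRAMS (which is not written in C) and does not reproduce the authors' advection scheme. It assumes a first-order upwind step on a one-dimensional periodic grid, and the name `advect_step` and its parameters are hypothetical. The standard predefined macros `_OPENACC` and `_OPENMP` select the directive at compile time, so the same loop can be built for an accelerator, for a multi-core node, or serially.

```c
#include <assert.h>

/*
 * One first-order upwind step for dq/dt + u*dq/dx = 0 with u > 0 on a
 * periodic 1-D grid (an illustrative scheme, assumed for this sketch).
 * c is the Courant number u*dt/dx; the scheme is stable for 0 <= c <= 1.
 *
 * Each point reads its left neighbour from the old array q and writes
 * only to q_new, so the neighbour dependency does not become a
 * loop-carried dependency and the iterations can run concurrently.
 */
void advect_step(const double *restrict q, double *restrict q_new,
                 int n, double c)
{
    assert(n >= 2);

#if defined(_OPENACC)
    /* Many-core build: offload the loop and move the arrays explicitly. */
    #pragma acc parallel loop copyin(q[0:n]) copyout(q_new[0:n])
#elif defined(_OPENMP)
    /* Multi-core build: split the iterations among host threads. */
    #pragma omp parallel for
#endif
    for (int i = 0; i < n; ++i) {
        int im1 = (i == 0) ? n - 1 : i - 1;   /* periodic left neighbour */
        q_new[i] = q[i] - c * (q[i] - q[im1]);
    }
}
```

Built without OpenMP or OpenACC support enabled, neither macro is defined and the loop runs serially, which gives a reference result for checking the parallel builds. In a real model the accelerator data movement would normally be hoisted out of the time loop rather than repeated on every step as in this sketch.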
