Experiences Using CPUs and GPUs for Cooperative Computation in a Multi-Physics Simulation
Author(s) -
Olga Pearce
Publication year - 2018
Publication title -
OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/3229710.3229711
Subject(s) - porting , computer science , software portability , parallel computing , flops , supercomputer , scalability , symmetric multiprocessor system , multiphysics , computation , instruction set , node (physics) , cuda , multi core processor , operating system , software , programming language , physics , structural engineering , finite element method , engineering , thermodynamics
Top supercomputers in the TOP500 list have transitioned from homogeneous node architectures toward heterogeneous manycore nodes that pair CPUs with accelerators. These new architectures present significant challenges to developers of large-scale multiphysics applications, especially at DOE laboratories that have invested heavily in scalable MPI codes over decades. Much of the effort to port scientific applications to the new heterogeneous architectures is focused on running the computation on the accelerators, which usually provide more than 90% of the FLOPS of the system. We describe an approach to utilizing the remaining FLOPS on a heterogeneous machine by running a portion of the computation on the CPUs cooperatively with the GPU computation. We present a proof-of-concept implementation in ARES, a multiphysics ALE-AMR code at LLNL. ARES uses a portability layer, RAJA, which enables us to use the same source code for both the CPU and the GPU. We develop an approach to use both types of processors cooperatively in a mixed-processor system. Our implementation divides the work between the computing resources via domain decomposition and uses all cores of the CPU and all of the GPUs on the node for computation. Load balancing is necessary to use the heterogeneous resources effectively. We present preliminary results on early-delivery pre-Sierra machines at LLNL, showing up to an 18% performance benefit from computing on the CPUs of the heterogeneous nodes in addition to the GPUs.
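The abstract states that work is divided by domain decomposition and that load balancing is needed across unequal processors. As a rough illustration only, the sketch below splits a number of mesh zones among workers in proportion to an assumed relative throughput, so that each worker would finish at about the same time. The function name `split_zones`, the node layout (4 GPUs and 16 CPU cores), and the 20:1 throughput ratio are all hypothetical and do not come from the paper; ARES's actual decomposition and balancing scheme is not described in this record.

```python
def split_zones(total_zones, rates):
    """Split total_zones among workers in proportion to their rates.

    rates[i] is an assumed relative throughput (zones per unit time)
    of worker i. Returns integer zone counts that sum to total_zones.
    Hypothetical helper for illustration, not the ARES method.
    """
    if total_zones < 0 or not rates or any(r <= 0 for r in rates):
        raise ValueError("need total_zones >= 0 and positive rates")

    total_rate = sum(rates)
    # Ideal (fractional) share for each worker.
    ideal = [total_zones * r / total_rate for r in rates]
    # Round down, then hand out the leftover zones one at a time
    # to the workers with the largest fractional remainder.
    counts = [int(x) for x in ideal]
    leftover = total_zones - sum(counts)
    order = sorted(range(len(rates)),
                   key=lambda i: ideal[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# Assumed node: 4 GPU workers, each 20x faster than one of 16 CPU-core workers.
rates = [20.0] * 4 + [1.0] * 16
counts = split_zones(960_000, rates)
gpu_zones, cpu_zones = counts[:4], counts[4:]
```

With these assumed rates, each GPU worker receives 200,000 zones and each CPU-core worker 10,000, so every worker has the same ratio of zones to throughput. In practice the rates would have to be measured, and they may vary by physics package and problem size.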