Performance and Scalability Study of FMM Kernels on Novel Multi- and Many-core Architectures
Author(s) - Antón Rey, Francisco D. Igual, Manuel Prieto, Jan F. Prins
Publication year - 2017
Publication title - Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2017.05.128
Subject(s) - computer science , xeon phi , scalability , parallel computing , kernel (algebra) , xeon , fast multipole method , multi core processor , computer architecture , granularity , many core , operating system , multipole expansion , physics , mathematics , combinatorics , quantum mechanics
We provide efficient implementations of common Fast Multipole Method (FMM) tasks for modern multi-core (Intel Xeon Haswell), many-core (Intel Xeon Phi Knights Landing), and GPU (Nvidia Pascal) architectures. We offer optimization guidelines for each kernel and architecture, and expose task granularity issues through performance and scalability evaluations. These results motivate the use of hybrid execution models for FMM on heterogeneous architectures, in which the execution configuration of each kernel is set according to how well that kernel adapts to the processor.