Hiding Global Communication Latency in the GMRES Algorithm on Massively Parallel Machines
Author(s) -
Pieter Ghysels,
Thomas J. Ashby,
Karl Meerbergen,
Wim Vanroose
Publication year - 2013
Publication title -
SIAM Journal on Scientific Computing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.674
H-Index - 147
eISSN - 1095-7197
pISSN - 1064-8275
DOI - 10.1137/12086563x
Subject(s) - generalized minimal residual method, computer science, massively parallel, parallel computing, asynchronous communication, orthogonalization, bottleneck, flops, scalability, algorithm, residual, telecommunications, database, embedded system
In the generalized minimal residual method (GMRES), the global all-to-all communication required in each iteration for orthogonalization and normalization of the Krylov base vectors is becoming a performance bottleneck on massively parallel machines. Long latencies, system noise, and load imbalance cause these global reductions to become very costly global synchronizations. In this work, we propose the use of nonblocking or asynchronous global reductions to hide these global communication latencies by overlapping them with other communications and calculations. A pipelined variation of GMRES is presented in which the result of a global reduction is used only one or more iterations after the communication phase has started. This way, global synchronization is relaxed and scalability is much improved at the expense of some extra computations. The numerical instabilities that inevitably arise due to the typical monomial basis by powering the matrix are reduced and often annihilated by using Newton or Chebysh...
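
To make the communication pattern described in the abstract concrete, the following is a minimal serial sketch of the standard Arnoldi process with classical Gram-Schmidt, the orthogonalization step at the core of GMRES. It is not the pipelined algorithm proposed in the paper. The function names (`dot`, `matvec`, `arnoldi_cgs`) and the `reductions` counter are illustrative assumptions; the counter only marks the points where a distributed implementation would have to perform a global reduction, which are the synchronizations a pipelined variant aims to overlap with other work.

```python
import math


def dot(x, y):
    # On a distributed machine each process would sum its local part,
    # and a global reduction would then combine the partial sums.
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    # Dense matrix-vector product; A is a list of rows.
    return [dot(row, x) for row in A]


def arnoldi_cgs(A, v0, m):
    """Run m Arnoldi steps with classical Gram-Schmidt.

    Returns (V, H, reductions). V holds the orthonormal basis vectors,
    H is the (m+1) x m Hessenberg matrix, and `reductions` counts the
    places where a parallel code would need a global reduction
    (a hypothetical counter, added for illustration only).
    """
    beta = math.sqrt(dot(v0, v0))
    reductions = 1  # norm of the starting vector
    V = [[x / beta for x in v0]]
    H = [[0.0] * m for _ in range(m + 1)]
    for j in range(m):
        w = matvec(A, V[j])
        # Orthogonalization: all j+1 dot products use the same w,
        # so they can be batched into a single global reduction.
        h = [dot(V[i], w) for i in range(j + 1)]
        reductions += 1
        for i in range(j + 1):
            H[i][j] = h[i]
            w = [wk - h[i] * vk for wk, vk in zip(w, V[i])]
        # Normalization: a second global reduction for the norm.
        norm_w = math.sqrt(dot(w, w))
        reductions += 1
        H[j + 1][j] = norm_w
        if norm_w == 0.0:
            break  # breakdown: the Krylov subspace is invariant
        V.append([x / norm_w for x in w])
    return V, H, reductions
```

In this sketch every iteration waits on two reductions before the next matrix-vector product can start. The approach described in the abstract relaxes that dependency by using the result of a reduction only one or more iterations after it was started.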
