Contention-free Routing for Shift-based Communication in MPI Applications on Large-scale Infiniband Clusters
Author(s) - Adam Moody
Publication year - 2009
Publication title - OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information)
Language(s) - English
Resource type - Reports
DOI - 10.2172/967277
Subject(s) - computer science, node (physics), message passing, computer network, infiniband, set (abstract data type), asynchronous communication, distributed computing, parallel computing, structural engineering, engineering, programming language
Shift-based communication can be defined as follows. For a set of N nodes assigned to a job, assign each node an ID from 0 to N - 1. Shift-based communication involves N - 1 steps. Let D denote the current step, and let D iterate from 1 to N - 1. In step D, the node with ID = I sends to the node with ID = (I + D)%N and receives from the node with ID = (I - D + N)%N, where % denotes the modulo operation. Figure 1 illustrates the communication patterns for various steps in shift-based communication. Because every node has exactly one send partner and one receive partner in each step, shift-based communication patterns enable all nodes in a job to send and receive data simultaneously in all steps. Many MPI operations employ shift-based communication patterns for this reason. These include large message collective algorithms, such as those typically used to implement large message Allgather and Alltoall collectives. They also include small message collective algorithms, such as pair-wise exchange, barrier dissemination, and Bruck's index algorithm. Although the small message algorithms typically pack messages such that each node does not send to or receive from every other node directly, the communication patterns they do execute correspond to particular steps in shift-based communication. Common point-to-point message patterns, such as nearest-neighbor exchanges in domain decomposition codes, can also benefit from efficient shift-based routing. Supporting efficient shift-based communication within a job thus provides good performance for a number of common MPI operations.
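The send and receive rule defined above can be made concrete with a short sketch. The following Python is illustrative only and is not taken from the report: the function names `shift_partners` and `shift_schedule` are hypothetical, and the code only enumerates partners per step; it performs no MPI communication or routing.

```python
def shift_partners(node_id, step, num_nodes):
    """Return (send_to, recv_from) for one node in one shift step.

    Follows the definition in the abstract: node I sends to (I + D) % N
    and receives from (I - D + N) % N.
    """
    send_to = (node_id + step) % num_nodes
    recv_from = (node_id - step + num_nodes) % num_nodes
    return send_to, recv_from


def shift_schedule(num_nodes):
    """Map each step D in 1..N-1 to a list of (send_to, recv_from) pairs.

    The list for each step is indexed by node ID.
    """
    return {
        step: [shift_partners(i, step, num_nodes) for i in range(num_nodes)]
        for step in range(1, num_nodes)
    }


if __name__ == "__main__":
    # Example with a hypothetical job of N = 4 nodes.
    for step, partners in shift_schedule(4).items():
        print(f"step {step}: {partners}")
```

In every step the send targets form a permutation of the node IDs, which is why all nodes can send and receive at the same time; whether those transfers also avoid contention on network links depends on the routing, which this sketch does not model.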