On the vectorization of finite element codes for high‐performance computers
Author(s) - Zhang H., Schwartz F. W., Sudicky E. A.
Publication year - 1994
Publication title - Water Resources Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.863
H-Index - 217
eISSN - 1944-7973
pISSN - 0043-1397
DOI - 10.1029/94wr02269
Subject(s) - vectorization (mathematics), parallel computing, speedup, computer science, subroutine, node (physics), solver, finite element method, vector processor, code (set theory), supercomputer, computational science, algorithm, physics, set (abstract data type), quantum mechanics, thermodynamics, programming language, operating system
This paper presents strategies for vectorizing finite element codes used to simulate large groundwater flow and transport problems. The approaches take advantage of the vector‐processing capabilities of the Cray Y‐MP by regularizing the node‐element and node‐node relationships. Regularization is achieved by adding auxiliary nodes and elements around the simulation domain. Vectorization of the global matrix assembly is due solely to the regularity of the incidence matrix definition, while vectorization of the iterative solver takes advantage of the regularity of the node‐node relationship and the concept of wavefronts. The vectorization schemes are illustrated using the code VapourT. Rectangular elements are also added to VapourT alongside its original triangular elements. Test runs of the vectorized code with the vector processor turned on (VT4) versus the original code with the vector processor turned off (VT1) for triangular elements indicate overall speedups of 6–10.93 times in terms of CPU seconds. Part of the speedup results from the ability to eliminate some addressing subroutines. However, most of the speedup is due to the vectorization scheme. The speedups due purely to vectorization of the vectorized code are 3.51–5.81 times (from VT3 to VT4) in terms of CPU seconds for triangular elements and 4.28–7.25 times (from VT5 to VT6) for rectangular elements.
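The idea of regularizing the node‐element relationship can be pictured with a small sketch. This is not code from VapourT, and it simplifies the scheme in the abstract: instead of adding auxiliary nodes and elements around the domain, it pads every node's element list to a common width with a single hypothetical auxiliary element that contributes zero. The function names, the two‐triangle mesh, and the per‐element values are all assumptions made for illustration. The point is only that once every node has the same number of incident elements, the long inner loop does identical work for every node, which is the kind of loop a vector processor handles well.

```python
def regularize_incidence(num_nodes, elements):
    """Build a fixed-width node-to-element table.

    Rows shorter than the widest row are padded with the index of one
    auxiliary element (an assumption of this sketch) that contributes zero.
    """
    node_elems = [[] for _ in range(num_nodes)]
    for e, nodes in enumerate(elements):
        for n in nodes:
            node_elems[n].append(e)
    width = max(len(row) for row in node_elems)
    aux = len(elements)  # index reserved for the auxiliary element
    table = [row + [aux] * (width - len(row)) for row in node_elems]
    return table, width, aux


def assemble_by_node(table, width, elem_values):
    """Sum element contributions per node with a branch-free inner loop."""
    values = list(elem_values) + [0.0]  # auxiliary element adds nothing
    total = [0.0] * len(table)
    for k in range(width):             # short outer loop over table columns
        for n in range(len(table)):    # long inner loop, same work per node
            total[n] += values[table[n][k]]
    return total


def assemble_by_element(num_nodes, elements, elem_values):
    """Conventional element-by-element scatter-add, kept for comparison."""
    total = [0.0] * num_nodes
    for nodes, v in zip(elements, elem_values):
        for n in nodes:
            total[n] += v
    return total


# Hypothetical mesh: two triangles sharing the edge between nodes 1 and 2.
elements = [(0, 1, 2), (1, 3, 2)]
elem_values = [0.5, 0.25]  # one illustrative contribution per element

table, width, aux = regularize_incidence(4, elements)
regular = assemble_by_node(table, width, elem_values)
reference = assemble_by_element(4, elements, elem_values)
print(table)
print(regular == reference)
```

In the node‐wise version no two iterations of the inner loop write to the same entry of `total`, so there is no write conflict between nodes, whereas the element‐wise scatter‐add can update the same node from different elements. Padding costs a little redundant arithmetic in exchange for that regularity.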