Efficiency of group implicit concurrent algorithms for transient finite element analysis


Author(s) - Ortiz M., Sotelino E. D., NourOmid B.

Publication year - 1989

Publication title - International Journal for Numerical Methods in Engineering

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.421

H-Index - 168

eISSN - 1097-0207

pISSN - 0029-5981

DOI - 10.1002/nme.1620281204

Subject(s) - computation, hypercube, partition (number theory), finite element method, speedup, parallel computing, reduction (mathematics), group (periodic table), computer science, transient (computer programming), element (criminal law), algorithm, mathematics, combinatorics, physics, geometry, quantum mechanics, operating system, thermodynamics, political science, law

The performance of group implicit algorithms is assessed on actual concurrent computers. We show that, as the number of subdomains is increased, performance enhancements are derived from two sources: the increased parallelism in the computations, and a reduction in equation-solving effort. Moreover, we show that these two performance enhancements are synergistic, in the sense that the corresponding speed-ups are multiplied, rather than merely added. Our numerical simulations demonstrate that, if $n$ is the number of degrees of freedom of the structure, $p$ the number of processors used in the computations, and $s \geq p$ the number of subdomains in the partition, the net speed-up is $O(p\sqrt{s})$ in 2D and $O(ps)$ in 3D, asymptotically as $n/s \to \infty$. In particular, speed-ups with respect to Newmark's method of $O(\sqrt{s})$ in 2D and $O(s)$ in 3D are obtained on a single-processor machine. Finally, simulations on a 32-node hypercube are presented for which the interprocessor communication efficiencies obtained are consistently in excess of 90 per cent.
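The multiplicative scaling stated in the abstract can be illustrated with a small sketch. It is not the authors' code: the function name `asymptotic_speedup` and its parameters are hypothetical, all constant factors are dropped, and the estimate is only meaningful in the asymptotic regime where n/s is large. It simply multiplies a parallelism factor of p by an equation-solving factor of sqrt(s) in 2D or s in 3D, as the abstract describes.

```python
import math


def asymptotic_speedup(p: int, s: int, dim: int) -> float:
    """Leading-order speed-up estimate relative to Newmark's method.

    Hypothetical helper: constants are omitted, so the result only
    indicates how the speed-up scales with p and s as n/s grows large.
    """
    if p < 1 or s < p:
        raise ValueError("the abstract assumes s >= p >= 1")
    if dim == 2:
        solver_gain = math.sqrt(s)  # reduced equation-solving effort in 2D
    elif dim == 3:
        solver_gain = float(s)      # reduced equation-solving effort in 3D
    else:
        raise ValueError("dim must be 2 or 3")
    parallel_gain = float(p)        # gain from concurrent computation
    # The two gains multiply rather than add.
    return parallel_gain * solver_gain


if __name__ == "__main__":
    for dim in (2, 3):
        for p, s in ((1, 16), (4, 16), (32, 64)):
            print(f"{dim}D  p={p:2d}  s={s:2d}  ~{asymptotic_speedup(p, s, dim):.0f}x")
```

Setting p = 1 recovers the single-processor case, where the whole gain comes from the reduced equation-solving effort. The estimate ignores communication cost; the abstract reports interprocessor communication efficiencies above 90 per cent on the 32-node hypercube, which would scale the parallel factor down slightly.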
