Algorithmic advances in parallel architectures and energy‐efficient computing
Author(s) - Wyrzykowski Roman, Szymanski Boleslaw K.
Publication year - 2019
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.5260
Subject(s) - computer science , concurrency , programming language
This special issue of Concurrency and Computation: Practice and Experience contains revised and extended versions of selected papers presented at the 12th International Conference on Parallel Processing and Applied Mathematics (PPAM 2017), which was held on September 10-13, 2017, in Lublin, Poland. PPAM is a biennial series of international conferences dedicated to exchanging ideas between researchers involved in parallel and distributed computing, including theory and applications, as well as applied and computational mathematics. The focus of PPAM 2017 was on models, algorithms, and software tools that facilitate efficient and convenient use of modern parallel and distributed computing systems, as well as on large-scale applications, including data-intensive and machine learning problems.
PPAM 2017 was organized by the Department of Computer and Information Science of Czestochowa University of Technology in Czestochowa, Poland, together with Maria Curie-Sklodowska University in Lublin, Poland, under the patronage of the Committee of Informatics of the Polish Academy of Sciences, in cooperation with the ICT COST Action IC1305 "Network for Sustainable Ultrascale Computing (NESUS)". The meeting gathered more than 170 participants from 25 countries.
The accepted papers were presented at the regular tracks of the PPAM 2017 conference, as well as during the workshops. A strict reviewing process, with each submission evaluated by at least three reviewers, resulted in the acceptance of 100 contributed papers for publication in the conference proceedings, while approximately 42% of the submissions were rejected. For the regular tracks, 49 papers were selected from 98 submissions, resulting in an acceptance rate of 50%. Based on the results of the reviews, selected papers were recommended for a special journal issue. Besides quality, another important criterion for selection was each paper's contribution to the thematic consistency of the issue.
The main focus of this special issue is on algorithmic advances in matching software properties to the targeted parallel architecture, including graphics processing unit (GPU) accelerators and clusters. These advances are crucial for successfully parallelizing complex applications such as simulating granular flows, solving nonsingular systems, electronic transport simulations, solving three-dimensional fractional power diffusion problems, dynamic programming, computer graphics, parallel event-driven simulation, and others. A complementary topic of this issue is energy-efficient computing, since energy consumption has become a limiting factor for high-performance computing (HPC) applications in recent years.
The authors of selected papers were contacted after the conference and invited to submit revised and extended versions of their works. These new versions were again reviewed independently by at least three reviewers. Finally, ten contributions were accepted for publication. They are summarized below.
The work of Krestenitis and Weinzierl¹ focuses on simulating granular flows using discrete element method models. This problem is computationally challenging: a bottleneck arises when identifying all particle contact points in each time step. To introduce concurrency to particle comparisons, while keeping their number low, the authors propose a tree-based multilevel metadata structure to manage the particles, as well as a novel scheme for identifying the contact points. Furthermore, a novel adaptivity criterion allows an explicit time stepping technique to work with comparably large time steps. The fusion of the proposed developments yields promising speedups for maximally asynchronous task-based realizations. This work shows that new computer architectures can push the boundary of such many-particle simulations when the right data structures and data processing schemes are chosen.
An efficient algorithm for the parallel robust solution of triangular linear systems is presented in the paper by Mikkelsen et al.² Such systems are central to the solution of general linear systems and the computation of eigenvectors, using either forward or backward substitution. However, there are well-conditioned systems for which substitution fails due to overflow. This paper presents novel algorithms that are blocked and parallel, while dynamically scaling the solution and right-hand side values to avoid overflows. A new task-based parallel robust solver, Kiya, is developed and compared against LAPACK solvers. When there are many complex right-hand sides, Kiya performs significantly better than the robust solver DLATRS and is not significantly slower than the nonrobust solver DTRSM.
The algorithm developed in the work of Spellacy et al.³ extends previous work on the inversion of block tridiagonal matrices from the Hermitian/symmetric case to the general case, with variable sub-block sizes. The presented investigation is motivated by the requirements of atomic and molecular-scale electronic transport simulations, in particular, the SMEAGOL electronic transport code. A parallel divide-and-conquer approach is used to develop a novel algorithm, which is then implemented in Fortran with the Message Passing Interface (MPI). Its benefits in terms of runtime and memory footprint are examined by comparison against inverses obtained using the well-known libraries ScaLAPACK and MUMPS.
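The idea of dynamically scaling the solution and right-hand side during substitution, mentioned above for robust triangular solvers, can be illustrated with a small sequential sketch. It is not the blocked, task-parallel algorithm of Mikkelsen et al., nor the logic of DLATRS. The function name `robust_forward_substitution`, the threshold `OMEGA`, and the specific guards are assumptions made only for this illustration. The sketch solves L x = s b for a lower-triangular L and returns both x and a scale factor s in (0, 1], shrinking s whenever a division or update would otherwise exceed the threshold.

```python
import sys

# Hypothetical overflow threshold (an assumption of this sketch): a quarter of
# the largest finite double, leaving headroom for rounding in the guards.
OMEGA = sys.float_info.max / 4.0


def robust_forward_substitution(L, b, omega=OMEGA):
    """Solve L x = scale * b for lower-triangular L; return (x, scale).

    Assumes L is nonsingular and every |b[i]| <= omega.
    """
    n = len(b)
    x = [float(v) for v in b]  # solved entries plus the remaining right-hand side
    scale = 1.0

    def rescale(factor):
        # Scale the whole vector so that L x = scale * b stays consistent.
        nonlocal scale
        for i in range(n):
            x[i] *= factor
        scale *= factor

    for j in range(n):
        d = abs(L[j][j])
        if d == 0.0:
            raise ValueError("matrix is singular")
        # Guard the division x[j] / L[j][j]: the quotient must not exceed omega.
        if d < 1.0 and abs(x[j]) > d * omega:
            rescale(d * omega / abs(x[j]))
        x[j] /= L[j][j]
        if j == n - 1:
            break

        # Guard the update x[i] -= L[i][j] * x[j] for i > j by keeping both
        # the remaining right-hand side and the product below omega / 2.
        cmax = max(abs(L[i][j]) for i in range(j + 1, n))
        rmax = max(abs(x[i]) for i in range(j + 1, n))
        if rmax > omega / 2.0:
            rescale((omega / 2.0) / rmax)
        limit = (omega / 2.0) / max(cmax, 1.0)
        if abs(x[j]) > limit:
            rescale(limit / abs(x[j]))
        for i in range(j + 1, n):
            x[i] -= L[i][j] * x[j]
    return x, scale


if __name__ == "__main__":
    # Plain substitution would overflow in the first division (1e200 / 1e-200).
    x, s = robust_forward_substitution([[1e-200, 0.0], [1.0, 1.0]], [1e200, 0.0])
    print(x, s)
```

Returning the pair (x, s) rather than x alone is what lets the solver report a representable result when the true solution is not: the caller recovers the direction of the solution and can track the magnitude separately through s.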
