Open Access
The Case for Error-Bounded Lossy Floating-Point Data Compression on Interconnection Networks
Author(s) - Yao Hu, Michihiro Koibuchi
Publication year - 2021
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5121/csit.2021.110706
Subject(s) - lossy compression, computer science, lossless compression, compression ratio, interconnection, fast fourier transform, data compression, parallel computing, infiniband, data compression ratio, algorithm, image compression, computer network, artificial intelligence, image processing, automotive engineering, engineering, image (mathematics), internal combustion engine
Data compression virtually increases the effective network bandwidth of an interconnection network in parallel computers. Although floating-point datasets are frequently exchanged between compute nodes in parallel applications, simple lossless compression algorithms often achieve only a low compression ratio on them. In this study, we introduce a lossy compression algorithm for floating-point values on interconnection networks. We adopt application-level compression to provide high portability: in an MPI parallel program, the source process compresses the communication dataset and the destination process decompresses it. Since recent interconnection networks are latency-sensitive, sophisticated lossy compression techniques that introduce a large compression overhead are not suitable for compressing communication data. In this context, we apply a linear predictor with a user-defined error bound to the compression of communication datasets. We design, implement, and evaluate the compression technique for the floating-point communication datasets generated in MPI parallel programs, namely Ping Pong, Himeno, K-means Clustering, and Fast Fourier Transform (FFT). With an error bound of 10^-4, the proposed compression technique achieves compression ratios of 2.4x, 6.6x, 4.3x and 2.7x for Ping Pong, Himeno, K-means and FFT, at the cost of a moderate decrease in the quality of results, and thereby achieves 2.1x, 1.7x, 2.0x and 2.4x speedups in execution time, respectively. More generally, our cycle-accurate network simulation shows that a high compression ratio provides comparably low communication latency and significantly improves effective network throughput on typical synthetic traffic patterns, compared with no data compression on a conventional interconnection network.
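
To make the idea of a linear predictor with a user-defined error bound concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a two-point linear extrapolation from previously reconstructed values, a quantization bin of twice the error bound, and a fallback that stores a value verbatim when the prediction misses by too much. The names `compress`, `decompress`, `_predict`, and `max_code` are hypothetical and do not come from the paper.

```python
import math


def _predict(recon):
    """Linear extrapolation from the last two reconstructed values (assumed predictor)."""
    if len(recon) >= 2:
        return 2.0 * recon[-1] - recon[-2]
    if len(recon) == 1:
        return recon[-1]
    return 0.0


def compress(values, error_bound, max_code=127):
    """Encode floats as small integer codes where the prediction is close enough.

    Returns a list of tokens: ("q", code) for a quantized residual, or
    ("raw", value) when the value cannot be coded within the error bound.
    error_bound must be positive.
    """
    tokens = []
    recon = []  # values exactly as the decompressor will rebuild them
    for x in values:
        pred = _predict(recon)
        residual = x - pred
        if math.isfinite(residual):
            code = round(residual / (2.0 * error_bound))
            approx = pred + code * 2.0 * error_bound
            # Accept the code only if it is small and the bound really holds.
            if abs(code) <= max_code and abs(approx - x) <= error_bound:
                tokens.append(("q", code))
                recon.append(approx)
                continue
        tokens.append(("raw", x))
        recon.append(x)
    return tokens


def decompress(tokens, error_bound):
    """Rebuild the values by repeating the same predictions as the compressor."""
    recon = []
    for kind, payload in tokens:
        if kind == "q":
            recon.append(_predict(recon) + payload * 2.0 * error_bound)
        else:
            recon.append(payload)
    return recon


if __name__ == "__main__":
    data = [math.sin(0.01 * i) for i in range(1000)]
    eb = 1e-4
    restored = decompress(compress(data, eb), eb)
    print(max(abs(a - b) for a, b in zip(data, restored)) <= eb)
```

In this sketch the predictor runs on reconstructed values rather than on the original ones, so the compressor and decompressor stay in step and the per-value error does not accumulate. A real implementation would additionally pack the small integer codes into a compact byte stream before handing the buffer to the MPI send call; that step is omitted here.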
