Open Access
The performance of GRAPE-DR for dense matrix operations
Author(s) -
Junichiro Makino,
Hiroshi Daisaka,
Toshiyuki Fukushige,
Yutaka Sugawara,
Mary Inaba,
Kei Hiraki
Publication year - 2011
Publication title - Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2011.04.094
Subject(s) - computer science, matrix (chemical analysis), composite material, materials science
We describe the implementation and performance of dense matrix multiplication and LU decomposition on the GRAPE-DR SIMD accelerator board. A GRAPE-DR card, with 4 GRAPE-DR chips, has a theoretical peak double-precision (DP) performance of 819 Gflops. Each GRAPE-DR chip has 512 processing elements (PEs) and operates at a clock frequency of 400 MHz. Each PE can perform one addition and one multiplication every two clock cycles. The measured performance of matrix multiplication is 730 Gflops for the multiplication of matrices of size 51200 by 2048 and 2048 by 51200. The performance of LU decomposition is 480 Gflops for a problem size of 51200.
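The peak figure quoted in the abstract can be checked from the chip parameters it gives. The following is a minimal sketch and is not taken from the paper: the function names are hypothetical, and the operation counts (2mkn for matrix multiplication, 2n^3/3 for LU decomposition) are the standard conventions, which the authors are only assumed to have used.

```python
# Hardware parameters as quoted in the abstract.
CHIPS_PER_CARD = 4
PES_PER_CHIP = 512
CLOCK_HZ = 400e6
# One addition plus one multiplication every two clock cycles
# works out to one floating-point operation per cycle per PE.
FLOPS_PER_PE_PER_CYCLE = 2 / 2


def peak_gflops():
    """Theoretical peak DP performance of one card, in Gflops."""
    flops = CHIPS_PER_CARD * PES_PER_CHIP * CLOCK_HZ * FLOPS_PER_PE_PER_CYCLE
    return flops / 1e9


def efficiency(measured_gflops):
    """Fraction of the theoretical peak reached by a measured rate."""
    return measured_gflops / peak_gflops()


def matmul_flops(m, k, n):
    """Operation count for an (m x k) by (k x n) product.

    Assumption: the conventional 2*m*k*n count.
    """
    return 2 * m * k * n


def lu_flops(n):
    """Leading-order operation count for LU decomposition of an n x n matrix.

    Assumption: the conventional 2*n^3/3 count.
    """
    return 2 * n**3 / 3


if __name__ == "__main__":
    print(f"peak:              {peak_gflops():.1f} Gflops")
    print(f"matmul efficiency: {efficiency(730):.1%}")
    print(f"LU efficiency:     {efficiency(480):.1%}")
    print(f"matmul operations: {matmul_flops(51200, 2048, 51200):.3e}")
    print(f"LU operations:     {lu_flops(51200):.3e}")
```

Under these assumptions the peak comes out at about 819.2 Gflops, consistent with the quoted 819 Gflops, and the measured rates correspond to roughly 89% of peak for matrix multiplication and roughly 59% for LU decomposition.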
