Understanding memory effects in the automated generation of optimized matrix algebra kernels
Author(s) -
Elizabeth R. Jessup,
Ian Karlin,
Erik Silkensen,
Geoffrey Belter,
Jeremy G. Siek
Publication year - 2010
Publication title -
Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2010.04.209
Subject(s) - computer science, compiler, loop fusion, modular design, parallel computing, set (abstract data type), matrix (chemical analysis), linear algebra, loop (graph theory), memory model, theoretical computer science, programming language, computer architecture, shared memory, materials science, geometry, mathematics, combinatorics, composite material
Efficient implementation of matrix algebra is important to the performance of many large and complex physical models. One important tuning technique is loop fusion, which can reduce the amount of data moved between memory and the processor. We have developed the Build to Order (BTO) compiler to automate loop fusion for matrix algebra kernels. In this paper, we present BTO’s analytic memory model, which substantially reduces the number of loop fusion options considered by the compiler. We introduce an example that motivates the inclusion of registers in the model. We demonstrate how the model’s modular design facilitates the addition of register allocation to the model’s set of memory components, improving its accuracy.
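As a rough illustration of loop fusion (not code produced by BTO, and not taken from the paper), the sketch below computes y = Ax and z = Aᵀw in two ways. The function names `unfused` and `fused` and the choice of kernel are assumptions made for this example. The unfused version traverses the matrix A twice, while the fused version reads each element of A once and uses it for both results, which is the kind of reduction in memory traffic the abstract describes.

```python
def unfused(A, x, w):
    """Compute y = A x and z = A^T w with two separate loop nests.

    The matrix A is traversed twice, once per result vector.
    """
    m, n = len(A), len(A[0])
    y = [0.0] * m
    for i in range(m):
        for j in range(n):
            y[i] += A[i][j] * x[j]
    z = [0.0] * n
    for i in range(m):
        for j in range(n):
            z[j] += A[i][j] * w[i]
    return y, z


def fused(A, x, w):
    """Compute the same y and z with a single fused loop nest.

    Each element of A is loaded once and reused for both updates.
    """
    m, n = len(A), len(A[0])
    y = [0.0] * m
    z = [0.0] * n
    for i in range(m):
        for j in range(n):
            a = A[i][j]          # single read of A[i][j]
            y[i] += a * x[j]     # contribution to y = A x
            z[j] += a * w[i]     # contribution to z = A^T w
    return y, z


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0]]
    x = [1.0, 1.0]
    w = [1.0, 2.0]
    print(unfused(A, x, w))
    print(fused(A, x, w))
```

Both functions return the same vectors; only the number of passes over A differs. In interpreted Python the effect on run time is not representative of compiled kernels, so the sketch is meant to show the transformation, not its performance.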