Minimizing development and maintenance costs in supporting persistently optimized BLAS
Author(s) - Whaley R. Clint, Petitet Antoine
Publication year - 2005
Publication title - Software: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/spe.626
Subject(s) - computer science, implementation, parallel computing, linear algebra, kernel (algebra), set (abstract data type), variety (cybernetics), computer architecture, computational science, theoretical computer science, programming language, artificial intelligence, mathematics, geometry, combinatorics
Abstract - The Basic Linear Algebra Subprograms (BLAS) define one of the most heavily used performance‐critical APIs in scientific computing today. It has long been understood that the most important of these routines, the dense Level 3 BLAS, may be written efficiently given a highly optimized general matrix multiply routine. In this paper, however, we show that an even larger set of operations can be efficiently maintained using a much simpler matrix multiply kernel. Indeed, this is how our own project, ATLAS (which provides one of the most widely used BLAS implementations in use today), supports a large variety of performance‐critical routines. Copyright © 2004 John Wiley & Sons, Ltd.
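The abstract's starting point, that a Level 3 BLAS routine can be built on top of a matrix multiply kernel, can be illustrated with a minimal sketch. The code below is not taken from ATLAS or from the paper: the names `gemm`, `block`, and `symm_lower` and the block size `nb` are hypothetical, and the plain Python loops stand in for what would be a tuned kernel. It computes a symmetric matrix product (the SYMM operation) while reading only the lower triangle of A, delegating all arithmetic to one simple multiply-accumulate kernel.

```python
def gemm(A, B, C, trans_a=False):
    """Hypothetical kernel: C += op(A) @ B, where op(A) is A or its transpose."""
    m = len(C)
    n = len(C[0]) if m else 0
    k = len(B)
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                a = A[p][i] if trans_a else A[i][p]
                acc += a * B[p][j]
            C[i][j] += acc


def block(M, r0, r1, c0, c1):
    """Copy the sub-matrix M[r0:r1, c0:c1] into a new list of lists."""
    return [row[c0:c1] for row in M[r0:r1]]


def symm_lower(A, B, nb=2):
    """Return A @ B for symmetric A, reading only A's lower triangle.

    The work is split into nb-by-nb blocks of A, and every block
    product is handed to the single gemm kernel above.
    """
    n = len(A)
    ncols = len(B[0])
    C = [[0.0] * ncols for _ in range(n)]
    for i0 in range(0, n, nb):
        i1 = min(i0 + nb, n)
        Ci = [[0.0] * ncols for _ in range(i1 - i0)]
        for j0 in range(0, n, nb):
            j1 = min(j0 + nb, n)
            Bj = block(B, j0, j1, 0, ncols)
            if j0 < i0:
                # Block lies below the diagonal: it is stored, use it directly.
                gemm(block(A, i0, i1, j0, j1), Bj, Ci)
            elif j0 > i0:
                # Block lies above the diagonal: use the transpose of its
                # stored mirror image A[j0:j1, i0:i1].
                gemm(block(A, j0, j1, i0, i1), Bj, Ci, trans_a=True)
            else:
                # Diagonal block: rebuild the full symmetric block from
                # its lower triangle before multiplying.
                D = block(A, i0, i1, i0, i1)
                size = len(D)
                full = [[D[r][c] if c <= r else D[c][r] for c in range(size)]
                        for r in range(size)]
                gemm(full, Bj, Ci)
        C[i0:i1] = Ci
    return C
```

In this sketch only `gemm` would need architecture-specific tuning; `symm_lower` is bookkeeping around it, which is the kind of maintenance saving the abstract refers to. A usage check could fill the upper triangle of A with NaN to confirm it is never read.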