
A Survey on System-Level Design of Neural Network Accelerators
Author(s) - Kazuto Seto
Publication year - 2021
Publication title - JICS. Journal of Integrated Circuits and Systems
Language(s) - English
Resource type - Journals
eISSN - 1872-0234
pISSN - 1807-1953
DOI - 10.29292/jics.v16i2.505
Subject(s) - loop unrolling , computer science , convolutional neural network , nested loop join , inference , parallel computing , computation , loop (graph theory) , computer architecture , computer engineering , artificial intelligence , algorithm , programming language , compiler , mathematics , combinatorics
In this paper, we present a brief survey of the system-level optimizations used in convolutional neural network (CNN) inference accelerators. For the nested loop of convolutional (CONV) layers, we discuss the effects of loop optimizations such as loop interchange, tiling, unrolling, and fusion on CNN accelerators. We also explain memory optimizations that are effective in combination with these loop optimizations. In addition, we discuss streaming architectures and single computation engine architectures, both of which are commonly used in CNN accelerators. Optimizations for CNN models are briefly explained, followed by recent trends and future directions in CNN accelerator design.
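
As an illustration of the CONV-layer loop nest and the tiling and unrolling transformations mentioned in the abstract, the following is a minimal Python sketch. It is not taken from the paper: the function names, the tile sizes Tm and Tc, and the choice of tiling the output-channel and input-channel loops are all assumptions made for illustration, and stride 1 with no padding is assumed. In a hardware accelerator, the two innermost tile loops would typically be the ones fully unrolled into parallel multiply-accumulate units.

```python
def conv_naive(x, w, M, C, H, W, K):
    """Direct six-deep CONV loop nest (assumes stride 1, no padding).

    x: input  [C][H][W], w: weights [M][C][K][K], returns y: [M][R][S].
    """
    R, S = H - K + 1, W - K + 1
    y = [[[0] * S for _ in range(R)] for _ in range(M)]
    for m in range(M):                  # output channels
        for c in range(C):              # input channels
            for r in range(R):          # output rows
                for s in range(S):      # output columns
                    for i in range(K):  # kernel rows
                        for j in range(K):  # kernel columns
                            y[m][r][s] += w[m][c][i][j] * x[c][r + i][s + j]
    return y


def conv_tiled(x, w, M, C, H, W, K, Tm=2, Tc=2):
    """Same computation with the m and c loops tiled by Tm and Tc
    (hypothetical tile sizes) and the tile loops moved innermost."""
    R, S = H - K + 1, W - K + 1
    y = [[[0] * S for _ in range(R)] for _ in range(M)]
    for m0 in range(0, M, Tm):          # loop over output-channel tiles
        for c0 in range(0, C, Tc):      # loop over input-channel tiles
            for r in range(R):
                for s in range(S):
                    for i in range(K):
                        for j in range(K):
                            # Intra-tile loops: candidates for full unrolling
                            # into up to Tm * Tc parallel MAC units.
                            for m in range(m0, min(m0 + Tm, M)):
                                for c in range(c0, min(c0 + Tc, C)):
                                    y[m][r][s] += (
                                        w[m][c][i][j] * x[c][r + i][s + j]
                                    )
    return y
```

Both functions perform the same multiply-accumulate operations; only the iteration order differs, which is what lets a designer trade on-chip buffer size against parallelism without changing the layer's result (exactly so for integer data, up to rounding for floating point).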