Space‐address decoupled scratchpad memory management for neural network accelerators
Author(s) -
Zhang Zhenxing,
Sun Shiyan,
Chen Xunyu,
Zhi Tian,
Guo Qi,
Chen Yunji
Publication year - 2020
Publication title -
Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.6046
Subject(s) - computer science, compile time, compiler, memory management, computation, artificial neural network, computer engineering, process (computing), variable (mathematics), high memory, convolutional neural network, auxiliary memory, software, parallel computing, energy consumption, memory model, distributed computing, artificial intelligence, computer hardware, shared memory, operating system, programming language, semiconductor memory, mathematical analysis, mathematics, ecology, biology
Summary Deep neural networks have been demonstrated to be useful in a variety of intelligent tasks, and various specialized NN accelerators have recently been proposed to improve hardware efficiency. These accelerators are typically equipped with software‐managed scratchpad memory (SPM) for high performance and energy efficiency. However, traditional SPM management techniques cause memory fragmentation on NN accelerators and thus lead to low utilization of precious SPM. The main reason is that traditional techniques were originally designed for managing fixed‐length registers rather than variable‐length memory blocks. In this article, we propose a novel SPM management approach for NN accelerators. The basic intuition is that NN computation/memory behaviors are predictable and relatively regular compared with traditional applications, so most information can be determined at compile time. In addition, by exploiting the variable‐length feature of SPM, we propose to divide the allocation process into two passes: a space assignment pass and an address assignment pass, which are performed simultaneously (and implicitly) in traditional one‐pass allocation techniques. Experimental results on the memory requests of a representative NN accelerator demonstrate that the proposed approach can reduce memory consumption by up to 30% compared with state‐of‐the‐art SPM management techniques, and the resulting memory usage is only 2% larger than that of the theoretical optimal allocation.
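To make the two-pass idea concrete, the sketch below illustrates one plausible way compile-time address assignment for variable-length SPM buffers could work, given that each buffer's size and lifetime are known ahead of time. This is a hypothetical illustration (the function and buffer names are invented), not the paper's actual algorithm: space is fixed first (sizes and lifetime intervals), then addresses are chosen greedily so that buffers with disjoint lifetimes can share the same SPM region.

```python
# Hypothetical sketch of decoupled SPM allocation: pass 1 fixes each
# buffer's "space" (size and lifetime interval, known at compile time for
# NN workloads); pass 2 assigns concrete base addresses with a greedy
# first-fit scan over the address ranges of temporally overlapping buffers.

def assign_addresses(buffers):
    """buffers: list of (name, size, start, end) with [start, end) lifetimes.
    Returns {name: base_address}."""
    placed = []  # (start, end, base, size) of already-placed buffers
    addresses = {}
    for name, size, start, end in sorted(buffers, key=lambda b: b[2]):
        # Address ranges occupied by buffers whose lifetimes overlap this one.
        busy = sorted((base, base + sz) for (s, e, base, sz) in placed
                      if s < end and start < e)
        base = 0
        for lo, hi in busy:  # slide past occupied ranges until a gap fits
            if base + size <= lo:
                break
            base = max(base, hi)
        addresses[name] = base
        placed.append((start, end, base, size))
    return addresses

# A weight buffer and two activation buffers; a1's lifetime starts after
# a0's ends, so a1 can reuse a0's address range.
bufs = [("w0", 64, 0, 3), ("a0", 32, 1, 2), ("a1", 32, 2, 4)]
print(assign_addresses(bufs))  # → {'w0': 0, 'a0': 64, 'a1': 64}
```

Because sizes and lifetimes are resolved before any address is chosen, the address pass is free to pack variable-length blocks tightly, which is the fragmentation-avoidance opportunity the abstract describes.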