Open Access
Research on Dynamic Reconfiguration Technology of Neural Network Accelerator Based on Zynq
Author(s) - Hao Lv, Shengbing Zhang, Xiaojian Liu, Shuo Liu, Yongqiang Liu, Wei Han, Shiyi Xu
Publication year - 2020
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1650/3/032093
Subject(s) - reconfigurability, field programmable gate array, computer science, control reconfiguration, convolutional neural network, embedded system, hardware acceleration, computer architecture, multiplexing, reconfigurable computing, computer hardware, artificial intelligence, operating system, telecommunications
Target detection based on convolutional neural networks is a research hotspot in computer vision. Conventional convolutional neural network (CNN) accelerators use time-division multiplexing: every network layer runs on the same accelerator, so adaptability and resource utilization are low. An open question is how to exploit the dynamic reconfigurability of FPGAs so that each layer's computation is matched with a suitable accelerator architecture, at the cost of some configuration delay, and thereby improve the utilization of computing resources. This article takes the YOLOv2 target detection algorithm, which is widely used in industry, as an example and uses Xilinx’s Zynq as the platform to describe the process of mapping a CNN model onto the FPGA. Using dynamic reconfiguration, each layer's computation is matched with a reconfigured accelerator architecture, and the reconfiguration delay is amortized by reusing each accelerator architecture across a batch of data. To further improve accelerator performance, each convolutional layer is merged with the max-pooling layer cascaded after it, which reduces memory access delay. Experiments and evaluations of the accelerator architecture with dynamic reconfiguration achieved a performance of 30.35 GOP/s on the Zynq platform. The results provide a reference for the application and optimization of CNNs on embedded platforms.
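
The batch-sharing idea in the abstract can be illustrated with a simple cost model. The sketch below is not the authors' implementation: the `Layer` type, the `per_image_latency_ms` helper, and all timing values are hypothetical and chosen only for illustration. It assumes each layer's accelerator is loaded once per batch, so the reconfiguration delay is divided across the images in that batch.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    compute_ms: float   # hypothetical per-image compute time on the matched accelerator
    reconfig_ms: float  # hypothetical delay to reconfigure the FPGA for this layer


def per_image_latency_ms(layers, batch_size):
    """Average time per image when each accelerator is loaded once per batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    compute = sum(layer.compute_ms for layer in layers)
    reconfig = sum(layer.reconfig_ms for layer in layers)
    # Compute time is paid for every image; reconfiguration is paid once per batch.
    return compute + reconfig / batch_size


# Illustrative values only, not measurements from the paper.
layers = [
    Layer("conv1+pool1", compute_ms=4.0, reconfig_ms=8.0),
    Layer("conv2+pool2", compute_ms=6.0, reconfig_ms=8.0),
    Layer("conv3", compute_ms=5.0, reconfig_ms=8.0),
]

for batch_size in (1, 4, 16):
    print(batch_size, per_image_latency_ms(layers, batch_size))
```

Under these assumed numbers, the per-image cost approaches the pure compute time as the batch grows, which is the sense in which the reconfiguration delay is shared.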
