
CNQ: Compressor‐Based Non‐uniform Quantization of Deep Neural Networks
Author(s) -
Yuan Yong,
Chen Chen,
Hu Xiyuan,
Peng Silong
Publication year - 2020
Publication title -
Chinese Journal of Electronics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.267
H-Index - 25
eISSN - 2075-5597
pISSN - 1022-4653
DOI - 10.1049/cje.2020.09.014
Subject(s) - quantization (signal processing), artificial neural network, computer science, deep neural networks, gas compressor, artificial intelligence, algorithm, engineering, aerospace engineering
Deep neural networks (DNNs) have achieved state-of-the-art performance in a number of domains but suffer from intensive computational and memory complexity. Network quantization can effectively reduce computation and memory costs without changing the network structure, facilitating the deployment of DNNs on mobile devices. While existing methods can obtain good performance, low-bit quantization without time-consuming training or access to the full dataset remains a challenging problem. In this paper, we develop a novel method, Compressor-based non-uniform quantization (CNQ), to achieve non-uniform quantization of DNNs with only a few unlabeled samples. Firstly, we present a compressor-based fast non-uniform quantization method, which accomplishes non-uniform quantization without iterations. Secondly, we propose to align the feature maps of the quantized model with those of the pre-trained model for accuracy recovery. Considering the property differences between activation channels, we utilize per-channel weighted entropy to optimize the alignment loss. In the experiments, we evaluate the proposed method on image classification and object detection. Our results outperform existing post-training quantization methods, demonstrating the effectiveness of the proposed method.
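The abstract does not spell out the compressor function, but in signal processing the standard compressor-based route to non-uniform quantization is companding: pass values through a concave compressor curve, quantize uniformly in the compressed domain, then expand back. The sketch below illustrates that idea with a mu-law compressor in NumPy; the mu-law choice, the `mu` parameter, and all function names are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def mu_law_compress(w, mu=255.0):
    """Compressor: a concave mu-law curve that stretches small-magnitude
    values, so uniform quantization afterwards yields non-uniform levels."""
    return np.sign(w) * np.log1p(mu * np.abs(w)) / np.log1p(mu)

def mu_law_expand(y, mu=255.0):
    """Expander: exact inverse of the mu-law compressor."""
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

def cnq_quantize(weights, n_bits=4, mu=255.0):
    """One-pass non-uniform quantization: compress -> uniform quantize -> expand.
    No iterative clustering (e.g., k-means codebook fitting) is required."""
    scale = np.max(np.abs(weights)) + 1e-12
    w = weights / scale                      # normalize to [-1, 1]
    y = mu_law_compress(w, mu)               # move to compressed domain
    levels = 2 ** (n_bits - 1) - 1
    y_q = np.round(y * levels) / levels      # uniform quantization here
    return mu_law_expand(y_q, mu) * scale    # back to the original domain
```

Because compress, round, and expand are each closed-form, the whole procedure is a single pass over the weights, which matches the abstract's claim of non-uniform quantization "without iterations."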
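The per-channel weighted-entropy alignment can be sketched similarly. One plausible reading of the abstract is that each channel's contribution to the feature-map alignment loss is weighted by the (histogram-estimated) entropy of that channel in the pre-trained model's activations, so information-rich channels dominate the recovery objective. The PyTorch sketch below follows that reading; `channel_entropy`, `alignment_loss`, and `n_bins` are hypothetical names, not the paper's API.

```python
import torch

def channel_entropy(feat, n_bins=64, eps=1e-8):
    """Approximate each activation channel's entropy from a histogram.
    feat: (N, C, H, W) tensor. Returns a (C,) tensor of entropies."""
    C = feat.shape[1]
    x = feat.permute(1, 0, 2, 3).reshape(C, -1)
    ent = torch.empty(C)
    for c in range(C):
        hist = torch.histc(x[c], bins=n_bins)
        p = hist / (hist.sum() + eps)
        ent[c] = -(p * torch.log(p + eps)).sum()
    return ent

def alignment_loss(feat_q, feat_fp):
    """Entropy-weighted per-channel MSE between the quantized model's
    feature maps (feat_q) and the pre-trained model's (feat_fp)."""
    w = channel_entropy(feat_fp.detach())
    w = w / (w.sum() + 1e-8)                 # normalize channel weights
    per_ch = ((feat_q - feat_fp) ** 2).mean(dim=(0, 2, 3))  # (C,) MSE
    return (w * per_ch).sum()
```

Since the weights come from the fixed pre-trained activations, this loss can be minimized on a handful of unlabeled samples, consistent with the abstract's few-sample, no-retraining setting.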