Open Access
Quantization noise in low bit quantization and iterative adaptation to quantization noise in quantizable neural networks
Author(s) -
Dmitry Chudakov,
Alexander Goncharenko,
Sergey Alyamkin,
A. Densidov
Publication year - 2021
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/2134/1/012004
Subject(s) - quantization (signal processing) , trellis quantization , computer science , linde–buzo–gray algorithm , artificial neural network , algorithm , mathematics , speech recognition , artificial intelligence , image processing , image compression , image (mathematics)
Quantization is one of the most popular and widely used methods of speeding up neural network inference. The current standard is 8-bit uniform quantization. Nevertheless, uniform low-bit quantization (4- and 6-bit) offers significant advantages in inference speed and resource requirements. We present a quantization algorithm that is advantageous for uniform low-bit quantization: it is faster than quantization-aware training from scratch and more accurate than methods that only select thresholds and reduce quantization noise. We also investigated quantization noise in neural networks under low-bit quantization and concluded that quantization noise is not always a good metric of quantization quality.
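As a point of reference for the abstract's terminology, the sketch below shows symmetric uniform quantization of a weight tensor at a given bit width and the resulting quantization noise measured as mean squared error. This is a generic illustration, not the authors' algorithm; the function names and the max-based threshold choice are assumptions for the example.

```python
import numpy as np

def uniform_quantize(x, bits):
    """Symmetric uniform quantization of a tensor to the given bit width.

    The scale maps the largest magnitude in x onto the integer grid
    [-(2**(bits-1) - 1), 2**(bits-1) - 1]; values are rounded to the
    nearest grid point and de-quantized back to floats.
    """
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8-bit, 7 for 4-bit
    scale = np.abs(x).max() / qmax      # naive max-based threshold (assumption)
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

def quantization_noise(x, bits):
    """Mean squared error between a tensor and its quantized version."""
    return float(np.mean((x - uniform_quantize(x, bits)) ** 2))

rng = np.random.default_rng(0)
w = rng.standard_normal(10_000).astype(np.float32)

# Fewer bits leave fewer grid points, so the noise grows.
print(quantization_noise(w, 8), quantization_noise(w, 4))
```

The abstract's point is that this MSE-style noise, while easy to compute, does not always track the accuracy of the quantized network, which motivates the iterative adaptation approach in the paper.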
