Kernel-controlled DQN based CNN Pruning for Model Compression and Acceleration

Romancha Khatri; Kwanghee Won

Open Access

Author(s) - Romancha Khatri, Kwanghee Won

Publication year - 2020

Publication title - Open PRAIRIE (South Dakota State University)

Language(s) - English

Resource type - Conference proceedings

DOI - 10.1145/3400286.3418258

Subject(s) - computer science, compression (physics), pruning, kernel (algebra), convolutional neural network, compression ratio, mnist database, algorithm, convolution (computer science), artificial neural network, artificial intelligence, data compression ratio, image compression, mathematics, image (mathematics), image processing, materials science, engineering, combinatorics, automotive engineering, agronomy, composite material, biology, internal combustion engine

Apart from accuracy, the size of a Convolutional Neural Network (CNN) model is another principal factor in deploying models on memory-, power-, and budget-constrained devices. Conventional compression techniques require a human expert to set up parameters and explore the design space, and iterative pruning requires heavy retraining, which is sub-optimal and time-consuming. Given a CNN model, we propose a deep reinforcement learning [8] approach, DQN-based automated compression, which turns off kernels in each layer according to their observed significance. By observing accuracy, compression ratio, and convergence rate, the proposed DQN model can automatically reactivate the healthiest kernels and train them again to regain accuracy, which greatly improves the quality of the compressed model. In experiments on the MNIST [3] dataset, our method compresses the convolution layers of a VGG-like [10] model by up to 60% with a 0.5% increase in test accuracy, using less than half of the initial amount of training (a speed-up of up to 2.5×). It achieves state-of-the-art results by dropping 80% of kernels (86% of parameters compressed) with a 0.14% increase in accuracy, and it can further drop 84% of kernels (94% of parameters compressed) with a 0.4% loss in accuracy. The first proposed model, Auto-AEC (Accuracy-Ensured Compression), compresses the network while preserving or increasing the original accuracy, whereas the second proposed model, Auto-CECA (Compression-Ensured Considering the Accuracy), compresses the network as much as possible while preserving the original accuracy or incurring only a minimal drop. We further analyze the effectiveness of kernels in different layers based on how our model explores and exploits at various stages of training.
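The abstract reports both a fraction of kernels dropped and a larger fraction of parameters compressed (for example, 80% of kernels against 86% of parameters). The minimal sketch below illustrates one reason the two numbers can differ; it is not the authors' implementation. It assumes that a "kernel" is one output filter of a convolution layer, that layers are stacked sequentially, and that a binary mask per layer marks which kernels stay on. The names `conv_params` and `compression_stats` and the toy layer sizes are hypothetical.

```python
def conv_params(in_ch, out_ch, k=3):
    """Weights plus biases of a k x k convolution layer."""
    return out_ch * (in_ch * k * k + 1)


def compression_stats(layers, masks, in_ch=1, k=3):
    """Compare a full stack of conv layers with its masked version.

    layers: output-channel count of each layer (hypothetical sizes).
    masks:  one list of 0/1 flags per layer; 0 means the kernel is off.
    """
    total_kernels = kept_kernels = 0
    total_params = kept_params = 0
    full_in, kept_in = in_ch, in_ch
    for out_ch, mask in zip(layers, masks):
        if len(mask) != out_ch:
            raise ValueError("each mask needs one flag per kernel")
        kept_out = sum(mask)
        total_kernels += out_ch
        kept_kernels += kept_out
        total_params += conv_params(full_in, out_ch, k)
        # A dropped kernel also removes one input channel of the next layer.
        kept_params += conv_params(kept_in, kept_out, k)
        full_in, kept_in = out_ch, kept_out
    return {
        "kernels_dropped": 1 - kept_kernels / total_kernels,
        "params_compressed": 1 - kept_params / total_params,
    }


# Toy example: two layers, half of the kernels switched off in each.
stats = compression_stats(
    layers=[4, 8],
    masks=[[1, 1, 0, 0], [1, 1, 1, 1, 0, 0, 0, 0]],
)
print(stats)
```

In this toy case half of the kernels are dropped but more than half of the parameters disappear, because each kernel removed in one layer also shrinks every kernel of the following layer.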
