Open Access
An Analysis on Very Deep Convolutional Neural Networks: Problems and Solutions
Author(s) - Tidor-Vlad Pricope
Publication year - 2021
Publication title - Studia Universitatis Babeş-Bolyai. Informatica
Language(s) - English
Resource type - Journals
eISSN - 2065-9601
pISSN - 1224-869X
DOI - 10.24193/subbi.2021.1.01
Subject(s) - computer science , artificial intelligence , deep learning , deep neural networks , convolutional neural network , artificial neural network , normalization , residual , computation , algorithm , pattern recognition
Neural networks have become a powerful tool in computer vision thanks to recent breakthroughs in computation time and model architecture. Very deep models allow for better deciphering of the hidden patterns in the data; however, training them successfully is not trivial because of the notorious vanishing/exploding gradient problem. We illustrate this problem on VGG models with 8 and 38 hidden layers on the CIFAR-100 image dataset, visualizing how the gradients evolve during training. We explore known solutions to this problem, such as Batch Normalization (BatchNorm) and Residual Networks (ResNets), explaining the theory behind them. Our experiments show that the deeper model suffers from the vanishing gradient problem, and that both BatchNorm and ResNets solve it. The employed solutions slightly improve the performance of shallower models as well; yet, the fixed deeper models outperform them.
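For a concrete picture of the two fixes discussed in the abstract: in a deep network, the gradient reaching the early layers is a product of many layer Jacobians, so when those factors are consistently below 1 in norm the signal shrinks exponentially with depth. The sketch below is a minimal, illustrative PyTorch example (not the paper's code; channel counts, depth, and input sizes are arbitrary assumptions) of a residual block with BatchNorm, plus a per-layer gradient-norm probe of the kind used to visualize gradient flow during training.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Conv -> BatchNorm -> ReLU -> Conv -> BatchNorm, plus an identity skip."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # The identity shortcut gives gradients a path that bypasses the
        # conv stack, which is what counters the vanishing-gradient problem.
        return self.relu(out + x)

# Stack blocks and probe gradient magnitudes after one backward pass.
model = nn.Sequential(*[ResidualBlock(16) for _ in range(10)])
x = torch.randn(4, 16, 32, 32)          # dummy batch of 32x32 feature maps
loss = model(x).mean()                  # stand-in for a real training loss
loss.backward()
for name, p in model.named_parameters():
    if p.grad is not None and "conv1.weight" in name:
        print(f"{name}: grad norm = {p.grad.norm().item():.2e}")
```

Dropping the `+ x` shortcut and the BatchNorm layers from this block and re-running the probe should reproduce the qualitative effect the paper reports: gradient norms in the earliest blocks shrink markedly as depth grows.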
