Accelerating Training of Deep Neural Networks on GPU using CUDA



Open Access


Author(s) - D.T.V. Dharmajee Rao, K.V. Ramana

Publication year - 2019

Publication title - International Journal of Intelligent Systems and Applications

Language(s) - English

Resource type - Journals

eISSN - 2074-9058

pISSN - 2074-904X

DOI - 10.5815/ijisa.2019.05.03

Subject(s) - CUDA, computer science, artificial neural network, deep learning, backpropagation, artificial intelligence, matrix multiplication, parallel computing, multi-core processor, general-purpose computing on graphics processing units, machine learning

Fast, efficient training algorithms for Deep Neural Networks have attracted sustained interest over the past few years, because the biggest drawback of these networks is the enormous computational cost and long time needed to train their parameters. This has motivated several researchers to apply recent advances in hardware architectures and parallel programming models to accelerate training. We revisited the concepts and mechanisms of typical Deep Neural Network training algorithms, such as the Backpropagation Algorithm and the Boltzmann Machine Algorithm, and observed that matrix multiplication makes up the major portion of the training workload because it is performed a very large number of times. With the advent of many-core GPU technologies, matrix multiplication can be carried out very efficiently in parallel, so training a Deep Neural Network takes far less time than it did a few years ago. CUDA is one of the high-performance parallel programming models for exploiting the capabilities of modern many-core GPU systems. In this paper, we propose modifying the Backpropagation Algorithm and the Boltzmann Machine Algorithm to use CUDA parallel matrix multiplication, and we test them on a many-core GPU system. We find that the proposed methods train Deep Neural Networks much faster than the classic methods.
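The observation that matrix multiplication dominates the training workload can be illustrated with a small sketch. This is not the authors' implementation, and the abstract gives no code. It assumes a single linear layer with no bias or activation and uses the standard gradient identities for such a layer. All function names and the toy data are hypothetical, and pure Python stands in for the GPU code only to show the structure of the computation.

```python
# Sketch: one dense layer's forward and backward pass expressed as three
# matrix multiplications. Pure Python, for illustration only; names and
# data are hypothetical and not taken from the paper.

def matmul_element(a, b, i, j):
    """Dot product of row i of a with column j of b.

    Each (i, j) output is independent of every other one, which is why a
    GPU can compute many output elements at the same time.
    """
    return sum(a[i][k] * b[k][j] for k in range(len(b)))


def matmul(a, b):
    """Product of an m x n and an n x p matrix, both given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    return [[matmul_element(a, b, i, j) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def dense_forward(x, w):
    """Linear layer without bias or activation: z = x @ w."""
    return matmul(x, w)


def dense_backward(x, w, grad_z):
    """Gradients of a scalar loss with respect to w and x, given dL/dz."""
    grad_w = matmul(transpose(x), grad_z)  # dL/dw = x^T @ dL/dz
    grad_x = matmul(grad_z, transpose(w))  # dL/dx = dL/dz @ w^T
    return grad_w, grad_x


# Toy data (assumed): a batch of 2 samples, 3 inputs, 2 outputs.
x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
w = [[0.1, 0.2],
     [0.3, 0.4],
     [0.5, 0.6]]
grad_z = [[1.0, 0.0],
          [0.0, 1.0]]

z = dense_forward(x, w)
grad_w, grad_x = dense_backward(x, w, grad_z)
```

Every training step of this layer performs three matrix products, and each output element inside `matmul` depends only on one row and one column of its inputs. In a CUDA version, the body of `matmul_element` would typically become the work of a single GPU thread, with the two loops in `matmul` replaced by the thread grid.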
