
Performance Optimization on GPGPU & Multicore CPU Using Roofline Model
Author(s) -
Noor M. Allayla,
Shefa A. Dawwd
Publication year - 2021
Publication title -
IOP Conference Series: Materials Science and Engineering
Language(s) - English
Resource type - Journals
eISSN - 1757-899X
pISSN - 1757-8981
DOI - 10.1088/1757-899x/1152/1/012021
Subject(s) - mnist database , multi core processor , computer science , parallel computing , cuda , general purpose computing on graphics processing units , central processing unit , core (optical fiber) , artificial neural network , embedded system , computer hardware , artificial intelligence , operating system , graphics , telecommunications
This paper uses the roofline model to identify the best-optimized platform for training a neural network that recognizes handwritten digits, with a multicore CPU and a general-purpose GPU (GPGPU) as the hardware environments. The pattern-parallel training technique is applied to the MNIST dataset, and parallel network training is presented for different data layouts on the multicore CPU and the GPGPU. The roofline model is applied to explain the different bottlenecks, and the most suitable platform is selected according to the data layout and to whether the workload is memory-bound or compute-bound. Optimization moves the computational intensity in all rooflines toward the right, which increases performance. As a result, the most suitable hardware platform is selected for the available range of data sizes, core counts, and operational intensities.
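The roofline bound that the abstract relies on can be sketched in a few lines of Python. This is a minimal illustration of the standard roofline formula, not the authors' code: attainable performance is the smaller of the peak compute rate and the product of memory bandwidth and computational intensity. The function names and the hardware figures (1000 GFLOP/s peak, 100 GB/s bandwidth) are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements from the paper.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound: min(peak compute, bandwidth * intensity).

    intensity is the computational intensity in FLOP per byte moved.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Intensity (FLOP/byte) at which the memory and compute roofs meet."""
    return peak_gflops / bandwidth_gbs


def bound_type(peak_gflops, bandwidth_gbs, intensity):
    """Classify a kernel as memory-bound or compute-bound."""
    if intensity < ridge_point(peak_gflops, bandwidth_gbs):
        return "memory"
    return "compute"


if __name__ == "__main__":
    # Hypothetical platform figures, for illustration only.
    peak, bandwidth = 1000.0, 100.0
    for intensity in (0.5, 2.0, 10.0, 20.0):
        perf = attainable_gflops(peak, bandwidth, intensity)
        kind = bound_type(peak, bandwidth, intensity)
        print(f"intensity={intensity:5.1f} FLOP/B  "
              f"bound={perf:7.1f} GFLOP/s  ({kind}-bound)")
```

Under these assumed figures, a kernel to the left of the ridge point sits on the sloped, bandwidth-limited part of the roof, so raising its intensity (moving right) raises the bound until the flat compute roof is reached.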