Exploring the benefits of heterogeneous computing to accelerate face detection deep learning inference
Author(s) - G C Gutiérrez
Publication year - 2017
Language(s) - English
Resource type - Dissertations/theses
DOI - 10.17760/d20260296
Subject(s) - deep learning, computer science, inference, artificial intelligence, machine learning, face (sociological concept), face detection, throughput, facial recognition system, class (philosophy), pattern recognition (psychology), telecommunications, social science, sociology, wireless
Abstract of the Thesis

Exploring the Benefits of Heterogeneous Computing to Accelerate Face Detection Deep Learning Inference
by Julian Gutierrez
Master of Science in Electrical and Computer Engineering
Northeastern University, September 2017
David Kaeli, Ph.D., Advisor

Significant improvements in face detection accuracy have been achieved by an emerging class of deep learning algorithms. Although these algorithms achieve high accuracy, they can be computationally prohibitive. As a result, accuracy must be traded off against processing throughput, which means that robust real-time face detection on full HD video streams is not possible today.

To overcome this challenge, we propose a parallel pipelined framework that makes efficient use of our heterogeneous platform. We implement this pipelined framework around a state-of-the-art algorithm, exploiting the available CPU and GPU resources through C++ libraries, including pthreads and OpenCV, and we use the Caffe and cuDNN libraries to implement our deep learning models.

Our framework can handle full HD video workloads in real time, assuming typical video scenarios. We achieve a 2.4x higher frame rate than a GPU-enabled sequential implementation. We also reach up to 110 FPS on standard definition video while retaining the high accuracy of the original algorithm. The resulting pipelined framework is highly flexible, which lets us consider a range of deep learning algorithms as we map deep neural networks onto a powerful CPU-GPU platform.
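The general idea of a parallel pipeline can be illustrated with a minimal C++ sketch. This is not the thesis implementation: the names `BoundedQueue`, `Frame`, and `run_pipeline` are hypothetical, `std::thread` stands in for direct pthreads calls, and simple arithmetic stands in for the OpenCV preprocessing and Caffe/cuDNN inference stages. It only shows how stages connected by bounded queues can work on different frames at the same time.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded, thread-safe FIFO that connects two pipeline stages.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the queue is full, so a fast stage cannot run far ahead.
    void push(T item) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(item));
        not_empty_.notify_one();
    }

    // Blocks until an item is available. Returns false once the queue
    // has been closed and fully drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        not_full_.notify_one();
        return true;
    }

    // Signals that no further items will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        not_empty_.notify_all();
    }

private:
    std::size_t capacity_;
    std::queue<T> items_;
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::condition_variable not_full_;
    bool closed_ = false;
};

// Stand-in for a video frame; a real pipeline would carry image data.
struct Frame {
    int id;
    int value;
};

// Runs a three-stage pipeline (read, preprocess, detect) with one thread
// per stage and returns the per-frame results in frame order.
std::vector<int> run_pipeline(int num_frames) {
    BoundedQueue<Frame> to_preprocess(4);
    BoundedQueue<Frame> to_detect(4);
    std::vector<int> results;

    std::thread reader([&] {
        for (int i = 0; i < num_frames; ++i) to_preprocess.push(Frame{i, i});
        to_preprocess.close();
    });

    std::thread preprocessor([&] {
        Frame f{0, 0};
        while (to_preprocess.pop(f)) {
            f.value *= 2;  // placeholder for CPU-side preprocessing
            to_detect.push(f);
        }
        to_detect.close();
    });

    std::thread detector([&] {
        Frame f{0, 0};
        while (to_detect.pop(f)) {
            results.push_back(f.value + 1);  // placeholder for inference
        }
    });

    reader.join();
    preprocessor.join();
    detector.join();
    return results;
}
```

Calling `run_pipeline(4)` in this sketch yields one result per frame, in frame order, because each stage runs on a single thread and the queues are first-in, first-out. The bounded capacity applies backpressure: if the detection stage is the bottleneck, the earlier stages block instead of buffering an unbounded number of frames.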