
Importance of Some Specifications of Heterogeneous Architectures (CPU+GPU) for 3D Cone-Beam-CT Image Reconstruction Using OpenCL
Author(s) - T. Nouioua, Ahmed Hafid Belbachir
Publication year - 2021
Publication title - International Journal of Biology and Biomedical Engineering
Language(s) - English
Resource type - Journals
ISSN - 1998-4510
DOI - 10.46300/91011.2021.15.33
Subject(s) - computer science, speedup, acceleration, computation, hardware acceleration, parallel computing, memory bandwidth, work (physics), computational science, computer hardware, field programmable gate array, algorithm, mechanical engineering, physics, classical mechanics, engineering
Cone-beam computed tomography has become part of routine daily practice in medical imaging, where a 3D volume image is reconstructed with the Feldkamp-Davis-Kress (FDK) algorithm. This approach can minimize the patient's exposure time to X-rays. However, its implementation is very costly in computation time, which is a serious obstacle in practice. For this reason, GPU acceleration becomes a practical solution. To accelerate the FDK algorithm, we used the GPU on heterogeneous (CPU+GPU) platforms. To take full advantage of the GPU, we selected its relevant features and launched the reconstruction according to specific technical criteria, namely the work-groups and the work-items. We found that the number of parallel cores and the memory bandwidth have no effect on the runtime speedup unless the number of work-items is chosen rigorously. Dividing the work-items efficiently into work-groups according to the device specifications is a real challenge, and it becomes a principal difficulty if the GPU is not studied technically as a hardware device. After an optimized implementation with kernels launched optimally on the GPU, we concluded that high-capacity devices must be combined with a rigorous optimization of the work-items, which are divided into several work-groups according to the hardware limitations.
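The division of work-items into work-groups described above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `round_up` and `plan_ndrange`, the volume size, the local size, and the device limit of 256 are all assumptions chosen for illustration. It assumes one work-item per voxel and the OpenCL 1.x rule that each global dimension must be a multiple of the corresponding work-group dimension, with the work-group's total size bounded by the device limit (reported by `CL_DEVICE_MAX_WORK_GROUP_SIZE`).

```python
from math import prod


def round_up(n, multiple):
    """Return the smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


def plan_ndrange(volume_shape, local_size, max_work_group_size):
    """Plan a kernel launch with one work-item per voxel (hypothetical helper).

    volume_shape        -- voxels per dimension of the reconstructed volume
    local_size          -- work-items per work-group in each dimension
    max_work_group_size -- device limit on work-items per work-group
    Returns (global_size, num_groups), both as tuples per dimension.
    """
    if len(volume_shape) != len(local_size):
        raise ValueError("volume_shape and local_size must have the same rank")
    if prod(local_size) > max_work_group_size:
        raise ValueError("work-group exceeds the device limit")
    # Pad each dimension so it divides evenly into work-groups.
    global_size = tuple(round_up(n, l) for n, l in zip(volume_shape, local_size))
    num_groups = tuple(g // l for g, l in zip(global_size, local_size))
    return global_size, num_groups


# Assumed example values: a 500 x 500 x 300 volume, 8 x 8 x 4 work-groups,
# and a device that allows at most 256 work-items per work-group.
global_size, num_groups = plan_ndrange((500, 500, 300), (8, 8, 4), 256)
print(global_size, num_groups)
```

Because the global size is padded, a kernel launched this way would need a bounds check so that the extra work-items skip voxels outside the volume.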