Creating optimal code for GPU‐accelerated CT reconstruction using ant colony optimization
Author(s) - Eric Papenhausen, Ziyi Zheng, Klaus Mueller
Publication year - 2013
Publication title - Medical Physics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.473
H-Index - 180
eISSN - 2473-4209
pISSN - 0094-2405
DOI - 10.1118/1.4773045
Subject(s) - computer science, ant colony optimization algorithms, graph, heuristics, memory footprint, code (set theory), process (computing), cuda, computer engineering, parallel computing, algorithm, theoretical computer science, set (abstract data type), programming language, operating system
Purpose: CT reconstruction algorithms implemented on the GPU are highly sensitive to their implementation details and to the hardware they run on. Fine‐tuning an implementation for optimal performance can be a time‐consuming task and can require many updates when the hardware changes. Some techniques exist that fine‐tune GPU code automatically. These techniques, however, are relatively narrow in what they tune and are often based on heuristics, which can be inaccurate. The goal of this paper is to present a framework that automates the process of code optimization with maximum flexibility and produces a final result that is efficient and readable to the user.

Methods: The authors propose a method that tunes high‐level implementation details by using the ant colony optimization algorithm to find the optimal implementation in a relatively short amount of time. The framework takes as input a file that describes a graph, such that a path through this graph represents a potential implementation. The authors then use the ant colony optimization algorithm to find the optimal path through this graph based on the execution time and the quality of the image.

Results: Two experimental studies are carried out. Using the presented framework, the authors optimize the performance of a GPU‐accelerated FDK backprojection implementation and a GPU‐accelerated separable footprint backprojection implementation. They demonstrate that the resulting optimal implementation can differ depending on the hardware specifications. They then compare the results produced by the framework with those produced by manual optimization.

Conclusions: The framework the authors present is a useful tool for increasing programmer productivity and reducing the overhead of leveraging hardware‐specific resources. By performing an intelligent search, the framework produces a more efficient image reconstruction implementation in a shorter amount of time.
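The path search described under Methods can be illustrated with a minimal sketch. This is not the authors' framework. It assumes a simple layered graph in which each stage offers a few implementation options, and every name in it (`aco_search`, `STAGES`, `TOY_COST_MS`, `toy_cost`) is hypothetical. The cost table is made up for illustration. In a real setting the cost function would presumably build and time the candidate GPU kernel and score the reconstructed image.

```python
import random

# Hypothetical stages: each inner list holds the options for one decision.
# A path picks exactly one option per stage.
STAGES = [
    ["global", "texture"],
    ["block_64", "block_128", "block_256"],
    ["unroll_off", "unroll_on"],
]

# Made-up per-option costs in milliseconds, standing in for measured
# execution time (and, in the paper's setting, image quality).
TOY_COST_MS = {
    "global": 4.0, "texture": 2.5,
    "block_64": 3.0, "block_128": 2.0, "block_256": 2.5,
    "unroll_off": 1.5, "unroll_on": 1.0,
}


def toy_cost(path):
    """Toy cost: sum of the per-option costs along the path."""
    return sum(TOY_COST_MS[option] for option in path)


def aco_search(stages, cost_fn, n_ants=20, n_iters=30, evaporation=0.1, seed=0):
    """Return (best_path, best_cost) found by a basic ant colony search."""
    rng = random.Random(seed)
    # One pheromone value per (stage, option), uniform at the start.
    pheromone = [[1.0] * len(options) for options in stages]
    best_path, best_cost = None, float("inf")

    for _ in range(n_iters):
        trails = []
        for _ in range(n_ants):
            # Each ant picks one option per stage, with probability
            # proportional to the pheromone on that option.
            picks = [
                rng.choices(range(len(options)), weights=pheromone[s])[0]
                for s, options in enumerate(stages)
            ]
            path = [stages[s][i] for s, i in enumerate(picks)]
            cost = cost_fn(path)
            trails.append((picks, cost))
            if cost < best_cost:
                best_path, best_cost = path, cost

        # Evaporate old pheromone so early choices do not dominate forever.
        for row in pheromone:
            for i in range(len(row)):
                row[i] *= 1.0 - evaporation

        # Deposit pheromone inversely proportional to cost (cost must be > 0),
        # so cheaper paths become more attractive in later iterations.
        for picks, cost in trails:
            for s, i in enumerate(picks):
                pheromone[s][i] += 1.0 / cost

    return best_path, best_cost


if __name__ == "__main__":
    path, cost = aco_search(STAGES, toy_cost)
    print(path, cost)
```

With only 12 possible paths, exhaustive search would do just as well here. The pheromone-guided sampling is meant to pay off when the graph is large and each cost evaluation is expensive, as it is when every candidate must be compiled and run on the GPU.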