GPU‐accelerated backtracking using CUDA Dynamic Parallelism

Carneiro Pessoa Tiago; Gmys Jan; Carvalho Júnior Francisco Heron; Melab Nouredine; Tuyttens Daniel

Publication year - 2017

Publication title - Concurrency and Computation: Practice and Experience

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.309

H-Index - 67

eISSN - 1532-0634

pISSN - 1532-0626

DOI - 10.1002/cpe.4374

Subject(s) - backtracking, computer science, cuda, parallel computing, kernel (algebra), parallelism (grammar), tree (set theory), look ahead, travelling salesman problem, divide and conquer algorithms, general purpose computing on graphics processing units, computation, algorithm, mathematics, graphics, mathematical analysis, computer graphics (images), combinatorics

Summary - New GPGPU technologies, such as CUDA Dynamic Parallelism (CDP), can help deal with recursive patterns of computation, such as divide‐and‐conquer, used by backtracking algorithms. In this paper, we propose a GPU‐accelerated backtracking algorithm using CDP that extends a well‐known parallel backtracking model. The search starts on the CPU, which processes the search tree down to a first cutoff depth. Based on this partial backtracking tree, the algorithm analyzes the memory requirements of subsequent kernel generations. Unlike related works from the literature, the proposed algorithm performs no dynamic memory allocation on the GPU. It has been extensively tested using the N‐Queens puzzle and instances of the Asymmetric Traveling Salesman Problem (ATSP) as test cases. Under some conditions, the proposed CDP algorithm may outperform its non‐CDP counterpart by a factor of up to 25; however, it may also be up to twice as slow. The CDP‐based implementation has much better worst‐case execution times and makes the algorithm's performance less dependent on parameter tuning. Compared to other CDP‐based strategies from the literature, the proposed algorithm is on average 8× faster. The proposed algorithm is also hybridized with another CDP‐based strategy from the literature; the combination is on average 4.5× faster than that related strategy. We also identify some difficulties, limitations, and bottlenecks of the CDP programming model, which may be useful to potential users.
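
The two-phase idea in the summary (an initial search down to a cutoff depth, followed by independent searches below each surviving node) can be illustrated with a small CPU-only sketch for N‐Queens. This is not the authors' CUDA implementation: the function names (`frontier_at_depth`, `count_from`, `count_solutions`) are hypothetical, the per-subproblem loop stands in for what would be GPU threads, and the preallocated `results` list is only an analogy for sizing buffers from the partial tree instead of allocating during the search.

```python
from typing import List, Tuple

# A node is a partial placement: the column of the queen in each row so far.
Node = Tuple[int, ...]


def is_safe(partial: Node, col: int) -> bool:
    """Return True if a queen at (len(partial), col) attacks no earlier queen."""
    row = len(partial)
    for r, c in enumerate(partial):
        if c == col or abs(c - col) == row - r:
            return False
    return True


def frontier_at_depth(n: int, cutoff: int) -> List[Node]:
    """Initial phase: enumerate all feasible placements of `cutoff` queens."""
    frontier: List[Node] = [()]
    for _ in range(cutoff):
        frontier = [p + (c,) for p in frontier for c in range(n) if is_safe(p, c)]
    return frontier


def count_from(n: int, root: Node) -> int:
    """Second phase: iterative backtracking below a single frontier node."""
    solutions = 0
    stack = [root]
    while stack:
        p = stack.pop()
        if len(p) == n:
            solutions += 1
            continue
        for c in range(n):
            if is_safe(p, c):
                stack.append(p + (c,))
    return solutions


def count_solutions(n: int, cutoff: int) -> int:
    """Count N-Queens solutions by splitting the search at depth `cutoff`."""
    frontier = frontier_at_depth(n, cutoff)
    # The frontier size is known before the second phase starts, so the
    # result buffer can be sized once up front (assumption: one slot per node).
    results = [0] * len(frontier)
    for i, root in enumerate(frontier):  # each iteration is independent
        results[i] = count_from(n, root)
    return sum(results)


if __name__ == "__main__":
    print(count_solutions(8, 3))
```

The cutoff depth controls how many independent subproblems are produced: a deeper cutoff yields more, smaller subproblems. The total count does not depend on the cutoff, which makes it a convenient correctness check when experimenting with this parameter.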
