Profiling divergences in GPU applications
Author(s) - Coutinho Bruno, Sampaio Diogo, Pereira Fernando M. Q., Meira Wagner
Publication year - 2012
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.2853
Subject(s) - profiling (computer programming) , computer science , parallel computing , computational science , programming language
SUMMARY The increasing programmability and high computational power of graphics processing units make them attractive for general-purpose programming. However, taking full advantage of this execution environment is challenging. One of the challenges stems from divergences, a phenomenon that occurs when threads that execute in lock-step are forced to take different program paths because of branches in the code. In the face of divergences, some threads have to wait, idle, while their diverging siblings execute. Optimizing code to avoid divergences is difficult because it demands a deep understanding of programs that might be large and convoluted. To facilitate the detection of divergences, this paper introduces the divergence map, a data structure that indicates the location and the volume of divergences in a program. We build this map via dynamic profiling techniques, which we have implemented on top of an open source Parallel Thread Execution compiler. To illustrate the importance of the divergence map, we have used it to pinpoint the core regions that must be optimized in well-known public applications. By hand-optimizing some applications, we have obtained speedups of 9–11% on kernels that have already gone through the sieve of many programmers. Copyright © 2012 John Wiley & Sons, Ltd.
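
The idea of a divergence map can be illustrated with a small sketch. This is not the authors' implementation, which instruments Parallel Thread Execution code; it is a toy Python model that assumes a recorded trace of branch outcomes per warp. The function name `divergence_map`, the trace format, the warp size of 32, and the location labels such as "kernel.cu:12" are all hypothetical and chosen only for illustration.

```python
from collections import Counter

WARP_SIZE = 32  # assumption: number of threads that execute in lock-step


def divergence_map(branch_trace):
    """Map each branch location to (divergent executions, total executions).

    branch_trace is an iterable of (location, outcomes) pairs, where outcomes
    holds one boolean per thread of a warp: the direction that thread took at
    the branch found at that location. This trace format is hypothetical.
    """
    visits = Counter()
    divergent = Counter()
    for location, outcomes in branch_trace:
        visits[location] += 1
        # A warp diverges when its threads disagree on the branch direction.
        if len(set(outcomes)) > 1:
            divergent[location] += 1
    return {loc: (divergent[loc], visits[loc]) for loc in visits}


# Hypothetical trace: three warp-level branch executions at two locations.
trace = [
    ("kernel.cu:12", [tid % 2 == 0 for tid in range(WARP_SIZE)]),  # threads disagree
    ("kernel.cu:12", [True] * WARP_SIZE),                          # all threads agree
    ("kernel.cu:20", [tid < 64 for tid in range(WARP_SIZE)]),      # all threads agree
]
dmap = divergence_map(trace)
```

In this toy model, locations with a high ratio of divergent to total executions would be the first candidates for hand optimization, which mirrors how the summary describes using the map to pinpoint core regions.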
