Premium
Hungarian algorithm for subcarrier assignment problem using GPU and CUDA
Author(s) -
Yadav Satyendra Singh,
Lopes Paulo Alexandre Crisóstomo,
Ilic Aleksandar,
Patra Sarat Kumar
Publication year - 2018
Publication title -
international journal of communication systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.344
H-Index - 49
eISSN - 1099-1131
pISSN - 1074-5351
DOI - 10.1002/dac.3884
Subject(s) - computer science , cuda , speedup , subcarrier , parallel computing , computational complexity theory , graphics processing unit , reduction (mathematics) , general purpose computing on graphics processing units , resource allocation , graphics , algorithm , orthogonal frequency division multiplexing , channel (broadcasting) , computer network , geometry , mathematics , computer graphics (images)
Summary General purpose graphics processing units (GPGPUs) have gained much popularity in scientific computing to speedup computational intensive workloads. Resource allocation in terms of power and subcarriers assignment, in current wireless standards, is one of the challenging problems due to its high computational complexity requirement. The Hungarian algorithm (HA), which has been extensively applied to linear assignment problems (LAPs), has been seen to provide encouraging result in resource allocation for wireless communication systems. This paper presents a compute unified device architecture (CUDA) implementation of the HA on graphics processing unit (GPU) for this problem. HA has been implemented on a parallel architecture to solve the subcarrier assignment problem and maximize spectral efficiency. The proposed implementation is achieved by using the “Kuhn‐Munkres” algorithm with effective modifications, in order to fully exploit the capabilities of modern GPU devices. A cost matrix for maximum assignment has been defined leading to a low complexity matrix compression along with highly optimized CUDA reduction and parallel alternating path search process. All these optimizations lead to an efficient implementation with superior performance when compared with existing parallel implementations.