
Improving SIMD Utilization with Thread‐Lane Shuffled Compaction in GPGPU
Author(s) - Li Bingchao, Wei Jizeng, Guo Wei, Sun Jizhou
Publication year - 2015
Publication title - Chinese Journal of Electronics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.267
H-Index - 25
eISSN - 2075-5597
pISSN - 1022-4653
DOI - 10.1049/cje.2015.10.004
Subject(s) - simd , parallel computing , thread (computing) , computer science , general purpose computing on graphics processing units , compaction , computer graphics (images) , geology , operating system , graphics , geotechnical engineering
GPGPUs adopt the SIMT execution model, in which each logical thread in a warp corresponds to a SIMD lane yet can still follow an independent control flow. When branch divergence occurs and threads within a warp take different execution paths, GPGPUs have to execute each path serially through SIMD lane masking, which potentially decreases SIMD utilization and performance. We propose an efficient thread compaction mechanism to handle branch divergence with a novel register file structure. We also develop a new thread scheduling policy that cooperates with our compaction mechanism. The simulation results show that our approach improves the SIMD utilization up to 74.4% and achieves a maximum 11.1% performance speedup with small hardware overhead.
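The abstract does not spell out the compaction mechanism, so the following Python sketch is only a toy model of the general idea, not the authors' design. It assumes a 4-lane warp for readability, represents each warp's branch outcome as a list of per-lane booleans, and assumes that without lane shuffling a thread must stay in its home lane when warps are merged. All function names here are hypothetical.

```python
from math import ceil
from typing import List

Mask = List[bool]  # one entry per SIMD lane; True = thread is active on this path


def warps_lane_preserving(masks: List[Mask]) -> int:
    """Warps needed if every thread must keep its home lane (toy assumption).

    Two active threads in the same lane cannot share a compacted warp,
    so the busiest lane determines how many warps must be issued.
    """
    return max(sum(lane) for lane in zip(*masks))


def warps_lane_shuffled(masks: List[Mask]) -> int:
    """Warps needed if active threads may be moved to any free lane."""
    width = len(masks[0])
    active = sum(sum(m) for m in masks)
    return ceil(active / width)


def utilization(masks: List[Mask], issued_warps: int) -> float:
    """Active threads divided by the lane slots of the issued warps."""
    width = len(masks[0])
    active = sum(sum(m) for m in masks)
    return active / (issued_warps * width)


if __name__ == "__main__":
    # Two diverged warps whose active threads sit in the same two lanes.
    masks = [[True, True, False, False],
             [True, True, False, False]]
    for name, count in [("lane-preserving", warps_lane_preserving(masks)),
                        ("lane-shuffled", warps_lane_shuffled(masks))]:
        print(name, count, utilization(masks, count))
```

In this example the lane-preserving model still issues two half-empty warps (utilization 0.5), while the shuffled model packs the four active threads into one full warp (utilization 1.0). Real hardware would also have to route register operands to the shuffled lanes, which this model ignores.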