Open Access
Hardware Pipelining of Repetitive Patterns in Processor Instruction Traces
Author(s) - João Bispo, Jaime S. Cardoso, João L. Monteiro
Publication year - 2020
Publication title - JICS. Journal of Integrated Circuits and Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.125
H-Index - 11
eISSN - 1872-0234
pISSN - 1807-1953
DOI - 10.29292/jics.v8i1.373
Subject(s) - computer science, coprocessor, parallel computing, speedup, software pipelining, dataflow, computation, set (abstract data type), instruction set, algorithm, software, operating system, programming language
Dynamic partitioning is a promising technique where computations are transparently moved from a General
Purpose Processor (GPP) to a coprocessor during application execution. To be effective, the mapping
of computations to the coprocessor needs to consider aggressive optimizations. One of the mapping
optimizations is loop pipelining, a technique extensively studied and known to allow substantial performance
improvements. This paper describes a technique for pipelining Megablocks, a type of runtime loop
developed for dynamic partitioning. The technique transforms the body of Megablocks into an acyclic dataflow
graph which can be fully pipelined and is based on the atomic execution of loop iterations. For a set of 9
benchmarks without memory operations, we generated pipelined hardware versions of the loops and estimate
that the presented loop pipelining technique increases the average speedup of non-pipelined coprocessor
accelerated designs from 1.6× to 2.2×. For a larger set of 61 benchmarks which include memory operations,
we estimate through simulation a speedup increase from 2.5× to 5.6× with this technique.
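
To illustrate why pipelining a loop body can raise coprocessor speedup, the sketch below uses the textbook cycle-count model for a pipelined loop. It is not the estimation method used in the paper. The function names, the datapath depth, and the initiation interval `ii` are all assumptions introduced here for illustration.

```python
def cycles_non_pipelined(iterations: int, depth: int) -> int:
    """Cycles when each iteration must finish before the next one starts.

    `depth` is the assumed latency, in cycles, of one pass through the
    loop-body dataflow graph (hypothetical parameter).
    """
    return iterations * depth


def cycles_pipelined(iterations: int, depth: int, ii: int = 1) -> int:
    """Cycles when a new iteration enters the datapath every `ii` cycles.

    The first iteration takes `depth` cycles to drain; each of the
    remaining iterations completes `ii` cycles after the previous one.
    """
    if iterations == 0:
        return 0
    return depth + (iterations - 1) * ii


def pipelining_gain(iterations: int, depth: int, ii: int = 1) -> float:
    """Ratio of non-pipelined to pipelined cycle counts (textbook model)."""
    return cycles_non_pipelined(iterations, depth) / cycles_pipelined(
        iterations, depth, ii
    )


if __name__ == "__main__":
    # Example values are illustrative only, not taken from the paper.
    print(cycles_non_pipelined(100, 4))
    print(cycles_pipelined(100, 4))
    print(round(pipelining_gain(100, 4), 2))
```

In this model the gain approaches `depth / ii` as the iteration count grows. It compares two hardware versions of the same loop, so it is not directly comparable to the speedups over the GPP reported in the abstract.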