Online Thread and Data Mapping Using a Sharing-Aware Memory Management Unit
Author(s) -
Eduardo H. M. Cruz,
Matthias Diener,
Laércio Lima Pilla,
Philippe O. A. Navaux
Publication year - 2020
Publication title -
ACM Transactions on Modeling and Performance Evaluation of Computing Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.291
H-Index - 11
eISSN - 2376-3647
pISSN - 2376-3639
DOI - 10.1145/3433687
Subject(s) - computer science , memory hierarchy , parallel computing , thread (computing) , uniform memory access , locality of reference , interleaved memory , parsec , memory map , shared memory , non uniform memory access , locality , suite , data access , cpu cache , distributed computing , computer architecture , memory management , cache , operating system , cache coloring , overlay , database , stars , linguistics , philosophy , archaeology , computer vision , history
Current and future architectures rely on thread-level parallelism to sustain performance growth. These architectures have introduced complex memory hierarchies, consisting of several cores organized hierarchically with multiple cache levels and NUMA nodes. Such hierarchies affect the performance and energy efficiency of parallel applications because they increase the importance of memory access locality. To improve locality, analyzing the memory access behavior of parallel applications is critical for mapping threads and data. Nevertheless, most previous work relies on indirect information about the memory accesses or does not combine thread and data mapping, resulting in less accurate mappings. In this paper, we propose the Sharing-Aware Memory Management Unit (SAMMU), an extension to the memory management unit that allows it to detect the memory access behavior in hardware. With this information, the operating system can perform online mapping without any previous knowledge about the behavior of the application. In an evaluation with a wide range of parallel applications (NAS Parallel Benchmarks and PARSEC Benchmark Suite), performance improved by up to 35.7% (10.0% on average) and energy efficiency improved by up to 11.9% (4.1% on average). These improvements were due to a substantial reduction in cache misses and interconnection traffic.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom