Simulation study of memory performance of SMP multiprocessors running a TPC-W workload
Author(s) -
Pierfrancesco Foglia,
Roberto Giorgi,
Cosimo Antonio Prete
Publication year - 2004
Publication title -
IEE Proceedings - Computers and Digital Techniques
Language(s) - English
Resource type - Journals
eISSN - 1359-7027
pISSN - 1350-2387
DOI - 10.1049/ip-cdt:20040349
Subject(s) - computer science, scalability, cache coherence, multiprocessing, cache only memory architecture, parallel computing, shared memory, overhead (engineering), cache, mesi protocol, operating system, cpu cache, cache algorithms, cache coloring
The infrastructure to support Electronic Commerce is one of the areas where more processing power is currently needed. A multiprocessor system can offer advantages for running Electronic Commerce applications.
In this paper, the memory performance of an Electronic Commerce server, i.e. a system running Electronic Commerce applications, is evaluated in the case of a shared-bus multiprocessor architecture. The software architecture of this server is based on a three-tier model and the workloads have been set up as specified by the TPC-W benchmark. The hardware configurations are: i) a single SMP running tiers two and three and ii) two SMPs, each one running a single tier.
We analyze the influence of the memory subsystem on performance and scalability and we consider several solutions aimed at reducing the latency of memory. After initial experiments, which validate our methodology, we explored different choices of cache, scheduling algorithm, and coherence protocol to enhance performance and scalability.
As in previous studies on shared-bus multiprocessors, we found that the memory performance is highly influenced by cache parameters. While scaling the machine, the coherence overhead weighs more and more on the memory performance. False sharing in the kernel is among the main causes of this overhead.
Unlike previous studies, we were able to highlight that passive sharing - the useless sharing of the private data of the migrating processes - is another important factor that influences the performance. This is especially true when multiprocessors with a higher number of processors are considered: an increase in the number of processors produces real benefits only if advanced techniques for reducing the coherence overhead are properly adopted.
Scheduling techniques limiting the process migration may reduce passive sharing, while restructuring techniques of the kernel data may reduce false sharing misses. However, even when process migration is reduced through cache-affinity techniques, standard coherence protocols like the MESI protocol do not allow for the best performance. Coherence protocols like PSCR and AMSD produce performance benefits. PSCR, in particular, eliminates coherence overhead due to passive sharing and minimizes the number of coherence misses. The adoption of PSCR and cache-affinity scheduling allows us to extend the multiprocessor scalability up to 20 processors for a 128-bit shared bus and current values of the main memory to processor speed gap.
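The two sources of coherence overhead named in the abstract, false sharing and passive sharing, can be illustrated with a toy model. The sketch below is not the simulator used in the paper and does not implement MESI, PSCR, or AMSD. It assumes infinite caches, a 64-byte line, and a plain write-invalidate rule, and the function name `count_invalidations` and the traces are hypothetical.

```python
from collections import defaultdict

LINE_SIZE = 64  # bytes per cache line (assumed value, not from the paper)


def count_invalidations(trace, line_size=LINE_SIZE):
    """Count remote copies invalidated under a simple write-invalidate rule.

    trace: iterable of (cpu, address, is_write) tuples.
    Caches are assumed infinite, so a copy disappears only when invalidated.
    """
    holders = defaultdict(set)  # cache line number -> CPUs holding a copy
    invalidations = 0
    for cpu, address, is_write in trace:
        line = address // line_size
        if is_write:
            # A write removes every copy of the line held by other CPUs.
            invalidations += len(holders[line] - {cpu})
            holders[line] = {cpu}
        else:
            holders[line].add(cpu)
    return invalidations


# False sharing: two CPUs write different words of the same 64-byte line.
false_sharing = [(0, 0, True), (1, 8, True)] * 3
# Same accesses with the second word moved to its own line.
padded = [(0, 0, True), (1, 64, True)] * 3

# Passive sharing: a process writes four private lines on CPU 0,
# migrates to CPU 1, and writes the same private lines again.
private_lines = [0, 64, 128, 192]
on_cpu0 = [(0, a, True) for a in private_lines]
migrated = on_cpu0 + [(1, a, True) for a in private_lines]
pinned = on_cpu0 + [(0, a, True) for a in private_lines]

print(count_invalidations(false_sharing))  # 5
print(count_invalidations(padded))         # 0
print(count_invalidations(migrated))       # 4
print(count_invalidations(pinned))         # 0
```

In this model, separating the two words onto different lines removes the false-sharing invalidations, and keeping the process on one CPU removes the invalidations on its private data. This mirrors, in a very simplified way, the restructuring and cache-affinity scheduling remedies mentioned in the abstract.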