Differential profiling
Author(s) - McKenney Paul E.
Publication year - 1999
Publication title - Software: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/(sici)1097-024x(199903)29:3<219::aid-spe230>3.0.co;2-0
Subject(s) - bottleneck, profiling (computer programming), computer science, software, macro, distributed computing, software deployment, computation, implementation, parallel computing, embedded system, algorithm, operating system, programming language
Performance can be a critical aspect of software quality; in some systems, poor performance can cause financial loss, physical damage, or even death. In such cases, it is imperative to identify system performance problems before deployment, preferably well before implementation. Unfortunately, the size of most software systems grossly exceeds the capacity of current performance‐modelling techniques. Hence, there is a need for techniques to quickly identify the portions of the system that are performance‐critical. These portions are often small enough to be modelled directly. This paper describes one such technique, differential profiling. Differential profiling combines two or more conventional profiles of a given program run in different situations or conditions. The technique mathematically combines corresponding buckets of the conventional profiles, then sorts the resulting list by these combined values. Different combining functions are suitable for different situations. This combining of conventional profiles frequently yields much greater insight than could be obtained from either of the conventional profiles. Hence, differential profiling helps to locate difficult‐to‐find performance bottlenecks, such as those that are distributed widely throughout a large program or system, perhaps by being concealed within macros or inlined functions. This paper also describes how this technique may be used to pinpoint certain types of performance bottlenecks in large programs running on large‐scale shared‐memory multiprocessors. In this environment, the critical bottleneck might consume only a small fraction of the total CPU time, since typical critical sections can consume at most one CPU's worth of computation. This sort of bottleneck, particularly when widely distributed throughout the program under consideration, is often invisible to traditional profiling techniques. Copyright © 1999 John Wiley & Sons, Ltd.
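The combining step described in the abstract (combine corresponding buckets of two profiles, then sort by the combined value) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration, not taken from the paper: the `Profile` type, the `ratio` and `difference` combining functions, the `differential_profile` helper, and the sample data are all hypothetical, and the paper's actual combining functions may differ.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical profile representation: bucket name (e.g. a function)
# mapped to the samples or seconds charged to it in one run.
Profile = Dict[str, float]


def ratio(a: float, b: float) -> float:
    """Assumed combining function: growth factor from run A to run B."""
    if a <= 0:
        # A bucket absent from run A but present in run B grew "infinitely".
        return float("inf") if b > 0 else 1.0
    return b / a


def difference(a: float, b: float) -> float:
    """Assumed combining function: absolute growth from run A to run B."""
    return b - a


def differential_profile(
    profile_a: Profile,
    profile_b: Profile,
    combine: Callable[[float, float], float] = ratio,
) -> List[Tuple[str, float]]:
    """Combine corresponding buckets, then sort by the combined value."""
    buckets = set(profile_a) | set(profile_b)
    combined = [
        (name, combine(profile_a.get(name, 0.0), profile_b.get(name, 0.0)))
        for name in buckets
    ]
    # Largest combined value first; ties are broken by bucket name.
    return sorted(combined, key=lambda item: (-item[1], item[0]))


# Made-up data: the same program profiled under light and heavy load.
light = {"parse": 40.0, "lookup": 50.0, "lock_acquire": 2.0}
heavy = {"parse": 44.0, "lookup": 55.0, "lock_acquire": 12.0}

by_ratio = differential_profile(light, heavy, ratio)
by_difference = differential_profile(light, heavy, difference)
```

In this made-up data, `lock_acquire` is the smallest bucket in both individual profiles, yet it sorts first under either combining function because it grows far more than the others between the two runs. This is the kind of bucket that a single conventional profile would tend to leave near the bottom of the list.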