In general, software profiling is a technique for measuring or estimating what parts of a complex hardware and software system are consuming the most computing resources. The most common profiling tools aim to determine which segments of code within an application or service are consuming the most processor time and to find performance “bottlenecks” where optimization can be most beneficial to the running time. Profiling can also be applied to the consumption of other resources, such as processor caches, operating system APIs, memory, and I/O devices.
The two most common approaches used in processor-time profiling are sampling and “per-occurrence” measurement. Sampling involves choosing a subset of interesting events, determining the cause of those events, and reporting the frequency of those causes. For example, processor-time sampling involves determining, at regular time intervals, which code was running, such as by noting, at regularly spaced times, the value in the processor's instruction pointer register.
FIG. 1(a) depicts an example of sampling. FIG. 1(a) is a time profile of a notional computer program in which functions FA, FB, and FC within the computer program are executed at various times according to the needs of the program. In FIG. 1(a), regularly time-spaced sampling intervals (SP) are indicated where a profiler function would sample the system operation to determine its state at the time of sampling. Typically, such sampling would involve reading a program counter or instruction pointer to determine which subroutine, representing a given function, is being run. The frequency with which the program counter or instruction pointer held a particular value provides an estimate of how often the processor was executing that instruction or subroutine. Alternatively, the samples for all instructions within a function can be grouped together to produce a report of the top time-consuming functions. Sampling at intervals other than time can provide an indication of consumers of other resources. For example, if it is possible to sample every “Nth” cache miss, then the functions which produce the highest number of cache misses can be estimated.
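By way of illustration only, the grouping of samples by function described above can be sketched in Python as follows. The timeline, the 10 ms sampling interval, and the helper names `function_at` and `sample_profile` are hypothetical and are not taken from FIG. 1(a); an actual profiler would read the instruction pointer rather than consult a known timeline.

```python
from collections import Counter

# Hypothetical timeline of (start_ms, end_ms, function) segments.
# These values are assumptions for illustration, not data from FIG. 1(a).
TIMELINE = [
    (0, 40, "FA"),
    (40, 50, "FB"),
    (50, 90, "FC"),
    (90, 100, "FB"),
]


def function_at(t_ms, timeline):
    """Return the function running at time t_ms, or None if the program is idle."""
    for start, end, name in timeline:
        if start <= t_ms < end:
            return name
    return None


def sample_profile(timeline, interval_ms, total_ms):
    """Sample at regular intervals and return each function's share of the samples."""
    hits = Counter()
    for t in range(0, total_ms, interval_ms):
        name = function_at(t, timeline)
        if name is not None:
            hits[name] += 1
    total = sum(hits.values())
    return {name: count / total for name, count in hits.items()}


estimate = sample_profile(TIMELINE, interval_ms=10, total_ms=100)
print(estimate)
```

With this particular timeline and interval, the sampled shares happen to match the true time shares; a coarser or differently phased interval would generally not.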
“Per-occurrence” measurement is done every time a particular event occurs. The main forms of this measurement are counting the number of times an event occurs, or querying the time at the beginning and end of a work interval and subtracting to find the amount of time taken to perform that work. The “instrumentation” to count the event or to measure the interval may be added to the code manually, or may be built into the code by a compilation tool. FIG. 1(b) is an example of “per-occurrence” measurement in a software time profile where instrumentation is used to determine system events. Each test point (TP) in FIG. 1(b) represents the beginning or end of an event, such as the beginning or end of a function within the software run profile.
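By way of illustration only, manually added instrumentation of this kind can be sketched in Python as follows. The decorator `instrumented`, the tables `call_counts` and `total_seconds`, and the example function `fb` are hypothetical names chosen for this sketch; `time.perf_counter` is the standard-library clock used here to read the time at entry and exit.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical per-function tables filled in by the instrumentation.
call_counts = defaultdict(int)
total_seconds = defaultdict(float)


def instrumented(func):
    """Count every call to func and accumulate the time spent inside it."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # test point at function entry
        try:
            return func(*args, **kwargs)
        finally:
            # Test point at function exit: subtract to get the duration.
            total_seconds[func.__name__] += time.perf_counter() - start
            call_counts[func.__name__] += 1
    return wrapper


@instrumented
def fb(n):
    """Example workload: sum the integers below n."""
    return sum(range(n))


for _ in range(3):
    fb(1000)

print(call_counts["fb"], total_seconds["fb"])
```

Each call passes through two clock reads and two table updates, which is the measurement work that the instrumentation itself adds to the program.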
Sampling and per-occurrence measurement each have advantages and disadvantages. Per-occurrence measurement cannot be performed for code which does not contain any instrumentation. Also, duration timing measures nearly exact running time, but the measurement itself can skew results by affecting the duration. For example, the work required to read the time on entry to and exit from a function is much larger in proportion to the run-time of small functions than it is in proportion to the run-time of large functions. Per-occurrence measurement can also produce a very large amount of data if the occurrences happen very often. For example, logging the entry and exit of every function in an application, as shown in FIG. 1(b), quickly accumulates a large volume of data.
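By way of illustration only, the proportional effect of measurement work on small and large functions can be sketched in Python as follows. The helper `overhead_share` and all of the timing figures are assumptions chosen for the arithmetic; they are not measurements of any real system.

```python
def overhead_share(work_us, probe_us):
    """Fraction of a function's measured run-time spent on two clock reads.

    work_us:  time the function's own work takes, in microseconds (assumed)
    probe_us: cost of one clock read, in microseconds (assumed)
    """
    probes = 2 * probe_us  # one read on entry, one on exit
    return probes / (work_us + probes)


# Assumed figures: a 0.1 us clock read, a 1 us function, a 1000 us function.
small = overhead_share(work_us=1.0, probe_us=0.1)
large = overhead_share(work_us=1000.0, probe_us=0.1)
print(f"small function: {small:.1%}, large function: {large:.3%}")
```

Under these assumed figures, the same two clock reads account for roughly one sixth of the small function's measured time but only a few hundredths of a percent of the large function's.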
On the other hand, sampling, as in FIG. 1(a), typically produces much less data. Rather than capturing every moment in time, time-based sampling is scaled back to sample at a relatively low frequency. In fact, the sampling frequency can be adjusted to suit the situation; sampling too often produces too much data, while sampling too infrequently leads to inaccuracy of measurement. For example, an ill-spaced sampling interval as in FIG. 1(a) would miss two of the three occurrences of function FB. That inaccuracy is the disadvantage of sampling; since samples capture only a very small portion of the whole, the result is only an estimate rather than an exact measurement. Chance sampling within functions that occur very rarely can make those functions appear to take a higher proportion of time than they actually do. Sampling is also susceptible to errors related to events that occur at the sampling interval. For example, if a profiler is sampling the processor instruction pointer once every 100 milliseconds, and some other event is occurring once every 100 milliseconds, then the profiler could possibly miss every instance of that event, or it could possibly hit every instance of that event, making it appear as though the event-handling code was running 0% or 100% of the time, respectively, when in truth the event-handling code would be running at some intermediate level between the extremes.
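By way of illustration only, the 100 millisecond example above can be reproduced with a short Python simulation. The 10 ms handler duration, the phase offsets, the 7 ms comparison interval, and the helper names `event_running` and `sampled_fraction` are assumptions introduced for this sketch.

```python
def event_running(t_ms, period_ms=100, duration_ms=10):
    """True if a periodic event handler is running at time t_ms.

    The handler is assumed to start every period_ms and run for duration_ms,
    so its true share of processor time is duration_ms / period_ms (10%).
    """
    return t_ms % period_ms < duration_ms


def sampled_fraction(interval_ms, offset_ms, total_ms=10_000):
    """Fraction of samples that land inside the event handler."""
    times = range(offset_ms, total_ms, interval_ms)
    hits = sum(event_running(t) for t in times)
    return hits / len(times)


always_hit = sampled_fraction(interval_ms=100, offset_ms=5)    # in phase
always_miss = sampled_fraction(interval_ms=100, offset_ms=50)  # out of phase
unrelated = sampled_fraction(interval_ms=7, offset_ms=0)       # unrelated interval
print(always_hit, always_miss, round(unrelated, 3))
```

In this sketch, a sampling interval equal to the event period reports either every sample or no sample inside the handler, depending only on the phase offset, whereas the interval that shares no common factor with the period yields a value close to the assumed true share of 10%.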
An ideal system would gather all of the data available and then process the data without affecting the run of the application. However, data memory and processing time are normally limited, so a better approach would be to take a minimum amount of data to gain a maximum amount of insight as to how a system was behaving during run time. That minimum amount of data, though, is difficult to predict and instrument. Thus, there is a need for a technique which can perform a variety of profiling functions in a time-efficient manner, gather a reasonable amount of data, and produce results without greatly affecting the run-time performance of the system under test. The present invention addresses the aforementioned needs and solves them with additional advantages as expressed herein.