Software programmers typically profile applications during development to determine how the various parts of an application are used during execution. Such information can be used, for example, to optimize execution of the application.
Profiling typically involves collecting statistical information that reveals how the application executes. This information can include identification of which functions or parts of the application were used for a given input (i.e., coverage), the time spent executing each particular function or part, the number of times each particular function was called, the number of times each function called another function, the amount of instruction-level parallelism that is found in the executed code, etc.
The most common method used to profile applications combines statistical sampling of the application with compiler-generated instrumentation. Instrumentations are inserted into the application code, typically during compilation of the source code. These instrumentations often comprise function calls that the compiler inserts at the beginning of each application function. Each instrumentation gathers information about the application function with which it is associated, such that various statistical information can be obtained about the execution of each function of the application. Additionally, the caller/callee relationships between functions (or parts) for a given run with a given input are reconstructed by means of the inserted instrumentation code. The resulting set of relationships is known as the “call graph”.
In addition to the information gathered by the instrumentations, the amount of time that is spent in any given application function (or part, e.g. a “basic block”) typically is determined by linking the application with run-time support code that instantiates a timer that is used to periodically (e.g., 100 times per second) record the values of the application program counter. From the periodically sampled values, an approximation of the amount of time spent executing each application function can be determined.
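As a rough illustration of how sampled program counter values can be turned into a time estimate, the sketch below maps each sample to the function whose address range contains it and multiplies the sample count by the sampling period. The symbol table, the addresses and the sample list are invented for the example; only the rate of 100 samples per second comes from the text above.

```python
from bisect import bisect_right
from collections import Counter

SAMPLE_RATE_HZ = 100  # 100 samples per second, as in the example above

# Hypothetical symbol table: (start address, function name), sorted by address.
SYMBOLS = [(0x1000, "main"), (0x1100, "process"), (0x1400, "parse")]
END_OF_TEXT = 0x1800  # assumed end of the last function
_STARTS = [start for start, _ in SYMBOLS]

def function_for_pc(pc):
    """Return the function containing pc, or None if pc is outside the code."""
    if not (_STARTS[0] <= pc < END_OF_TEXT):
        return None
    return SYMBOLS[bisect_right(_STARTS, pc) - 1][1]

def estimate_seconds(pc_samples):
    """Build the histogram, then approximate the time spent in each function."""
    histogram = Counter(function_for_pc(pc) for pc in pc_samples)
    histogram.pop(None, None)  # discard samples that fell outside known code
    return {name: count / SAMPLE_RATE_HZ for name, count in histogram.items()}

samples = [0x1004, 0x1110, 0x1120, 0x1404, 0x1410, 0x1500, 0x17FC, 0x1130]
estimates = estimate_seconds(samples)
print(estimates)
```

Each sample stands for one sampling period (here 10 ms), so the estimate is only as fine-grained as the sampling rate allows.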
The information collected by the various instrumentations and by program counter sampling (the latter known as a “program counter histogram”) normally is analyzed by a profiling program that generates a user-readable call graph, or a visual representation of it, which can be studied by the programmer to learn about the manner in which the application executes. The call graph normally comprises a series of nodes that represent the various application functions, and a series of arcs that connect the nodes and represent the various associations between the application functions. In addition, the call graph can include annotations that provide various information such as the amount of time spent within a particular application function, the number of times a function is invoked, the number of times a function invokes another function, etc.
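One possible in-memory form of such an annotated call graph is sketched below: nodes carry the time attributed to each function, and arcs carry the number of calls from one function to another. The data values and the helper `format_call_graph` are hypothetical, and the text layout is an illustration rather than the output format of any specific profiling program.

```python
# Hypothetical profile data; the names and numbers are illustrative only.
self_seconds = {"main": 0.01, "process": 0.03, "parse": 0.04}  # nodes
arcs = {("main", "process"): 1, ("process", "parse"): 3}       # (caller, callee) -> calls

def format_call_graph(self_seconds, arcs):
    """Render nodes, most expensive first, each followed by its outgoing arcs."""
    lines = []
    for name in sorted(self_seconds, key=self_seconds.get, reverse=True):
        # Number of times this function was invoked by any recorded caller.
        invoked = sum(n for (_, callee), n in arcs.items() if callee == name)
        lines.append(f"{name}: self={self_seconds[name]:.2f}s, invoked={invoked}")
        for (caller, callee), n in arcs.items():
            if caller == name:
                lines.append(f"  -> {callee} (calls={n})")
    return lines

report = format_call_graph(self_seconds, arcs)
print("\n".join(report))
```

In this sketch `main` shows `invoked=0` only because its own caller is not part of the recorded arcs.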
Although the above-described method of profiling is simple to implement, it has significant drawbacks. First, because periodic sampling is used to collect the statistics, the results obtained may not be very accurate. Accordingly, the generated program counter histogram may contain incorrect and/or imprecise, and therefore misleading, information regarding the way in which the application executes. Although the profiling accuracy can be increased by increasing the sampling rate, the overhead associated with this increased sampling can become substantial, for instance accounting for 50% or more of the execution time.
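The inaccuracy of periodic sampling can be seen in a deliberately contrived simulation. The workload below is hypothetical: in every 10 ms period it spends 9 ms in `long_fn` and 1 ms in `short_fn`, so `short_fn` truly accounts for 10% of the run. Because the period happens to line up with a 100 Hz sampling interval, this is a worst case, not a typical one.

```python
def running_function(t_ms):
    """Hypothetical workload: 9 ms in long_fn, then 1 ms in short_fn, repeated."""
    return "long_fn" if t_ms % 10 < 9 else "short_fn"

def sample(total_ms, interval_ms):
    """Sample the running function every interval_ms; return each one's share."""
    counts = {"long_fn": 0, "short_fn": 0}
    for t in range(0, total_ms, interval_ms):
        counts[running_function(t)] += 1
    n = sum(counts.values())
    return {name: count / n for name, count in counts.items()}

coarse = sample(1000, 10)  # 100 samples per second
fine = sample(1000, 1)     # 1000 samples per second

print("100 Hz :", coarse)
print("1000 Hz:", fine)
```

At 100 Hz every sample lands in `long_fn`, so `short_fn` is reported as taking no time at all. Raising the rate to 1000 Hz recovers the true 10% share, but requires ten times as many samples, which is the overhead trade-off noted above.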
The accuracy of the information obtained through conventional profiling methods can be further decreased by the very fact that profiling is conducted. For example, if a function is added by the compiler to an existing application function for the purpose of collecting information about its execution, but the added function requires as much or more time to execute than the application function (e.g., if the instrumented function is very short running), the collected information will indicate more time spent in executing the existing application function than would actually be required during normal operation. Additionally, inserting instrumentation probes in an application has a non-negligible impact on various aspects of compiling and executing the application. For example, the code generated by a compiler after the instrumentation probes have been inserted into a function may be substantially less efficient than the code generated without the instrumentations. Furthermore, at run-time the execution of the instrumentation code can alter the behavior of the application by altering the state of the hardware (e.g., caches, TLBs, etc.).
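The distortion caused by probes on short-running functions can be made concrete with a simple cost model. All figures below are assumptions chosen for illustration: a fixed probe cost of 40 ns per call, a very short function called one million times, and a long function called ten times. The model also assumes that probe time is attributed to the instrumented function.

```python
PROBE_COST_NS = 40  # assumed fixed cost of one instrumentation probe

# Hypothetical workload: name -> (number of calls, body time per call in ns)
WORKLOAD = {
    "tiny_getter": (1_000_000, 20),
    "heavy_compute": (10, 8_000_000),
}

def time_shares(workload, probe_ns):
    """Return each function's share of total time with probe_ns added per call."""
    totals = {name: calls * (body_ns + probe_ns)
              for name, (calls, body_ns) in workload.items()}
    grand_total = sum(totals.values())
    return {name: total / grand_total for name, total in totals.items()}

true_shares = time_shares(WORKLOAD, 0)                  # uninstrumented run
measured_shares = time_shares(WORKLOAD, PROBE_COST_NS)  # instrumented run

for name in WORKLOAD:
    print(f"{name}: true {true_shares[name]:.1%}, measured {measured_shares[name]:.1%}")
```

Under these assumptions the short function's share of the run grows from 20% to roughly 43% once probes are added, while the long function is barely affected in absolute terms.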
Profiling accuracy is further reduced when shared libraries are used by the application. As is known in the art, code from shared libraries is only accessed at application run-time. Therefore, instrumentations added to the source code by the compiler will not collect any information about the libraries' code, because this code is separate from the application source code.
Furthermore, conventional profiling methods do not permit the programmer to limit the collection of information to only those code portions that are most frequently used, because such information is not known beforehand at compilation time. As is known in the art, programmers typically are not concerned about the execution of functions where the time spent executing them is so minimal as to be nearly irrelevant. However, with conventional profiling techniques, each function (or other application code portion) is individually instrumented. Another common drawback of traditional profiling schemes is that profiling is restricted to the predefined statistical measures provided by the compiler toolchain or profiling tool used. Therefore, the programmer cannot define other quantities or measures to be collected for the application that may be more meaningful to the programmer for a specific application.
From the foregoing, it can be appreciated that it would be desirable to have a system and method for application profiling that avoids one or more of the drawbacks identified above.