1. Field of the Invention
This invention generally relates to the field of system characterization, and more particularly to CPU (Central Processing Unit) profiling and function call tracing for a target application to enable the identification of program bottlenecks, which cause slow performance.
2. Description of the Related Art
In spite of very fast computer hardware, such as a PowerParallel™ enterprise, and mature operating systems, such as AIX, a given target application's execution performance can be less than optimal. Applying profiling software to the target application on a given enterprise provides clues to answer the question: How can the target application be made to execute faster?
Profiling software is used to identify which portions of the target application are executed most frequently or where most of the time is spent. Profilers are typically used after a basic tool, such as the vmstat or iostat command, points out a CPU bottleneck that is causing slow performance. Tools such as vmstat and iostat report CPU and I/O statistics for the entire system. Using a predetermined benchmark, the profiler analyzes the target application to determine the place or places of the bottlenecks that result in slow execution. Typically, once these bottlenecks of CPU usage or function calls are determined, programming or reprogramming can be employed to reduce the bottleneck or, in some cases, eliminate it from the target application. These profiling tools, although useful, have certain shortcomings. One shortcoming is that profiling tools require the source code of the target application. Often the source code is not available to the person running the profiling tests. It is not uncommon for source code to be treated as confidential, and the person or entity profiling the software may not be the same entity that wrote the software. Accordingly, a need exists to overcome this problem of requiring the target application source code for profiling.
FIG. 1 is a flow diagram 100, which illustrates a trace study flow of currently available prior art profiling and performance management tools. The flow is entered at step 102 when a need is identified for a study of a target application. This entails looking for any bottlenecks, such as waiting for an I/O resource, and/or identifying any hot spots, such as heavy use of a particular subroutine in the application. Step 104 identifies the intended focus of the trace that will be run, such as questioning why there is so much I/O activity. Whether the target application's source code is available is determined at step 106. If the target application's source code is not available, the flow is exited at step 116 and the trace study is abandoned. Given that the source code for the target application is available, one or more source files are recompiled with the “-pg” option. The intention here is to focus in on an area of the target application and determine if the activity makes sense. This is shown as step 108. The application is relinked with the -pg flag, as shown in step 110. The target application is now run at step 112, typically with a standard setup and benchmark so that the resultant trace data can be compared across several runs. As the target executes, the -pg flagged information is put into a gmon.out file at step 114. This output file is studied both directly and with certain standard profiling tools, such as gprof or IBM's Xprofiler. If the study is considered to be complete at step 116, the flow is exited at step 118. If the study is not complete at step 116, then the -pg flag is reassigned to different points in the target application's source code at step 108, and the loop of recompiling, relinking, running the trace 112 and analyzing the results 114 is repeated until the multiple trace runs provide sufficient information for the study to be considered completed.
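By way of illustration only, one pass through the recompile, relink, run and analyze loop described above can be sketched as a list of commands. This is a minimal sketch under stated assumptions: the helper name pg_build_plan, the compiler name cc, the binary name app, and the source file names are hypothetical and do not appear in FIG. 1. Only the -pg flag, the gmon.out file and the gprof tool come from the description. The sketch builds the command lines without executing them.

```python
def pg_build_plan(sources, profiled, binary="app", compiler="cc"):
    """Return the command lines for one pass of the -pg trace loop.

    sources  -- all source files of the target application (hypothetical names)
    profiled -- the subset of sources chosen to be recompiled with -pg
    """
    unknown = set(profiled) - set(sources)
    if unknown:
        raise ValueError("not part of the application: %s" % sorted(unknown))

    commands = []
    objects = []
    for src in sources:
        obj = src.rsplit(".", 1)[0] + ".o"
        cmd = [compiler]
        if src in profiled:
            cmd.append("-pg")  # recompile the selected files with profiling enabled
        cmd += ["-c", src, "-o", obj]
        commands.append(cmd)
        objects.append(obj)

    # Relink the whole application with -pg.
    commands.append([compiler, "-pg", "-o", binary] + objects)
    # Running the instrumented binary writes gmon.out on exit.
    commands.append(["./" + binary])
    # Analyze the resulting profile data with gprof.
    commands.append(["gprof", binary, "gmon.out"])
    return commands


if __name__ == "__main__":
    for command in pg_build_plan(["main.c", "io.c"], {"io.c"}):
        print(" ".join(command))
```

Changing the profiled set to a different file yields a different plan, which makes visible that every new focus for the trace requires another compile and link pass before the application can be run again.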
It is noted that without the source code the profiling study cannot be made. In addition, each time a new -pg flag assignment is made, the target application must be recompiled and relinked. This recompiling step is time consuming and inhibits a spontaneous “what-if” workflow. It is also difficult to trace just part of the target application, for example just 10% of the functions or 10% of the execution time. Accordingly, a need exists to overcome these shortcomings and to provide a set of improved profiling tools to run traces with certain diagnostic tools and software probes that allow for optimizing of target applications.
Another shortcoming with the prior art profiling tools is that changes to the profiling benchmarks cannot be made once the target application has started. Application developers often want to examine applications from several perspectives without being required to restart the program execution. Accordingly, a need exists to enable changes in the benchmarking tools after the target application has started execution.
Still another shortcoming of the performance and profiling tools available today is the requirement to recompile and/or relink the target application every time the performance and managing tool is used. Typically a -pg flag must be used in the Unix environment. The need to recompile and/or relink the source code with special debugging flags often restricts the user from making timely or spontaneous changes to the application. Each time the -pg flag is changed, the application must be recompiled and relinked. Accordingly, a need exists to provide a solution to overcome this shortcoming as well.
Yet another shortcoming with the prior art performance profiling tools is how the results of a function trace are reported. Today, each function in a file compiled with -pg will have a corresponding entry in the gmon.out file. Since the choice of what to profile can only be made at the file level, this could potentially lead to a lot of unwanted data.
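The effect of file-level selection can be illustrated with a short sketch. The file names, function names and the helper file_level_selection below are hypothetical and serve only as an example. The sketch assumes, as stated above, that every function in a file compiled with -pg is reported, and it computes which unwanted functions are reported alongside the ones of interest.

```python
def file_level_selection(file_functions, wanted):
    """Model file-level profiling granularity.

    file_functions -- mapping of source file name to the functions it defines
    wanted         -- set of function names the study is actually interested in

    Returns (files_to_recompile, functions_reported, unwanted_functions).
    """
    # Any file defining at least one wanted function must be compiled with -pg.
    files = {name for name, funcs in file_functions.items()
             if wanted & set(funcs)}

    # Every function in those files gets an entry in the profile output.
    reported = set()
    for name in files:
        reported.update(file_functions[name])

    return files, reported, reported - wanted


if __name__ == "__main__":
    # Hypothetical application layout.
    layout = {
        "io.c": ["read_block", "write_block", "flush_cache"],
        "math.c": ["dot", "norm"],
        "main.c": ["main"],
    }
    files, reported, unwanted = file_level_selection(layout, {"read_block"})
    print(sorted(files), sorted(reported), sorted(unwanted))
```

In this hypothetical layout, interest in a single function causes three functions to be reported, two of which were not requested.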
A trace output file in the gmon.out format is supported by a set of existing tools that are used to further identify and understand the location of the bottlenecks. It is desirable for any new and improved trace characterization technique to output its results in the gmon.out file format, which is familiar to the user and allows for continued usage of these characterization tools.
Accordingly, a need exists for a trace characterization technique that will not only eliminate all of the shortcomings listed above but also maintain compatibility with existing output and analysis tool formats.