1. Technical Field
The present invention relates to both a method and an apparatus for profiling the effectiveness of competitive benchmark tests. Extra delays are placed into the benchmark test to determine the impact on the benchmark's score for the code segment under test.
2. Description of Related Art
Application developers and operating system vendors use benchmark programs to measure the performance of their applications and systems. Benchmark programs are typically written to exercise a specific aspect of the application or system; for example, a file system benchmark might execute a large number of file opens, reads, writes, and closes. Each benchmark program has an associated metric of interest: it might be the time the benchmark takes to complete (e.g., the time to execute 1000 file opens), or the number of operations that can be completed in a given, fixed time (e.g., the number of file opens in 30 seconds). This metric is typically called the score. In the examples above, the first reports the benchmark score as a response time, while the second reports it as an indicator of throughput. Irrespective of how the score is defined for a particular benchmark, the score is the single indicator typically used to compare one execution of the benchmark with another. In this way, a ranking of a set of application programs (or operating systems) can be readily obtained with respect to a specific benchmark.
Most often, benchmark scores are some function of time. However, the mechanism that translates time spent in certain functions, or during certain phases of benchmark execution, into the score may not be immediately apparent. Indeed, it may be hidden. Instead of equating benchmark execution time with score, the benchmark program might employ some internal algorithm to determine the score. These algorithms are not generally made available, nor is the source code, so it becomes a challenge to determine which functions in the benchmark are relevant to the score.
It is important for application developers and operating system vendors to understand which parts of the benchmark are relevant, because they compete to obtain better scores than other developers and vendors. Only by tuning the application or system in areas that directly affect the resulting score can these developers and vendors most efficiently deliver competitive implementations.
In these cases, a technique is needed to determine the relevant components of the benchmark. This application describes such a technique: a specific delay is injected at various identifiable places within the benchmark program, and the effect on the resulting benchmark score is observed. Once the relevant areas of the benchmark are identified, traditional profiling techniques (e.g., program counter sampling) can be employed to identify optimization opportunities in those code segments.
In other cases, the nature of the computation used to arrive at the score itself may not be known. For example, it may not be clear whether the score measures throughput in a fixed amount of time (e.g., the number of operations completed in 60 seconds) or the time required to complete a fixed computation. The invention described herein may be used to help answer these kinds of questions.
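One way such a question might be approached is sketched below. The text does not specify a procedure, so the approach is an assumption: if a delay is injected into every operation, a fixed-work benchmark should take longer to run overall, while a fixed-time benchmark should run for the same wall-clock duration and simply complete fewer operations. Both toy benchmarks, their costs, and the `classify` helper are hypothetical and use simulated time rather than a real benchmark binary.

```python
def run_fixed_work(op_cost, delay, n_ops=1000):
    """Toy benchmark (hypothetical): performs n_ops operations, however long it takes."""
    wall = n_ops * (op_cost + delay)      # simulated wall-clock seconds
    score = n_ops / wall                  # reported as operations per second
    return score, wall


def run_fixed_time(op_cost, delay, window=30.0):
    """Toy benchmark (hypothetical): counts operations completed in a fixed window."""
    ops = int(window / (op_cost + delay))
    score = ops / window                  # also reported as operations per second
    return score, window


def classify(run, op_cost=0.01, delay=0.01, tolerance=0.05):
    """Guess the scoring style from how the total run time reacts to injected delay."""
    _, wall_base = run(op_cost, 0.0)
    _, wall_delayed = run(op_cost, delay)
    growth = (wall_delayed - wall_base) / wall_base
    return "fixed work" if growth > tolerance else "fixed time"


if __name__ == "__main__":
    print(classify(run_fixed_work))
    print(classify(run_fixed_time))
```

Note that in this sketch both benchmarks report a lower score when delay is injected, so the score alone does not separate the two cases; the total run duration is the distinguishing observation.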
The present invention relates to a method and apparatus for methodically altering a benchmark test. A copy of the benchmark is obtained and manipulated using binary editing capabilities, and a specific unit of additional work (e.g., a unit of delay) is injected into the benchmark. The benchmark proceeds otherwise unchanged and the score is obtained. As mentioned above, many benchmark vendors explicitly report this score, and software manufacturers tune their offerings to improve their score. The process is repeated with the unit of delay injected into a different location. Multiple delays can be injected at different locations, or even at the same location, to produce a more complete sensitivity map. Following some number of executions, a sensitivity map is developed indicating the impact on the score for a set of injection sites. The resulting map provides insight into the areas of best opportunity for deeper analysis and subsequent improvement.
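The repeat-and-compare loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the binary-editing implementation: `toy_benchmark` is a hypothetical stand-in for an opaque benchmark, its site names, call counts, and hidden scoring rule are invented, and time is simulated so the example is deterministic.

```python
def toy_benchmark(delays):
    """Hypothetical opaque benchmark: returns a score, higher is better.

    `delays` maps an injection site name to extra seconds added per call.
    """
    base_cost = {"open": 0.002, "read": 0.005, "write": 0.004, "log": 0.001}
    calls = {"open": 10, "read": 100, "write": 50, "log": 200}
    # Hidden scoring rule (an assumption): only open/read/write are timed.
    timed = sum(
        calls[site] * (base_cost[site] + delays.get(site, 0.0))
        for site in ("open", "read", "write")
    )
    return 1.0 / timed


def sensitivity_map(run, sites, unit_delay):
    """Inject one unit of delay at each site in turn and record the score impact."""
    baseline = run({})
    result = {}
    for site in sites:
        score = run({site: unit_delay})
        # Fractional drop in score relative to the unmodified benchmark.
        result[site] = (baseline - score) / baseline
    return result


if __name__ == "__main__":
    smap = sensitivity_map(toy_benchmark, ["open", "read", "write", "log"], 0.001)
    for site, drop in sorted(smap.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{site:6s} {drop:.4f}")
```

In this toy setup the site with no effect on the score shows a sensitivity of zero, which marks it as a poor target for tuning regardless of how often it is called.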
For example, by repeatedly injecting a unit of delay into each method in a Java application, an indication is obtained of which areas of the application the score is most sensitive to. More traditional techniques can then be applied to profile the underlying middleware/system behavior while the specific activity driven by that method is active.
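A rough analogue of per-method injection is sketched below in Python rather than Java, using run-time wrapping instead of binary editing; that substitution is an assumption made to keep the example short. The `Workload` class, `inject_delay`, and `public_methods` are hypothetical names introduced for illustration.

```python
import functools
import time


class Workload:
    """Hypothetical stand-in for the application under test."""

    def parse(self, text):
        return text.split()

    def count(self, words):
        return len(words)


def inject_delay(obj, method_name, delay_s, sleep=time.sleep):
    """Wrap obj.method_name so that each call first waits delay_s seconds.

    Returns the original bound method so the caller can restore it later.
    """
    original = getattr(obj, method_name)

    @functools.wraps(original)
    def delayed(*args, **kwargs):
        sleep(delay_s)  # the injected unit of delay
        return original(*args, **kwargs)

    setattr(obj, method_name, delayed)
    return original


def public_methods(obj):
    """List candidate injection sites: the public callable attributes of obj."""
    return [
        name
        for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name))
    ]
```

A driver would iterate over `public_methods(workload)`, call `inject_delay` for one method, run the benchmark, record the score, and restore the original method with `setattr` before moving on to the next site.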
Furthermore, once the significance of a particular method has been established, one can further probe the structure of the benchmark by varying the magnitude of the delay injected into that method and obtaining the resulting scores. The resulting sensitivity map can provide insight into the role the method plays in score determination, and simultaneously provide a guide to the opportunity for benchmark score improvement.
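Varying the delay magnitude for a single method can be sketched as a simple sweep. The example assumes, purely for illustration, that the score is an elapsed time in seconds; under that assumption the slope of score against injected delay estimates how many invocations of the method fall inside the scored region. `elapsed_time_score` and its hidden constants are hypothetical.

```python
def elapsed_time_score(delay_per_call):
    """Hypothetical opaque benchmark whose score is elapsed seconds (lower is better)."""
    timed_calls = 200     # hidden: invocations of the method inside the timed region
    method_cost = 0.002   # hidden: baseline seconds per invocation
    other_work = 1.5      # hidden: seconds spent elsewhere in the timed region
    return other_work + timed_calls * (method_cost + delay_per_call)


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def sweep(run, delays):
    """Run the benchmark once per delay magnitude; return (delay, score) pairs."""
    return [(d, run(d)) for d in delays]


if __name__ == "__main__":
    points = sweep(elapsed_time_score, [0.0, 0.001, 0.002, 0.004])
    slope = fit_slope([d for d, _ in points], [s for _, s in points])
    print(f"estimated scored invocations: {slope:.1f}")
```

A response that departs from a straight line would suggest that the score is not a plain elapsed time, which is itself information about the benchmark's internal scoring algorithm.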