When software is compiled, it is converted from a set of high-level, “human readable” statements into a set of low-level, “machine readable” (binary) instructions. The binary instructions can then be executed in a runtime environment to perform the function defined by the human readable statements. During execution, however, inefficiencies, performance problems, and other consequences of coding errors may surface. Accordingly, analyzing a computer program's behavior at runtime benefits both developers and end users, because it collects information relevant to diagnosing problems and to optimization. For instance, such analysis has proved valuable in a wide variety of areas such as computer architecture, compilers, debugging, program testing, and software maintenance.
In order to monitor runtime behavior, profilers have been developed. Profilers are tools that measure the behavior of a program as it runs, particularly the frequency and duration of function calls. The output is either a stream of recorded events (a trace) or a summary of the events observed (a profile). The performance or other data gathered can then be used to determine, for example, which source code would benefit most from improved or optimized coding. For instance, if a particular function is called within a program loop and the loop is a “hot spot” during execution, it may be desirable to write the body of the function in-line within the loop rather than invoke it through a function call.
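The distinction between a trace and a profile can be illustrated with a minimal sketch. It assumes Python's standard `sys.setprofile` hook as the measurement mechanism; the helper names `run_with_trace` and `summarize` and the toy workload are hypothetical and are not taken from any particular profiler. The recorded durations include the overhead of the hook itself, so they are only indicative.

```python
import sys
import time
from collections import Counter


def run_with_trace(func, *args):
    """Run func(*args) and return (result, trace).

    The trace is a list of (event, function_name, timestamp) tuples for
    Python-level "call" and "return" events only.
    """
    trace = []

    def hook(frame, event, arg):
        # Ignore C-level events (c_call, c_return, c_exception).
        if event in ("call", "return"):
            trace.append((event, frame.f_code.co_name, time.perf_counter()))

    sys.setprofile(hook)
    try:
        result = func(*args)
    finally:
        sys.setprofile(None)  # always remove the hook
    return result, trace


def summarize(trace):
    """Collapse a trace into a profile: call counts and inclusive time."""
    counts = Counter()
    inclusive_time = Counter()
    stack = []
    for event, name, timestamp in trace:
        if event == "call":
            counts[name] += 1
            stack.append((name, timestamp))
        else:
            started_name, started_at = stack.pop()
            inclusive_time[started_name] += timestamp - started_at
    return counts, inclusive_time


# Toy workload: a small function called from inside a loop.
def square(x):
    return x * x


def sum_of_squares(n):
    total = 0
    for i in range(n):
        total += square(i)
    return total
```

Calling `run_with_trace(sum_of_squares, 100)` and passing the resulting trace to `summarize` reports 100 calls to `square`, which marks it as the kind of loop-resident function the paragraph above describes as a candidate for in-lining.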
Typical profiling operations collect information such as: block profiles, which represent the frequency of execution of basic blocks of code; edge profiles, which represent the frequency of taken branches in and out of blocks of code; path profiles, which represent sequences of taken edges; and complete profiles, which record all instructions executed within the compiled code. Path profiles provide many advantages over basic block, edge, and even complete profiles. For example, path profiles capture much more control-flow information than basic block or edge profiles, yet they are much smaller than complete instruction traces. Further, several compiler optimizations perform better when trade-offs are driven by information gained from accurate path profiles. Program paths are also a more credible basis for measuring the coverage of a given test suite. In addition, abstractions of paths can help automatic test generation tools create more robust test cases, and program path histories often serve as a valuable debugging aid by revealing an instruction sequence that may have executed in the lead-up to interesting program points.
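The relationship between block, edge, and path profiles can be sketched by deriving all three from the same recorded executions. The example below is illustrative only: the block labels, the `runs` data, and the function names are hypothetical, and each inner list is assumed to be the sequence of basic blocks executed during one pass through a small control-flow graph.

```python
from collections import Counter

# Hypothetical recorded executions: each list is the sequence of basic
# blocks visited during one pass through the same control-flow graph.
runs = [
    ["A", "B", "D", "E", "G"],
    ["A", "C", "D", "F", "G"],
    ["A", "B", "D", "E", "G"],
]


def block_profile(runs):
    """How often each basic block executed."""
    return Counter(block for run in runs for block in run)


def edge_profile(runs):
    """How often each (source, target) branch was taken."""
    return Counter(edge for run in runs for edge in zip(run, run[1:]))


def path_profile(runs):
    """How often each complete sequence of taken edges occurred."""
    return Counter(tuple(run) for run in runs)


blocks = block_profile(runs)
edges = edge_profile(runs)
paths = path_profile(runs)
```

For this data the path profile holds two entries, while the raw record of every executed block holds fifteen, which reflects the size advantage of path profiles over complete instruction traces noted above.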
Unfortunately, the benefits of path profiles come at a significant cost: path profiling is expensive in time and/or computational resources. For example, the number of potential paths within a program or a procedure can be, and often is, arbitrarily large. To cope with this unbounded number of paths, path profilers use hash tables to identify paths, analyze paths, and store path profile information. Although hash tables provide a convenient and effective mechanism for storing and accessing path profile information, the overhead of using them is still high, as much as 50% of an average execution time. Accordingly, despite the known benefits of path profiling, its high overhead has limited its use in favor of basic block or edge profiles.
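A hash-table-based path profiler of the general kind described above might look like the following minimal sketch. It is not the implementation of any specific profiler: the class `HashPathProfiler`, its methods, and the instrumented routine `classify` are hypothetical, and a Python `dict` stands in for the hash table. The sketch shows where the cost arises, since every executed path requires building a key, hashing it, and updating a table entry.

```python
class HashPathProfiler:
    """Counts executed paths in a hash table keyed by the block sequence."""

    def __init__(self):
        self.counts = {}      # hash table: path (tuple of block ids) -> count
        self._current = []    # blocks seen so far on the path in progress

    def enter_block(self, block_id):
        self._current.append(block_id)

    def end_path(self):
        # Hashing the whole path on every completion is the recurring cost.
        key = tuple(self._current)
        self.counts[key] = self.counts.get(key, 0) + 1
        self._current.clear()


def classify(values, profiler):
    """Toy instrumented loop with two independent branches per iteration."""
    for value in values:
        profiler.enter_block("head")
        if value % 2 == 0:
            profiler.enter_block("even")
        else:
            profiler.enter_block("odd")
        if value > 10:
            profiler.enter_block("big")
        else:
            profiler.enter_block("small")
        profiler.end_path()


profiler = HashPathProfiler()
classify([2, 3, 12, 4], profiler)
```

Two sequential two-way branches give four potential paths per iteration, and each additional branch doubles that number. The table stores only the paths actually executed, three in this run, which is why a hash table is a natural fit when the space of potential paths is too large to enumerate in advance.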
As mentioned above, however, while basic block and edge profiles are less expensive to collect, they do not capture a program's dynamic behavior as accurately as path profiles do. In many cases, the most complex and most interesting paths cannot be predicted from the information in a block or edge profile.
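The reason paths cannot always be recovered from an edge profile can be shown with a small worked example. The sketch below assumes a hypothetical control-flow graph made of two consecutive two-way branches (A to B or C, joining at D, then D to E or F, joining at G); the block labels and execution counts are invented for illustration.

```python
from collections import Counter


def edge_profile(path_counts):
    """Derive edge frequencies from a mapping of path -> execution count."""
    edges = Counter()
    for path, times in path_counts.items():
        for edge in zip(path, path[1:]):
            edges[edge] += times
    return edges


# Two hypothetical executions of the same control-flow graph.
# In run_1 the first branch outcome always matches the second;
# in run_2 the outcomes are always opposite.
run_1 = Counter({
    ("A", "B", "D", "E", "G"): 50,
    ("A", "C", "D", "F", "G"): 50,
})
run_2 = Counter({
    ("A", "B", "D", "F", "G"): 50,
    ("A", "C", "D", "E", "G"): 50,
})

same_edges = edge_profile(run_1) == edge_profile(run_2)
same_paths = run_1 == run_2
```

Here `same_edges` is true while `same_paths` is false: every edge executes 50 times in both runs, so the edge profile alone cannot tell which of the two behaviors occurred.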