When software is compiled, it is converted from a high-level, “human-readable” set of statements to a low-level, “machine-readable” set of instructions. The control flow of the machine-readable instructions may closely resemble that of the human-readable statements, or may differ from it substantially. During compilation, software can be “optimized” to increase the speed at which the final machine-readable code executes.
Optimizing compilers can benefit from profiling information that is collected when the program runs. Typical profiling operations collect information such as: block profiles, which represent the frequency of execution of blocks of code; edge profiles, which represent the frequency of taken branches in and out of blocks of code; and path profiles, which represent sequences of taken edges. Path profiles differ from block and edge profiles in that path profiles represent the execution frequency of control flow paths traversed in the software program, whereas block and edge profiles represent the frequency of execution of smaller elements within the software program.
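The distinction between the three kinds of profile can be illustrated with a minimal sketch. It assumes a hypothetical acyclic function with blocks P through V (these names are illustrative and do not appear in the figures) and assumes that each run is recorded as the sequence of blocks it executed. Under those assumptions, two sets of runs can yield identical block and edge profiles while their path profiles differ:

```python
from collections import Counter

def profiles(runs):
    """Return (block, edge, path) profiles for a list of recorded runs.

    Each run is a sequence of block names in execution order. This trace
    format is an assumption made for illustration only.
    """
    blocks, edges, paths = Counter(), Counter(), Counter()
    for run in runs:
        blocks.update(run)                 # frequency of each block
        edges.update(zip(run, run[1:]))    # frequency of each taken edge
        paths[tuple(run)] += 1             # frequency of each whole path
    return blocks, edges, paths

# Hypothetical CFG: P -> (Q | R) -> S -> (T | U) -> V
runs_1 = [list("PQSTV"), list("PRSUV")]
runs_2 = [list("PQSUV"), list("PRSTV")]

blocks_1, edges_1, paths_1 = profiles(runs_1)
blocks_2, edges_2, paths_2 = profiles(runs_2)

print(blocks_1 == blocks_2)  # True: block profiles cannot tell the runs apart
print(edges_1 == edges_2)    # True: edge profiles cannot either
print(paths_1 == paths_2)    # False: only the path profiles differ
```

In this sketch the block and edge counters match because every block and edge is exercised equally often in both sets of runs; only the path counters record which branch outcomes occurred together.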
A path profiling method hereinafter referred to as the “BL method” is presented in: Thomas Ball & James Larus, “Efficient Path Profiling,” MICRO-29, December 1996. The BL method generates a path profile for each “plain path” in a software function. For the purposes of this description, a plain path is a path that starts at a function entry or a loop entry (if the loop is entered through a back edge), and ends at a function exit or a loop exit (if a back edge is taken at the exit node). FIG. 1A shows a control flow graph (CFG) 10, and FIG. 1B shows the paths within CFG 10 that are profiled when the BL method is used. The profiled paths include paths 20 from the function entry (node A) to the function exit (node I), paths 30 from the function entry (node A) to a loop exit (node E), paths 40 from a loop entry (node B) to a loop exit (node E), and paths 50 from a loop entry (node B) to the function exit (node I).
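The definition of a plain path can be illustrated by splitting a recorded execution trace at each taken back edge. The sketch below is not the BL method itself, which instruments the program rather than post-processing a trace; it only shows which paths would be counted. The back edge from node E to node B and the trace format are assumptions inferred from the description of CFG 10, since the figures are not reproduced here.

```python
from collections import Counter

# Assumption: CFG 10 has a single back edge, from loop exit E to loop entry B.
BACK_EDGES = {("E", "B")}

def plain_paths(trace, back_edges=BACK_EDGES):
    """Split one execution trace into plain paths.

    A path ends at the source of a taken back edge, and the next path
    starts at the target of that back edge.
    """
    if not trace:
        return []
    paths = []
    current = [trace[0]]
    for src, dst in zip(trace, trace[1:]):
        if (src, dst) in back_edges:
            paths.append("".join(current))  # path ends at the loop exit
            current = [dst]                 # next path starts at the loop entry
        else:
            current.append(dst)
    paths.append("".join(current))          # final path ends at the function exit
    return paths

# Hypothetical run: three loop iterations, then exit through F, H, and I.
trace = list("ABCEBCEBDEFHI")
profile = Counter(plain_paths(trace))
print(profile)
```

For this hypothetical run, the first path (ABCE) runs from the function entry to a loop exit, the second (BCE) from a loop entry to a loop exit, and the third (BDEFHI) from a loop entry to the function exit, corresponding to the kinds of paths labeled 30, 40, and 50 above.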
The term “region,” as used herein, refers to a sub-graph of a control flow graph. Any two regions are either nested or disjoint. Region-based compilers may treat the hammock made up of nodes B, C, D, and E as an inner region, shown as region 1 (R1) in FIG. 1A. Similarly, the hammock made up of nodes F, G, H, and I, shown as region 2 (R2) in FIG. 1A, may also be treated as a region by a region-based compiler. Region-based compilers may benefit from path profile information for the separate regions, in part because local optimizations can be made within regions. CFG 10 can be viewed as a hierarchical CFG in that regions 1 and 2 are inner regions, and all of CFG 10 is an outer region with the inner regions represented as single nodes in the graph.
“Hierarchical path profiles” are path profiles that include separate profiles for paths within inner regions, and for paths in outer regions. Paths in outer regions have inner regions represented as single nodes. For example, a hierarchical path profile of CFG 10 would include profiles for paths FGI and FHI, which are paths within inner region 2, and would also include profiles for AR2 and AR1R2, which are outer region paths having inner regions represented as single nodes. The BL method does not generate hierarchical path profiles. For example, the BL method does not generate separate path profiles for paths FGI, FHI, AR2, and AR1R2 that include the total execution counts for the specific paths.
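The shape of a hierarchical path profile can be sketched by post-processing recorded traces. The sketch below is an illustration only and is not the profiling method of this description. It assumes the region membership given above (R1 contains B, C, D, and E; R2 contains F, G, H, and I), assumes a back edge from E to B, and assumes that paths inside a region are split wherever that back edge is taken. The function names and the trace format are hypothetical.

```python
from collections import Counter

# Assumptions inferred from the description of CFG 10.
REGIONS = {"R1": set("BCDE"), "R2": set("FGHI")}
BACK_EDGES = {("E", "B")}

def region_of(node):
    """Return the name of the inner region containing node, or None."""
    for name, nodes in REGIONS.items():
        if node in nodes:
            return name
    return None

def hierarchical_profile(traces):
    """Return (inner, outer): per-region path counts and outer path counts."""
    inner = {name: Counter() for name in REGIONS}
    outer = Counter()
    for trace in traces:
        outer_path, segment = [], []
        segment_region, prev = None, None
        for node in trace:
            region = region_of(node)
            crosses = region != segment_region or (prev, node) in BACK_EDGES
            if segment and crosses:
                # Close the inner path collected so far.
                inner[segment_region]["".join(segment)] += 1
                segment = []
            if region is None:
                outer_path.append(node)        # node belongs to the outer region
            else:
                if region != segment_region:
                    outer_path.append(region)  # inner region shown as one node
                segment.append(node)
            segment_region, prev = region, node
        if segment:
            inner[segment_region]["".join(segment)] += 1
        outer["".join(outer_path)] += 1
    return inner, outer

# Three hypothetical runs of the function.
inner, outer = hierarchical_profile(["AFGI", "ABCEBDEFHI", "ABDEFGI"])
print(outer)        # outer region paths such as AR2 and AR1R2
print(inner["R2"])  # inner region 2 paths such as FGI and FHI
```

Keeping the inner and outer counters separate means that the total execution count of a path such as FGI is available directly, regardless of which outer path led into region 2.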
For the reasons stated above, and for other reasons stated below which will become apparent to those skilled in the art upon reading and understanding the present specification, there is a need in the art for a method and apparatus for profiling hierarchical software paths.