The present invention relates to a technique for obtaining execution frequency information on execution paths in a control flow graph.
A control flow graph is a graph that expresses the flow of control of a program by representing every path that may be passed when the program is executed. In this graph, a node represents a basic block (that is, a sequence having no branch and no confluence at any intermediate point), and a directed edge connecting one node to another denotes a transition from one basic block to another basic block.
A basic block is a sequence of processing (statements or instructions) in a program, having no branch and no confluence at any intermediate points. Statements in a basic block are executed straight from the first to the last. In general, an entry block and an exit block exist as an entrance and an exit, respectively, of the entire graph described above.
In a control flow graph, a state where a directed edge is drawn from a basic block X toward a basic block Y is expressed as X→Y. X in this expression is referred to as a predecessor basic block or a predecessor node, and Y as a successor basic block or a successor node. Also, a node following multiple predecessor nodes is referred to as a merge node, and a node followed by multiple successor nodes is referred to as a branch node.
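The terminology above can be illustrated with a minimal sketch. The function name `classify_nodes`, the edge-list encoding, and the diamond-shaped example graph are assumptions made for illustration only and do not appear in the figures.

```python
from collections import defaultdict

def classify_nodes(edges):
    """Return (merge_nodes, branch_nodes) for a list of directed edges (X, Y)."""
    preds = defaultdict(set)  # node -> set of predecessor nodes
    succs = defaultdict(set)  # node -> set of successor nodes
    for x, y in edges:
        succs[x].add(y)
        preds[y].add(x)
    # A merge node follows multiple predecessor nodes.
    merge_nodes = {n for n, p in preds.items() if len(p) > 1}
    # A branch node is followed by multiple successor nodes.
    branch_nodes = {n for n, s in succs.items() if len(s) > 1}
    return merge_nodes, branch_nodes

# Hypothetical graph: the entry block branches to a and b, which merge at exit.
edges = [("entry", "a"), ("entry", "b"), ("a", "exit"), ("b", "exit")]
merge_nodes, branch_nodes = classify_nodes(edges)
print("merge nodes:", merge_nodes)
print("branch nodes:", branch_nodes)
```

In this hypothetical graph, the entry block is the only branch node and the exit block is the only merge node.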
Control flow graphs are generally used in compiler optimization and static code analysis tools.
Profile information on a control flow graph, e.g., execution frequency information on directed edges and execution paths, enables improving the effect of compiler optimization. Profiling of execution frequency information on directed edges and execution paths in a control flow graph requires insertion of an instrumentation code on each directed edge for measurement of the execution frequency. For this instrumentation code, a path value is given to each edge. Giving such a path value to each edge eliminates the need for counting each edge individually every time it is passed: the path values assigned to the edges passed during execution of the program are summed up, and the resulting sum is collected as execution frequency information. However, insertion of the above-described instrumentation code on each of the above-described directed edges increases the overhead, resulting in degradation of runtime performance.
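As a minimal sketch of how summed path values can serve as execution frequency information, the following example accumulates a register r along each executed path and counts how often each sum occurs. The three-block graph, the table `PATH_VALUE`, and the recorded runs are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical path values per directed edge; edges not listed carry the value 0.
PATH_VALUE = {("v", "w"): 1}

def path_id(blocks):
    """Sum the path values of the edges passed by one execution."""
    r = 0
    for src, dst in zip(blocks, blocks[1:]):
        r += PATH_VALUE.get((src, dst), 0)
    return r

# Three hypothetical executions, each given as the sequence of blocks passed.
runs = [["v", "x", "w"], ["v", "w"], ["v", "w"]]
frequency = Counter(path_id(run) for run in runs)
print(dict(frequency))
```

Here the path v, x, w sums to 0 and the path v, w sums to 1, so the two counters in `frequency` distinguish the two execution paths without a separate counter on every edge.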
Non-patent Literatures 1 and 2 shown below describe techniques for profiling the frequencies of executions of execution paths.
Non-patent Literature 1 describes efficient path profiling.
Non-patent Literature 2 describes continuous path and edge profiling.
[Non-patent Literature 1] Thomas Ball et al., “Efficient Path Profiling”, International Symposium on Microarchitecture (MICRO'96), IEEE
[Non-patent Literature 2] Michael D. Bond et al., “Continuous Path and Edge Profiling”, International Symposium on Microarchitecture (MICRO'05), IEEE
FIGS. 7A to 7D are schematic diagrams each showing how a control flow graph before modification (701, 711, 721, and 731, respectively) is changed when a path value 1 is assigned to an edge between a preceding basic block (predecessor basic block) v and a subsequent basic block (successor basic block) w or w1 and an instrumentation code is inserted. Descriptions will be made of how the control flow graphs (701, 711, 721, and 731) respectively shown in FIGS. 7A to 7D are modified by insertion of an instrumentation code (i.e., an instruction to add a path value). The path value is an integer value uniquely representing an execution path passed through the control flow graph from a starting point to an end point.
Referring to FIG. 7A (Prior Art 1), the control flow graph (701) before modification has a predecessor basic block v and two successor basic blocks following the predecessor basic block v: a successor basic block x (which is a predecessor basic block precedent to another successor basic block w, and which is also referred to as predecessor basic block x) and a successor basic block w. The predecessor basic block v is connected to the successor basic block x by an edge v→x and to the successor basic block w by an edge v→w. The successor basic block x is connected to the successor basic block w by an edge x→w.
A computer assigns a path value 0 to the edge v→x between the predecessor basic block v and the successor basic block x in the control flow graph (701) before modification, and assigns a path value 1 to the edge v→w between the predecessor basic block v and the successor basic block w. The computer also assigns a path value 0 to the other edge. Accordingly, a control flow graph (702) has the edge v→x assigned the path value 0 and has the edge v→w assigned the path value 1.
Next, the computer performs an operation to insert the instrumentation code in the control flow graph (702).
Since the path value 0 is assigned to the edge v→x, the computer inserts no instrumentation code thereon. The computer also inserts no instrumentation code with respect to the other edge assigned the path value 0.
Since the path value 1 is assigned to the edge v→w, the computer inserts on the edge v→w a basic block (704) including an instruction to add 1 as a path value (r+=1). The inserted basic block (704) includes a jump instruction (jmp w) to make a jump to the successor basic block w as well as the instruction to add 1 as a path value (r+=1).
In a modified control flow graph (703) shown in FIG. 7A, the number of jump instructions is increased by 1 (jmp w) (704) as a result of insertion of the instrumentation code. A problem thus arises that the overhead is increased.
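The modification from the control flow graph (702) to the control flow graph (703) can be sketched as an edge-splitting step. The instruction encoding as (operation, operand) tuples, the conditional jump `jcc`, the fall-through layout of the blocks, and the name `v_w` for the inserted block are assumptions of this sketch; only the contents of the inserted block (r+=1 followed by jmp w) are taken from the description above.

```python
def split_edge(blocks, src, dst, value, new_name):
    """Insert a block holding `r += value; jmp dst` on the edge src -> dst."""
    result = {name: list(instrs) for name, instrs in blocks.items()}
    # Retarget the jump in src that went to dst so that it enters the new block.
    result[src] = [
        (op, new_name) if op in ("jcc", "jmp") and arg == dst else (op, arg)
        for op, arg in result[src]
    ]
    # The inserted block adds the path value and then jumps to the old target.
    result[new_name] = [("add_r", value), ("jmp", dst)]
    return result

def count_jumps(blocks):
    """Count unconditional jump instructions over all blocks."""
    return sum(1 for instrs in blocks.values() for op, _ in instrs if op == "jmp")

# Assumed layout: v branches conditionally to w, otherwise falls through to x,
# and x falls through to w.
before = {"v": [("jcc", "w")], "x": [], "w": []}
after = split_edge(before, "v", "w", 1, "v_w")
print(after["v_w"])
print(count_jumps(after) - count_jumps(before))
```

Under these assumptions, the only new unconditional jump is the jmp w in the inserted block, which corresponds to the increase by 1 described for the modified control flow graph (703).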
Referring to FIG. 7B (Prior Art 2), the control flow graph (711) before modification has a predecessor basic block v; two successor basic blocks following the predecessor basic block v, namely a successor basic block x (which is a predecessor basic block precedent to another successor basic block w, and which is also referred to as predecessor basic block x) and a successor basic block w; and a predecessor basic block y other than the above-mentioned predecessor basic block v. The predecessor basic block v is connected to the successor basic block x by an edge v→x and to the successor basic block w by an edge v→w. The predecessor basic block y is connected to the successor basic block x by an edge y→x. The successor basic block x is connected to the successor basic block w by an edge x→w.
A computer assigns a path value 0 to the edge v→x between the predecessor basic block v and the successor basic block x in the control flow graph (711) before modification, assigns a path value 1 to the edge v→w between the predecessor basic block v and the successor basic block w, and assigns a path value 2 to the edge y→x between the predecessor basic block y and the successor basic block x. The computer also assigns a path value 0 to the other edge. Accordingly, a control flow graph (712) has the edge v→x assigned the path value 0, has the edge v→w assigned the path value 1 and has the edge y→x assigned the path value 2.
Next, the computer performs an operation to insert the instrumentation code in the control flow graph (712).
Since the path value 0 is assigned to the edge v→x, the computer inserts no instrumentation code thereon. The computer also inserts no instrumentation code with respect to the other edge assigned the path value 0.
Since the path value 1 is assigned to the edge v→w, the computer inserts on the edge v→w a basic block (714) including an instruction to add 1 as a path value (r+=1). The inserted basic block (714) includes a jump instruction (jmp w) to make a jump to the successor basic block w as well as the instruction to add 1 as a path value (r+=1).
Similarly, since the path value 2 is assigned to the edge y→x, the computer inserts on the edge y→x a basic block (715) including an instruction to add 2 as a path value (r+=2). The inserted basic block (715) includes a jump instruction (jmp x) to make a jump to the successor basic block x as well as the instruction to add 2 as a path value (r+=2).
In a modified control flow graph (713) shown in FIG. 7B, the number of jump instructions is increased by 2 (jmp w and jmp x) (714 and 715, respectively) as a result of insertion of the instrumentation code. A problem thus arises that the overhead is increased.
Referring to FIG. 7C (Prior Art 3), the control flow graph (721) before modification has a predecessor basic block v and three successor basic blocks following the predecessor basic block v: a successor basic block x (which is a predecessor basic block precedent to other successor basic blocks w1 and w2, and which is also referred to as predecessor basic block x); successor basic blocks w1 and w2; and a predecessor basic block y other than the predecessor basic block v (which is a predecessor basic block precedent to the successor basic block w2). The predecessor basic block v is connected to the successor basic block x by an edge v→x, to the successor basic block w1 by an edge v→w1, and to the successor basic block w2 by an edge v→w2. The successor basic block x is connected to the successor basic block w1 by an edge x→w1. The successor basic block w1 is connected to the predecessor basic block y by an edge w1→y. The predecessor basic block y is connected to the successor basic block w2 by an edge y→w2.
A computer assigns a path value 0 to the edge v→x between the predecessor basic block v and the successor basic block x in the control flow graph (721) before modification, assigns a path value 1 to the edge v→w1 between the predecessor basic block v and the successor basic block w1, and assigns a path value 2 to the edge v→w2 between the predecessor basic block v and the successor basic block w2. The computer also assigns a path value 0 to the other edges. Accordingly, a control flow graph (722) has the edge v→x assigned the path value 0, has the edge v→w1 assigned the path value 1, and has the edge v→w2 assigned the path value 2.
Next, the computer performs an operation to insert the instrumentation code in the control flow graph (722).
Since the path value 0 is assigned to the edge v→x, the computer inserts no instrumentation code thereon. The computer also inserts no instrumentation code with respect to the other edges assigned the path value 0.
Since the path value 1 is assigned to the edge v→w1, the computer inserts on the edge v→w1 a basic block (724) including an instruction to add 1 as a path value (r+=1). The inserted basic block (724) includes a jump instruction (jmp w1) to make a jump to the successor basic block w1 as well as the instruction to add 1 as a path value (r+=1).
Similarly, since the path value 2 is assigned to the edge v→w2, the computer inserts on the edge v→w2 a basic block (725) including an instruction to add 2 as a path value (r+=2). The inserted basic block (725) includes a jump instruction (jmp w2) to make a jump to the successor basic block w2 as well as the instruction to add 2 as a path value (r+=2).
In a modified control flow graph (723) shown in FIG. 7C, the number of jump instructions is increased by 2 (jmp w1 and jmp w2) (724 and 725, respectively) as a result of insertion of the instrumentation code. A problem thus arises that the overhead is increased.
Referring to FIG. 7D (Prior Art 4), the control flow graph (731) before modification has a predecessor basic block v; three successor basic blocks following the predecessor basic block v, namely a successor basic block x (which is a predecessor basic block precedent to another successor basic block w1, and which is also referred to as predecessor basic block x) and successor basic blocks w1 and w2; a predecessor basic block z (which is a predecessor basic block precedent to the successor basic block w2); and a predecessor basic block y (which is a predecessor basic block precedent to the successor basic block x). The predecessor basic block v is connected to the successor basic block x by an edge v→x, to the successor basic block w1 by an edge v→w1, and to the successor basic block w2 by an edge v→w2. The predecessor basic block y is connected to the successor basic block x by an edge y→x. The successor basic block x is connected to the successor basic block w1 by an edge x→w1. The successor basic block w1 is connected to the predecessor basic block z by an edge w1→z. The predecessor basic block z is connected to the successor basic block w2 by an edge z→w2.
A computer assigns a path value 0 to the edge v→x between the predecessor basic block v and the successor basic block x in the control flow graph (731) before modification, assigns a path value 1 to the edge v→w1 between the predecessor basic block v and the successor basic block w1, assigns a path value 2 to the edge v→w2 between the predecessor basic block v and the successor basic block w2, and assigns a path value 3 to the edge y→x between the predecessor basic block y and the successor basic block x. The computer also assigns a path value 0 to the other edges. Accordingly, a control flow graph (732) has the edge v→x assigned the path value 0, has the edge v→w1 assigned the path value 1, has the edge v→w2 assigned the path value 2, and has the edge y→x assigned the path value 3.
Next, the computer performs an operation to insert the instrumentation code in the control flow graph (732).
Since the path value 0 is assigned to the edge v→x, the computer inserts no instrumentation code thereon. The computer also inserts no instrumentation code with respect to the other edges assigned the path value 0.
Since the path value 1 is assigned to the edge v→w1, the computer inserts on the edge v→w1 a basic block (734) including an instruction to add 1 as a path value (r+=1). The inserted basic block (734) includes a jump instruction (jmp w1) to make a jump to the successor basic block w1 as well as the instruction to add 1 as a path value (r+=1).
Similarly, since the path value 2 is assigned to the edge v→w2, the computer inserts on the edge v→w2 a basic block (735) including an instruction to add 2 as a path value (r+=2). The inserted basic block (735) includes a jump instruction (jmp w2) to make a jump to the successor basic block w2 as well as the instruction to add 2 as a path value (r+=2).
Similarly, since the path value 3 is assigned to the edge y→x, the computer inserts on the edge y→x a basic block (736) including an instruction to add 3 as a path value (r+=3). The inserted basic block (736) includes a jump instruction (jmp x) to make a jump to the successor basic block x as well as the instruction to add 3 as a path value (r+=3).
In a modified control flow graph (733) shown in FIG. 7D, the number of jump instructions is increased by 3 (jmp w1, jmp w2, and jmp x) (734, 735, and 736, respectively) as a result of insertion of the instrumentation code. A problem thus arises that the overhead is increased.
As described above, the method shown as prior art entails the problem that as a result of insertion of the instrumentation code for collecting profile information, the number of jump instructions is increased and the overhead is increased.
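The increase in jump instructions in the four examples can be tallied with a short sketch. The nonzero path values are those stated for FIGS. 7A to 7D above; the dictionary layout and the function name `added_jumps` are assumptions, and the sketch assumes, as in the examples, that every edge with a nonzero path value receives one inserted basic block ending in a jump instruction.

```python
# Nonzero path values per figure, as stated in the description of FIGS. 7A to 7D.
NONZERO_PATH_VALUES = {
    "FIG. 7A": {("v", "w"): 1},
    "FIG. 7B": {("v", "w"): 1, ("y", "x"): 2},
    "FIG. 7C": {("v", "w1"): 1, ("v", "w2"): 2},
    "FIG. 7D": {("v", "w1"): 1, ("v", "w2"): 2, ("y", "x"): 3},
}

def added_jumps(path_values):
    """One inserted block, and hence one added jmp, per edge with a nonzero value."""
    return sum(1 for value in path_values.values() if value != 0)

summary = {fig: added_jumps(values) for fig, values in NONZERO_PATH_VALUES.items()}
for fig, count in summary.items():
    print(fig, "adds", count, "jump instruction(s)")
```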
In the method described in Non-patent Literature 1, an instrumentation code is placed on directed edges to calculate, only by addition and subtraction, integer values (path values) representing execution paths passed from a start point to an end point of a control flow graph. In the method described in Non-patent Literature 1, the placement of the instrumentation code is optimized by obtaining a maximum-cost spanning tree of the control flow graph and inserting the instrumentation code on a directed edge not included in the maximum-cost spanning tree. In the method described in Non-patent Literature 1, however, the overhead is large because path values are recorded every time in a memory at the end point of the control flow graph.
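One way in which path values of this kind can be derived is sketched below. The sketch follows the general idea of numbering edges so that the sum along each path from the start point to the end point is a distinct integer; it covers only the numbering step, not the spanning-tree-based placement, and it is not a reproduction of the implementation in Non-patent Literature 1. The function name `assign_path_values`, the successor-list encoding, and the requirement that the caller supply the nodes in reverse topological order are assumptions of this sketch.

```python
def assign_path_values(succs, order):
    """Number edges so that each path to the exit sums to a distinct integer.

    `succs` maps a node to its ordered list of successors; `order` lists the
    nodes in reverse topological order (successors before predecessors).
    """
    num_paths = {}  # node -> number of distinct paths from node to the exit
    value = {}      # (src, dst) -> path value assigned to the edge
    for node in order:
        if not succs.get(node):
            num_paths[node] = 1  # the exit node ends exactly one path
            continue
        total = 0
        for nxt in succs[node]:
            # Paths through earlier successors already use the sums 0..total-1.
            value[(node, nxt)] = total
            total += num_paths[nxt]
        num_paths[node] = total
    return value, num_paths

# The graph of FIG. 7A: v -> x, v -> w, x -> w.
succs = {"v": ["x", "w"], "x": ["w"], "w": []}
value, num_paths = assign_path_values(succs, ["w", "x", "v"])
print(value)
print(num_paths["v"])
```

For the graph of FIG. 7A, this sketch gives the edge v→w the value 1 and the other edges the value 0, which agrees with the assignment shown in the control flow graph (702).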
In the method described in Non-patent Literature 2, when the instrumentation code is placed by using the method described in Non-patent Literature 1, the instrumentation code on edges with high execution frequencies is removed by using execution frequency information obtained by a different approach. However, since the method described in Non-patent Literature 2 is itself a method of obtaining the execution frequency, that execution frequency information may be unavailable or seriously low in accuracy. Also, in the method described in Non-patent Literature 2, the above-described path value is sampled at certain time intervals instead of being recorded in a memory every time. In the method described in Non-patent Literature 2, therefore, the overhead for recording to the memory is largely reduced, and as a result the overhead due to the instrumentation code becomes dominant. In particular, the overhead is considerable in the case where insertion of the instrumentation code is accompanied by insertion of jump instructions.
An object of the present invention is to reduce the above-described overhead by minimizing the number of necessary jump instructions at the time of insertion of an instrumentation code for collection of profile information (for example, for calculation of path values in each of the methods described in Non-patent Literatures 1 and 2).
Non-patent Literatures 1 and 2 are incorporated herein by reference.