The present invention relates to a binary program conversion device and method for converting a binary program into another binary program and, more particularly, to a technique in which an instruction string in an unconverted binary program is updated into another instruction string to increase the execution speed of execution of a converted program.
As an invention related to conversion of a binary program of this type, an invention filed by the present applicant and disclosed in Japanese Patent Application Laid-Open No. 10-187460 or the like is known. In this prior art, when a converted binary program is executed by a computer having a cache memory, a device and a method for reconstituting a plurality of instruction blocks of an unconverted binary program are proposed to increase a hit rate of the cache memory.
In this invention, an unconverted binary program constituted by a plurality of instruction blocks is temporarily executed first. On the basis of information obtained when the program is executed, an executed instruction block is separated from an unexecuted program, and a plurality of instruction blocks are reconstituted such that executed programs continue. In this manner, a part which is actually executed in the converted program is localized to achieve effective usage of the cache memory. As a result, the hit rate of the cache memory is increased.
On the other hand, as a technique for increasing the speed of a computer, in addition to the usage of the cache memory, the following pipeline process is known. That is, the execution of one instruction is finely divided into a plurality of stages, and the stages are executed like an assembly-line operation, or a series of instructions which are continuously executed like an assembly-line operation. In this pipeline process, it is very important for increasing the speed of a computer that the order of processes of the assembly-line operation is prevented from being disturbed. In such a pipeline process, the flow of instructions is interrupted by a conditional branch instruction, so that the assembly-line operation may be disturbed. In this case, it is a problem that excessive time is spent to compose the flow of the pipeline again after the conditional branch instruction is executed. In order to prevent the pipeline from being disturbed, a function called a branch prediction is built in a recent computer. A branched instruction which is predicted in advance is preferentially filled in the pipeline, so as to reduce the disturbance.
As the branch prediction, two types of branch predictions, i.e., static branch prediction and dynamic branch prediction are known. The dynamic branch prediction is performed on the basis of the history of past branch directions. More specifically, the execution efficiency of a pipeline is improved by a prediction function included in a computer for executing an instruction depending on the instruction execution state.
On the other hand, in the static branch prediction, the direction of branch to be predicted (upper direction or lower direction of address) is predetermined in an execution program. The static branch prediction is used for predicting a branch destination in an initial state in which a history of branches has not been formed or a state in which a branch prediction buffer (memory) for storing a history of branches is overflowed not to use the branch history of the instruction.
As the static branch prediction, various static predictions performed by the CPU of a computer are known. For example, a branch prediction bit is included in a branch instruction, and the branch prediction bit can be set when a program is compiled, or the characteristics of the branch prediction are implicitly determined in advance. When the characteristics of the branch prediction are implicitly determined, if a binary program is not constituted such that the characteristics are utilized, not only the characteristics are not effectively used, but also execution performance may be degraded.
In the conventional technique described above, the processing rate of a computer is increased by increasing a hit rate of a cache memory. However, it is not considered that the hit rate is increased in consideration of the characteristics of the static branch prediction. Therefore, disturbance of a pipeline is not always effectively reduced by the static branch prediction.
The present invention has been made in consideration of the above problems of the conventional technique. More specifically, it is an object of the present invention to recognize the branch prediction characteristics of a computer for executing a program, to constitute a binary program corresponding to the characteristics to increase a hit rate of the prediction, to effectively prevent disturbance of a pipeline, and to increase the processing rate.
In order to solve the problem, the present invention employs the following means. In short, the present invention executes an unconverted program in advance, analyzes the execution characteristics of a branch instruction from the execution information, recognizes the characteristics of a static branch prediction of a computer, and constitutes a binary program such that the direction of the branch prediction coincides with the direction of an actual branch. In this manner, the present invention increases the hit rate of a branch prediction in execution of an instruction.
More specifically, the present invention is a program conversion device for converting a first binary program constituted by a plurality of instruction blocks into a second binary program executed by a computer having branch prediction means, including:
execution information storage storing execution information collected when the first binary program is executed in advance;
analyzer analyzing execution characteristics of a branch instruction between the plurality of instruction blocks in the first branch program from the execution information;
branch prediction characteristics storage storing branch prediction characteristics of the computer; and
converter updating a branch instruction between the plurality of instruction blocks in the first binary program on the basis of the execution characteristics of the branch instruction and the branch prediction characteristics such that a hit rate of the branch prediction is increased.
As the converter, the following process is preferably used. That is, when the computer has branch prediction characteristics for performing a rather low prediction of the execution probability of a branch with respect to a branch instruction to an address in an upper direction, a plurality of instruction blocks are reconstituted such that a branch instruction having a low frequency in the occurrence of a branch becomes a branch instruction to an address in the upper direction and that a branch instruction having a high frequency in the occurrence of a branch becomes a branch instruction to an address in a lower direction.
In this manner, a prediction that a branch is not performed to a branch instruction having a low frequency in the occurrence of an actual branch is hit. Then a pipeline is prevented from being disturbed, and the execution speed of a computer can be increased.
The converter may reconstitute, with respect to a branch instruction having a high frequency in the occurrence of a branch, a plurality of instruction blocks such that a portion corresponding to an instruction block before branch and a portion corresponding to an instruction block after branch continue.
The reconstitution of the plurality of instruction blocks of the binary program before branch performed such that the instruction block before branch and the instruction block after branch continue is to arrange a program as following.
More specifically, an unconverted binary program is arranged such that a plurality of instruction blocks which are connected by a branch instruction having a high frequency in the occurrence of a branch are linearly executed without any branch. The program is also arranged such that an instruction block which is not frequently executed is executed as a branch destination of a branch instruction.
In this manner, when instruction blocks which are probably continuously executed are continuously arranged, a pipeline is prevented from being disturbed, and the execution speed of the computer is increased.
In addition, as the converter, the following process is also preferably used. That is, when the computer has branch prediction characteristics for performing a rather low prediction of the execution probability of a branch with respect to a branch instruction to an address in a lower direction, a plurality of instruction blocks are reconstituted such that a branch instruction having a low frequency in the occurrence of a branch becomes a branch instruction to an address in the lower direction and that a branch instruction having a high frequency in the occurrence of a branch becomes a branch instruction to an address in an upper direction.
In this manner, a prediction that a branch is not performed to a branch instruction having a low frequency in the occurrence of an actual branch is hit. Then a pipeline is prevented from being disturbed, and the execution speed of a computer can be increased.
Furthermore, the binary program conversion device may reconstitute, with respect to a branch instruction having a high frequency in the occurrence of a branch, a plurality of instruction blocks such that a portion corresponding to an instruction block before branch and a portion corresponding to an instruction block after branch continue.
In this manner, when instruction blocks which are probably continuously executed are continuously arranged, a pipeline is prevented from being disturbed, and the execution speed of the computer is increased.
The present invention is also a method of converting a first binary program constituted by a plurality of instruction blocks into a second binary program executed by a computer having branch prediction means, including:
reading execution information collected when the first binary program is executed in advance;
analyzing execution characteristics of a branch instruction between the plurality of instruction blocks of the first program from the execution information;
referring to branch prediction characteristics of the computer; and
updating branch instructions between the plurality of instruction blocks in the first binary program on the basis of the execution characteristics of the branch instruction and the branch prediction characteristics such that a hit rate of the branch prediction is increased and outputting the second binary program.
The present invention is also a program for causing a computer to convert a first binary program constituted by a plurality of instruction blocks into a second binary block executed by a computer having branch prediction means, wherein a program which sequentially performs:
the step of reading execution information collected when the first binary program is executed in advance;
the step of analyzing execution characteristics of a branch instruction between the plurality of instruction blocks of the first program from the execution information;
the step of referring to branch prediction characteristics of the computer; and
the step of updating a branch instruction between the plurality of instruction blocks in the first binary program on the basis of the execution characteristics of the branch instruction and the branch prediction characteristics such that a hit rate of the branch prediction is increased and outputting the second binary program. The program may be recorded on a computer readable recording medium, and the computer readable recording medium may be distributed.
As described above, according to the present invention, the plurality of instruction blocks of the binary program are reconstituted such that the hit rate of a branch prediction is increased. For this reason, a pipeline is prevented from be disturbed, and the performance of a computer having a pipeline process function can be sufficiently achieved.
In addition, according to the present invention, when the computer has characteristics for performing a rather low prediction of the execution probability of a branch with respect to a branch instruction to an address in an upper direction, instruction blocks are reconstituted such that a branch instruction having a low execution frequency in the occurrence of a branch becomes a branch instruction to an address in the upper direction and that a branch instruction having a high execution frequency in the occurrence of a branch becomes a branch instruction to an address in a lower direction, and a pipeline is reduced in disturbance.
Furthermore, according to the present invention, instruction blocks are reconstituted such that a plurality of instruction blocks before and after a branch caused by a branch instruction having a high execution frequency in the occurrence of a branch continue. For this reason, an increase in a hit rate of a cache memory and an increase in a hit rate of a branch prediction can be achieved.
When the computer has characteristics for performing a rather low prediction of the execution probability of a branch with respect to a branch instruction to an address in a lower direction, a plurality of instruction blocks are reconstituted such that a branch instruction having a low execution Frequency in the occurrence of a branch becomes a branch instruction to an address in the lower direction and that a branch instruction having a high execution frequency in the occurrence of a branch becomes a branch instruction to an address in an upper direction. As a result, a pipeline is reduced in disturbance.