1. Field of the Invention
The present invention relates to a binary program conversion apparatus and method for converting an original binary program into a new binary program that runs faster in a computer. Moreover, the present invention relates to a program recording medium with a program for causing a computer to function as such a binary program conversion apparatus.
2. Description of the Related Art
To improve the performance of computer systems, many efforts have been made over the years to speed up the main memory. However, the development of cache memory has enabled an improvement in the performance of computer systems without speeding up the main memory.
The cache memory is a small, fast memory located between the main memory and the CPU. In the computer system having the cache memory, when a word (an instruction or data) is requested by the CPU, it is examined whether the requested word exists in the cache memory. If the requested word exists in the cache memory, the word in the cache memory is forwarded to the CPU. On the other hand, if the requested word does not exist in the cache memory, a fixed-size block of data, called a block or line, containing the requested word is retrieved from the main memory and placed into the cache memory. Then, the requested word contained in the data (line) in the cache memory is forwarded to the CPU.
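By way of illustration only, the lookup described above may be sketched as follows. The sketch assumes a direct-mapped cache; the line size, the number of lines, and the names `DirectMappedCache`, `LINE_SIZE` and `NUM_LINES` are hypothetical and are not taken from this description.

```python
# Minimal direct-mapped cache model (illustrative; all sizes are assumptions).
LINE_SIZE = 16      # bytes per line (hypothetical)
NUM_LINES = 4       # number of lines in the cache (hypothetical)


class DirectMappedCache:
    def __init__(self, memory):
        self.memory = memory                 # bytes-like object standing in for main memory
        self.tags = [None] * NUM_LINES       # block number currently held by each line
        self.lines = [None] * NUM_LINES      # cached copy of each block
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block = address // LINE_SIZE         # memory block containing the requested word
        index = block % NUM_LINES            # cache line that this block maps to
        if self.tags[index] == block:
            self.hits += 1                   # the requested word is already in the cache
        else:
            self.misses += 1                 # retrieve the whole line from main memory
            start = block * LINE_SIZE
            self.lines[index] = self.memory[start:start + LINE_SIZE]
            self.tags[index] = block
        return self.lines[index][address % LINE_SIZE]


memory = bytes(range(256))
cache = DirectMappedCache(memory)
values = [cache.read(a) for a in (0, 1, 2, 64, 0)]
```

In this sketch, addresses 0 and 64 map to the same line, so the final read of address 0 misses even though the word was cached earlier. Misses of this kind are what a better arrangement of instruction blocks can avoid.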
The cache memory is effective because programs tend to reuse instructions and data which they have used recently, and also tend to use, within a short time, words whose addresses are near one another. (The former property is called temporal locality of reference and the latter is called spatial locality of reference.) In other words, the words in the main memory which the CPU will request in the near future in executing a program can be anticipated to some degree.
Since the cache memory stores such words, the CPU in a computer with a cache memory can obtain the majority of the words necessary to execute a program at a speed corresponding not to the access speed of the main memory but to the access speed of the cache memory. As a result, a computer with a cache memory operates at nearly the same speed as a system with a high-speed main memory and without a cache memory.
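The effect of the hit ratio on access speed may be illustrated with a simple weighted-average model. The model and the timing values below are assumptions chosen only for illustration; they are not measurements and do not appear in this description.

```python
def effective_access_time(hit_ratio, cache_time, memory_time):
    """Average time per access, assuming a hit costs cache_time and a miss costs memory_time."""
    return hit_ratio * cache_time + (1.0 - hit_ratio) * memory_time


# Hypothetical timings: 10 time units for the cache, 100 for the main memory.
t_low = effective_access_time(0.50, 10.0, 100.0)    # half of the accesses hit
t_high = effective_access_time(0.95, 10.0, 100.0)   # most of the accesses hit
```

Under these assumed timings, raising the hit ratio from 0.50 to 0.95 brings the average access time from about 55 units down to about 14.5 units, that is, close to the speed of the cache memory itself.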
Moreover, since a computer with a cache memory needs only a small-capacity high-speed memory, the use of the cache memory decreases the cost of constructing a computer with a given performance. Consequently, most recent computers are provided with cache memories.
In addition, the recent popularization of personal computers has decreased the price of memory devices. As a result, modifying a program (software) so as to increase the hit ratio of the main memory is now of little use for improving the performance of a computer. This is because a computer in which many inexpensive memory devices are installed to realize a large-capacity main memory already has a main-memory hit ratio so high that it does not influence the operating speed.
Therefore, for improving the performance of a computer by modifying a program, it is more effective to modify the program so that the hit ratio of the cache memory is increased. That is, it has become more important to improve the locality of reference of programs so as to draw out the maximum performance of the cache memory.
So far, to improve the hit ratio of the cache memory in a certain computer, a source program has been recompiled by a compiler developed for that computer. That is, to use this recompilation technology, each program user has to keep not only the binary programs, which are the programs used in practice, but also their source programs.
Moreover, since computer hardware advances very quickly and it takes time to develop a new compiler for a new computer architecture, more advanced hardware may be developed before the development of the compiler is completed. That is, there is a problem that the conventional recompilation technology cannot keep up with the advancement of hardware.
For these reasons, a program conversion technology which can convert a program into a new program suitable for a target computer without using its source program is desired.
An object of the present invention is to provide a binary program conversion apparatus and a binary program conversion method which can convert an original binary program into another binary program that runs at high speed in a computer without using a source program of the original binary program.
Another object of the present invention is to provide a program recording medium with a program for causing a computer to function as such a binary program conversion apparatus.
A binary program conversion apparatus according to a first aspect of the invention is used for converting a first binary program which consists of a plurality of first instruction blocks into a second binary program which is executed in a computer having a cache memory.
The binary program conversion apparatus comprises an executing part, a generating part and a producing part. The executing part executes the first binary program.
The generating part generates executed blocks information indicating first instruction blocks which are executed by the executing part.
The producing part produces, based on the executed blocks information generated by the generating part, the second binary program which contains second instruction blocks corresponding to the plurality of first instruction blocks and which causes, when being executed in the computer, the computer to store the second instruction blocks corresponding to the first instruction blocks executed by the executing part at different locations of the cache memory.
Thus, the binary program conversion apparatus of the first aspect converts a binary program (first binary program) into a new binary program (second binary program) which runs fast in the computer with the cache memory by carrying out processes that involve rearranging binary codes (executed blocks). Therefore, according to this binary program conversion apparatus, binary programs suitable for the computer having the cache memory can be obtained without using (managing) their source programs.
The binary program conversion apparatus according to the first aspect may be realized by using a producing part which produces the second binary program including a part in which the second instruction blocks corresponding to the first instruction blocks executed by the executing part are arranged successively.
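As a minimal sketch of the rearrangement performed by the producing part, the following assumes a direct-mapped cache and a set of executed block names already obtained from a profiling run. The block names, sizes, cache geometry, and the helper names `lay_out`, `relocate` and `conflicting_pairs` are hypothetical and serve only to illustrate the idea.

```python
from itertools import combinations

LINE_SIZE = 16   # bytes per cache line (assumed)
NUM_LINES = 8    # lines in the cache (assumed)


def lines_of(address, size):
    """Cache line indices touched by a block of `size` bytes placed at `address`."""
    first = address // LINE_SIZE
    last = (address + size - 1) // LINE_SIZE
    return {block % NUM_LINES for block in range(first, last + 1)}


def lay_out(order, sizes):
    """Assign consecutive addresses to the blocks in the given order."""
    layout, address = {}, 0
    for name in order:
        layout[name] = address
        address += sizes[name]
    return layout


def relocate(order, sizes, executed):
    """Place the executed blocks successively at the front, then the remaining blocks."""
    hot = [n for n in order if n in executed]
    cold = [n for n in order if n not in executed]
    return lay_out(hot + cold, sizes)


def conflicting_pairs(layout, sizes, executed):
    """Count pairs of executed blocks that share at least one cache line."""
    hot = [n for n in layout if n in executed]
    return sum(
        1
        for a, b in combinations(hot, 2)
        if lines_of(layout[a], sizes[a]) & lines_of(layout[b], sizes[b])
    )


sizes = {"A": 16, "B": 112, "C": 16, "D": 112, "E": 16}
order = ["A", "B", "C", "D", "E"]
executed = {"A", "C", "E"}        # executed blocks information from a profiling run

before = lay_out(order, sizes)
after = relocate(order, sizes, executed)
```

In the assumed original layout, blocks A, C and E all map to the same cache line and evict one another. After relocation they occupy three adjacent lines and no longer conflict.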
The binary program conversion apparatus according to the first aspect of the invention may further include a creating part and a controlling part. The creating part creates line data indicating lines of the cache memory which are to be used when the second binary program produced by the producing part is executed in the computer.
The controlling part first controls the executing part so as to execute a third binary program which consists of a plurality of third instruction blocks. Then, the controlling part controls the generating part so as to generate second executed blocks information indicating third instruction blocks executed by the executing part. Moreover, the controlling part controls the producing part so as to produce a fourth binary program which causes, when being executed in the computer, the computer to store fourth instruction blocks corresponding to the third instruction blocks executed by the executing part at different locations on lines of the cache memory excluding the lines indicated by the line data.
According to the binary program conversion apparatus thus constructed, two binary programs (first and third binary programs) which are executed at the same time can be converted into two new binary programs (second and fourth binary programs) which run at high speed in the computer.
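The use of the line data may be sketched as follows. The sketch assumes a direct-mapped cache, executed blocks that each occupy exactly one cache line, and a line-aligned base address. The helper names `line_data` and `place_avoiding`, the block names, and the addresses are hypothetical.

```python
LINE_SIZE = 16   # bytes per cache line (assumed)
NUM_LINES = 8    # lines in the cache (assumed)


def line_data(layout, sizes, executed):
    """Cache lines used by the executed blocks of an already converted program."""
    used = set()
    for name in executed:
        first = layout[name] // LINE_SIZE
        last = (layout[name] + sizes[name] - 1) // LINE_SIZE
        used.update(block % NUM_LINES for block in range(first, last + 1))
    return used


def place_avoiding(hot_blocks, used_lines, base=0):
    """Give each one-line block an address whose cache line is not yet claimed."""
    claimed = set(used_lines)
    layout = {}
    slot = base // LINE_SIZE              # base is assumed to be line-aligned
    for name in hot_blocks:
        if len(claimed) >= NUM_LINES:
            raise ValueError("no free cache line left")
        while slot % NUM_LINES in claimed:
            slot += 1                     # skip slots that map to a claimed line
        layout[name] = slot * LINE_SIZE
        claimed.add(slot % NUM_LINES)
        slot += 1
    return layout


# Layout of the executed blocks of the second binary program (hypothetical).
second_layout = {"A": 0, "C": 16, "E": 32}
second_sizes = {"A": 16, "C": 16, "E": 16}
used = line_data(second_layout, second_sizes, {"A", "C", "E"})

# Executed blocks of the fourth binary program, placed from address 256 onward.
fourth_layout = place_avoiding(["P", "Q"], used, base=256)
```

In this sketch the skipped address slots are simply left empty. An actual conversion could fill them with instruction blocks that were not executed, which is an assumption beyond what is stated here.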
The binary program conversion apparatus may further include a recognizing part which recognizes the frequency of use for data access of every line of the cache memory by monitoring running states of the first binary program executed by the executing part, and a selecting part which selects, based on a recognition result of the recognizing part, lines from all lines of the cache memory to use for storing the second instruction blocks corresponding to the first instruction blocks executed by the executing part. In this case, a producing part is adopted which produces the second binary program which causes, when being executed in the computer, the computer to store the second instruction blocks corresponding to the first instruction blocks executed by the executing part at different locations on the lines selected by the selecting part.
According to the binary program conversion apparatus thus constructed, a new binary program (second binary program) which causes few or no conflict misses between instruction accesses and data accesses can be obtained.
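The recognizing and selecting parts may be sketched as follows, assuming a direct-mapped cache and a recorded trace of data addresses. The trace, the cache geometry, and the helper names `data_line_usage` and `select_instruction_lines` are hypothetical, and choosing the least-used lines is one possible selection rule rather than the only one.

```python
from collections import Counter

LINE_SIZE = 16   # bytes per cache line (assumed)
NUM_LINES = 8    # lines in the cache (assumed)


def data_line_usage(data_addresses):
    """Count data accesses per cache line from a trace of data addresses."""
    return Counter((a // LINE_SIZE) % NUM_LINES for a in data_addresses)


def select_instruction_lines(usage, count):
    """Pick the `count` lines least used for data access (ties broken by line index)."""
    ranked = sorted(range(NUM_LINES), key=lambda line: (usage[line], line))
    return sorted(ranked[:count])


trace = [0, 4, 8, 16, 20, 128, 132, 48]      # hypothetical data addresses
usage = data_line_usage(trace)
selected = select_instruction_lines(usage, 4)
```

Here lines 0, 1 and 3 are used for data access, so the four lines selected for instruction blocks are taken from the lines that the data accesses never touch.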
Furthermore, when a producing part is employed which produces the second binary program including a part in which the second instruction blocks corresponding to the first instruction blocks executed by the executing part are arranged successively, the binary program conversion apparatus may be constructed by adding a searching part and a changing part.
The searching part searches the second binary program produced by the producing part for a second instruction block which ends with a conditional branch instruction whose branch destination is set to the next second instruction block. The changing part changes the branch condition and the branch destination of the conditional branch instruction so that a transition from the second instruction block found by the searching part to the next second instruction block takes place without branching.
According to the binary program conversion apparatus thus constructed, a new binary program (second binary program) which runs at higher speed can be obtained.
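The operation of the searching part and the changing part may be sketched as follows. Each block is modelled as a record holding its terminating branch condition, the destination when the branch is taken, and the destination when it is not taken. This representation, the condition names, and the function name `straighten_branches` are assumptions made for illustration.

```python
# Each condition and its logical inverse (hypothetical condition names).
INVERSE = {"eq": "ne", "ne": "eq", "lt": "ge", "ge": "lt", "gt": "le", "le": "gt"}


def straighten_branches(blocks):
    """Invert branches that target the next block so that control falls through.

    `blocks` is a list of dicts in layout order with the keys
    name, cond, taken and not_taken.
    """
    result = []
    for i, block in enumerate(blocks):
        new = dict(block)
        next_name = blocks[i + 1]["name"] if i + 1 < len(blocks) else None
        if (
            block["cond"] is not None
            and next_name is not None
            and block["taken"] == next_name
        ):
            # Invert the condition and swap the destinations: the next block
            # is now reached by falling through, without a taken branch.
            new["cond"] = INVERSE[block["cond"]]
            new["taken"], new["not_taken"] = block["not_taken"], block["taken"]
        result.append(new)
    return result


blocks = [
    {"name": "B1", "cond": "eq", "taken": "B2", "not_taken": "B9"},
    {"name": "B2", "cond": "lt", "taken": "B7", "not_taken": "B3"},
    {"name": "B3", "cond": None, "taken": None, "not_taken": None},
]
changed = straighten_branches(blocks)
```

Block B1 originally branches to B2 when the condition "eq" holds. After the change it branches to B9 when "ne" holds and otherwise falls through to B2, so the program behaves the same while the frequent transition needs no taken branch.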
A binary program conversion apparatus according to a second aspect of the invention is used for converting a first binary program into a second binary program which is executed in a computer.
The binary program conversion apparatus according to the second aspect comprises a searching part and a producing part. The searching part searches the first binary program for first instruction strings each of which consists of at least one predetermined instruction code. The producing part produces the second binary program by replacing the respective first instruction strings found by the searching part with second instruction strings assigned to the first instruction strings.
Thus, the binary program conversion apparatus of the second aspect converts an original binary program (first binary program) into a new binary program (second binary program) which runs fast in the target computer by replacing instruction strings in the original binary program. Therefore, according to this binary program conversion apparatus, binary programs suitable for the target computer can be obtained without using (managing) their source programs.
Note that the binary program conversion apparatus of the present invention may be realized by running, on a conventional computer, a corresponding program recorded on a program recording medium.
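The search-and-replace conversion of the second aspect may be sketched as follows. Instructions are represented as text mnemonics rather than binary codes, and the mnemonics, the replacement table, and the function name `replace_instruction_strings` are invented for illustration. Preferring the longest matching string at each position is likewise an assumption.

```python
def replace_instruction_strings(program, table):
    """Replace each occurrence of a first instruction string with its assigned second string.

    `program` is a list of instructions; `table` maps a tuple of instructions
    (first instruction string) to a tuple of instructions (second instruction string).
    """
    # Try longer strings first; ignore empty patterns, which could never advance.
    patterns = sorted((p for p in table if p), key=len, reverse=True)
    out, i = [], 0
    while i < len(program):
        for pattern in patterns:
            if tuple(program[i:i + len(pattern)]) == pattern:
                out.extend(table[pattern])
                i += len(pattern)
                break
        else:
            out.append(program[i])       # no pattern starts here; copy unchanged
            i += 1
    return out


# Hypothetical replacement table for a hypothetical target computer.
table = {
    ("mul r1, r1, 2",): ("shl r1, r1, 1",),
    ("load r2, [r3]", "load r4, [r3+4]"): ("load_pair r2, r4, [r3]",),
}
program = ["load r2, [r3]", "load r4, [r3+4]", "mul r1, r1, 2", "add r1, r1, r2"]
converted = replace_instruction_strings(program, table)
```

When a replacement changes the length of the code, branch destinations in an actual binary program would also have to be adjusted. That adjustment is outside the scope of this sketch.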
A binary program conversion method according to a first aspect of the invention is used for converting a first binary program which consists of a plurality of first instruction blocks into a second binary program which is executed in a computer having a cache memory.
The binary program conversion method according to the first aspect comprises an executing step, a generating step and a producing step. In the executing step, the first binary program is executed. In the generating step, executed blocks information indicating first instruction blocks executed in the executing step is generated. In the producing step, based on the executed blocks information generated in the generating step, the second binary program is produced which contains second instruction blocks corresponding to the plurality of first instruction blocks of the first binary program and which causes, when being executed in the computer, the computer to store the second instruction blocks corresponding to the first instruction blocks executed in the executing step at different locations of the cache memory.
The binary program conversion method according to the first aspect may adopt a producing step which involves producing the second binary program including a part in which the second instruction blocks corresponding to the first instruction blocks executed in the executing step are arranged successively.
It is feasible to further add, to the binary program conversion method, a creating step and a controlling step. The creating step involves a process of creating line data indicating lines of the cache memory which are to be used when the second binary program produced in the producing step is executed in the computer. The controlling step involves processes of controlling the executing step so as to execute a third binary program which consists of a plurality of third instruction blocks, of controlling the generating step so as to generate second executed blocks information indicating third instruction blocks executed in the executing step, and of controlling the producing step so as to produce, based on the second executed blocks information and the line data, a fourth binary program which contains fourth instruction blocks corresponding to the plurality of third instruction blocks and which causes, when being executed in the computer, the computer to store the fourth instruction blocks corresponding to the third instruction blocks executed in the executing step at different locations on lines of the cache memory excluding the lines indicated by the line data.
It is possible to further add, to the binary program conversion method according to the first aspect of the invention, a recognizing step and a selecting step.
The recognizing step involves a process of recognizing the frequency of use for data access of every line of the cache memory by monitoring running states of the first binary program executed in the executing step. The selecting step involves a process of selecting, based on a recognition result of the recognizing step, lines from all lines of the cache memory to use for storing second instruction blocks. In this case, a producing step is employed which involves producing the second binary program which causes, when being executed in the computer, the computer to store the second instruction blocks corresponding to the first instruction blocks executed in the executing step at different locations on the lines selected in the selecting step.
It is desirable to add a searching step and a changing step to the binary program conversion method which adopts the producing step involving a process of producing the second binary program including a part in which the second instruction blocks corresponding to the first instruction blocks executed in the executing step are arranged successively.
The searching step involves a process of searching for a second instruction block which ends with a conditional branch instruction whose branch destination is set to the next second instruction block. The changing step involves a process of changing the branch condition and the branch destination of the conditional branch instruction so that a transition from the second instruction block found in the searching step to the next second instruction block takes place without branching.
A binary program conversion method according to a second aspect of the invention is used for converting a first binary program into a second binary program which is executed in a computer. The binary program conversion method comprises a searching step and a producing step.
The searching step involves a process of searching the first binary program for first instruction strings each of which consists of at least one predetermined instruction code. The producing step involves a process of producing the second binary program by replacing the respective first instruction strings found in the searching step with second instruction strings assigned to the first instruction strings.