The present invention relates to a translation system for translating a source program into a machine language program by using an electronic computer and more particularly to a translation system of the type mentioned above in which an object program common to a plurality of target computers or machines of different types can profitably be employed.
Systems for allowing programs described in high-level languages to be executed on the target computers or machines are generally classified into two systems respectively referred to as a compiler system and an interpreter system.
In the compiler system, a program described in a high-level language is translated into a machine language program oriented to a target computer, and the machine language program is executed straightforwardly by the target machine.
On the other hand, in the case of the interpreter system, a language (referred to as the intermediate language) which differs from the machine language of the target computer is prepared along with a program (referred to as the interpreter) which is adapted to interpret and execute the intermediate language program on the target computer. In other words, the high-level language program is first translated into the intermediate language program, which is then executed by the target computer or machine on which the interpreter program runs.
One of the advantages of the compiler system over the interpreter system is its high speed of program execution, which can be explained by the facts mentioned below.
(1) In the interpreter system, in addition to the execution of a machine language program corresponding to an intermediate language program, there are required allocation of the processings for the intermediate language codes as well as address calculation for operands and others. On the other hand, in the compiler system, such processing allocation and address calculation are rendered unnecessary because the machine language program can be executed directly in a straightforward manner.
(2) In the compiler system, sparing or deletion of some of the processings is possible by taking into consideration the context of the program and the characteristics of the target computer (i.e. program optimalization can be realized). In contrast, the interpreter can only execute the intermediate language program as it is, because the interpreter must be applicable to any intermediate language program, and thus the interpreter is not in a position to allow any processing to be spared or omitted in consideration of the program context. Besides, since the characteristics of the target computer or machine are not reflected in the intermediate language program, it is impossible to speed up the processing by resorting to, for example, mapping of specific variables described in a high-level language to the registers incorporated in the target machine.
On the other hand, as to the usage of a program destined to be executed repetitively, there has heretofore been adopted either one of the two methods mentioned below.
(1) According to a first method, the compiler system is adopted, wherein the machine language program obtained through the translation is preserved or stored so as to be repetitively executed in a straightforward manner.
(2) According to the other method, the interpreter system is adopted, wherein the intermediate language program is stored for allowing repetitive executions thereof by the interpreter.
When one program is to be executed repeatedly, the compiler system is adopted in an overwhelming majority of cases from the viewpoint of reducing the time involved in execution of the program. However, the compiler system suffers from the shortcomings mentioned below.
(1) It is necessary to provide a compiler for translating a source program into a machine language program for each type of target machine, which means that not only does the number of compilers to be developed necessarily increase but also the overhead involved in maintenance and extension increases significantly, because the maintenance and extension must be performed for each of the machine types of the target computers.
(2) In the case where one and the same program is to be executed by a plurality of target computers of different machine types, compilation (i.e. translation from a source program to a machine language program) is required for each of the machine types of the target computers, with the result that the overhead in the management of the machine language programs increases remarkably.
(3) In an environment in which a plurality of computers of different machine types are connected to a network, a number of machine language programs which corresponds to the number of the computers connected to the network is required for one and the same source program, which gives rise to problems with regard to version management and disk space availability. Moreover, difficulty will be encountered in distributed execution of one and the same program.
(4) Some systems in actual use are often operated with only the machine language program, without any source program being available. In such a system, exchange or switching and alteration of the component machines are difficult to realize. At present, progress in hardware technology facilitates implementation of highly sophisticated computer architectures. Nevertheless, inheritance of the machine language program resources imposes a serious limitation on alteration or modification of the computer architecture.
For overcoming the disadvantages of the compiler system mentioned above, a system may be conceived in which an intermediate language program which is independent of any specific machine is employed for the purpose of preservation or storage and management of the program, wherein upon execution, the intermediate language program is translated into a machine language program of a target machine for thereby realizing high-speed processing, i.e. a system which combines only the advantageous features of the compiler system and the interpreter system. In the present state of the art, however, no real system is known which embodies the concept mentioned above.
Parenthetically, for details of the compiler system and the interpreter system, reference may be made to A. Aho, R. Sethi and J. Ullman: "Compilers: Principles, Techniques, and Tools", Addison-Wesley, 1986, pp. 1-24.
In order to allow a machine-independent intermediate language program (i.e. intermediate language program which is independent of any specific target machine or computer) to be adopted as a form for preservation and management of a program to be executed repeatedly, it is required that the intermediate language program can be executed at a speed comparable to that of execution of the machine language program in the existing compiler system.
To this end, fulfillment of the requirements mentioned below will be indispensable.
(1) The intermediate language program which is in the form suited for the preservation and management as described above is not executed by the interpreter but translated into a machine language program immediately before the execution.
(2) In the course of the translation or conversion of the intermediate language program into the machine language program, optimalization of the program is carried out by taking into consideration the characteristics of the target computer which is to execute that program.
With the present invention, it is contemplated to provide a consolidated or integrated system which can realize the requirements mentioned above, i.e. to provide a practical form of an intermediate language for storage and management of the intermediate language program together with a practical method of effectuating the translation of the intermediate language program into the machine language program upon start of execution of the program while optimalizing the machine language program for the target computer.
In this connection, it is noted that the intermediate language codes designed for the interpreter system can not be used as the intermediate language codes for realizing what is contemplated with the present invention for the reasons described below.
(1) The intermediate language code for the interpreter system contains no information required for optimalization to be effectuated upon translation into the machine language program, because the intermediate language codes are not designed on the premise that they undergo optimalization before being executed by the interpreter.
(2) The computers may globally be classified into a register machine which includes a finite number of registers and in which operations are performed primarily on the registers and a stack machine which includes operation stacks, wherein the operations or computations are performed primarily on the stack. In the current state of the art, a majority of the existing computers are implemented as register machines. By contrast, many of the intermediate languages for the interpreter systems are designed on the presumption of operation on the stack because of the ease in designing the intermediate language codes and the interpreter. Of course, it is not absolutely impossible to convert the on-stack operation to the operation on the registers. However, a great difficulty will be encountered in translating the intermediate language program for the stack machine into an efficient and effective machine language program for the register machine, when considering the fact that the values on the stack are inherently assumed to be disposable, while those in the registers should rationally be used repetitively as far as possible in order to make the most of the registers with a high efficiency.
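The difference between disposable stack values and reusable register values can be illustrated by a minimal sketch. The two miniature instruction sets below are hypothetical and were invented only for this illustration; they are not the intermediate language of the invention. The sketch evaluates a*b + a on each kind of machine and counts how often a variable has to be fetched from memory.

```python
# Hypothetical mini instruction sets, invented for this illustration only.
MEMORY = {"a": 3, "b": 4}

# a*b + a for a stack machine: "a" must be pushed (loaded) twice,
# because "mul" pops and thereby consumes the first copy.
STACK_CODE = [("push", "a"), ("push", "b"), ("mul",), ("push", "a"), ("add",)]

# The same expression for a register machine: "a" is loaded once into r1
# and reused by the final "add".
REGISTER_CODE = [
    ("load", "r1", "a"),
    ("load", "r2", "b"),
    ("mul", "r0", "r1", "r2"),
    ("add", "r0", "r0", "r1"),
]


def run_stack(code, memory):
    """Execute stack code; return (result, number of memory loads)."""
    stack, loads = [], 0
    for op, *args in code:
        if op == "push":
            stack.append(memory[args[0]])
            loads += 1
        elif op in ("mul", "add"):
            right, left = stack.pop(), stack.pop()  # operands are consumed
            stack.append(left * right if op == "mul" else left + right)
        else:
            raise ValueError(f"unknown stack instruction: {op}")
    return stack.pop(), loads


def run_register(code, memory):
    """Execute register code; return (value of r0, number of memory loads)."""
    regs, loads = {}, 0
    for op, *args in code:
        if op == "load":
            regs[args[0]] = memory[args[1]]
            loads += 1
        elif op == "mul":
            regs[args[0]] = regs[args[1]] * regs[args[2]]
        elif op == "add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        else:
            raise ValueError(f"unknown register instruction: {op}")
    return regs["r0"], loads
```

Both programs compute the same value, but the stack form needs three memory loads against two for the register form. A translator starting from the stack form would have to rediscover that the two pushes of "a" refer to the same value before it could keep that value in a register.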
It is therefore an object of the present invention to provide an information processing method and system in which an intermediate language program independent of any specific computer or machine is used for storage, management and like purposes and is translated into a machine language program appropriate to a target machine immediately before execution of the program by the target machine. More specifically, it is contemplated with the present invention to provide an information processing system which can fulfill the requirements described below.
(1) Placing emphasis on a register machine as the target machine (i.e. execution-destined computer), the intermediate language program should be an instruction sequence in which the existence of registers is presumed at the very level of the intermediate language program. Besides, in the course of translation up to the intermediate language program, optimalization should have been effectuated to the extent possible.
(2) Upon translation into the machine language program from the intermediate language program, it should be possible to optimalize the register utilization method. More specifically, utilization of the registers should be so determined that the number of instructions for loading values into and storing values from the registers is reduced to a minimum while allowing unnecessary instructions to be deleted. Moreover, information requisite for the optimalization should be derivable from the intermediate language program.
(3) In some cases, a specific sequence (a series of plural instructions) in the intermediate language program can be replaced by an instruction peculiar to the target machine. In such a case, it is preferred in general from the standpoint of efficiency to effectuate the replacement by one machine language instruction. Accordingly, when a machine language instruction corresponding to a succession of intermediate language instructions is available on the target machine, a machine language program should be generated such that the corresponding machine language instruction mentioned above can be made use of.
Aspects of the present invention in general may be summarized as follows.
1. System structure
According to an aspect of the present invention, a system for translating a source program into a machine language program for an execution-destined computer or target machine is composed of three subsystems. They are:
(1) a compiler: a subsystem for generating an object program (referred to as abstract object program) which is independent of the type of the target machine,
(2) a linker: a subsystem for linking together a plurality of abstract object programs generated by the subsystem compiler into a single abstract object program, and
(3) an installer: a subsystem for translating the abstract object program outputted from the linker into a machine language program for the target machine (which may also be referred to as the target computer, execution-destined machine or the like).
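The division of labor among the three subsystems may be sketched as follows. This is only an illustrative outline under stated assumptions: the one-statement source syntax ("dst = a + b"), the tuple encoding of the abstract object program, the fixed register mapping and the two target descriptions TARGET_A and TARGET_B are all hypothetical and do not appear in the specification.

```python
def compile_unit(source):
    """Compiler stand-in: translate 'dst = a + b' statements into a
    machine-independent instruction list (an abstract object program)."""
    code = []
    for line in source.strip().splitlines():
        dst, expr = (part.strip() for part in line.split("=", 1))
        left, right = (part.strip() for part in expr.split("+", 1))
        code += [
            ("load", "ar1", left),
            ("load", "ar2", right),
            ("add", "ar1", "ar2"),
            ("store", "ar1", dst),
        ]
    return code


def link(abstract_objects):
    """Linker stand-in: merge several abstract object programs into one."""
    return [instr for obj in abstract_objects for instr in obj]


def install(abstract_object, target):
    """Installer stand-in: render the abstract object program with the
    mnemonics and register names of one (hypothetical) target machine."""
    lines = []
    for op, *operands in abstract_object:
        rendered = [target["regs"].get(x, x) for x in operands]
        lines.append(" ".join([target["ops"][op], *rendered]))
    return lines


# Two hypothetical target descriptions; real targets would differ far more.
TARGET_A = {"ops": {"load": "LD", "add": "ADD", "store": "ST"},
            "regs": {"ar1": "R1", "ar2": "R2"}}
TARGET_B = {"ops": {"load": "l", "add": "a", "store": "st"},
            "regs": {"ar1": "g1", "ar2": "g2"}}
```

In this outline the compiler and the linker run once per program, and only install() is repeated for each target machine, e.g. install(linked, TARGET_A) and install(linked, TARGET_B) for the same linked abstract object program.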
2. Form of object program
In order to make the object program common to a plurality of target machines, an abstract register machine (also referred to as ARM or Arm in abbreviation) having a plurality of registers is presumed, wherein an instruction sequence for the abstract register machine or ARM is made use of as a basic part of the common object program (referred to as the abstract object program).
The abstract register machine or ARM has features mentioned below.
(1) The ARM has a plurality of abstract registers. (Although the number of the abstract registers is infinite in principle, limitation is imposed in dependence on the form of the abstract object program in practical applications.)
(2) The ARM has, as its instruction executing functions, a register-memory data transfer function, a function for performing operations on the registers (such as the four arithmetic operations, logical operations, shift operations and comparisons) and an execution control function (such as unconditional branch, conditional branch, call of and return from subprograms, etc.).
(3) Memory addresses are represented by symbol names rather than numerical values.
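The three features listed above can be made concrete with a small executable model. The sketch below is an assumption-laden illustration, not the ARM instruction set of the invention: the opcode names (load, store, add, sub, ble, jump, label) and the tuple encoding are hypothetical. Registers are held in a dictionary, so their number is unbounded, and memory is addressed by symbol names only.

```python
def run_arm(program, memory):
    """Interpret a hypothetical ARM-like program.

    Registers are created on demand (unbounded in number); memory is a
    dict keyed by symbol names, never by numeric addresses.
    """
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    regs, pc = {}, 0
    while pc < len(program):
        op, *a = program[pc]
        pc += 1
        if op == "load":        # register <- memory (data transfer)
            regs[a[0]] = memory[a[1]]
        elif op == "store":     # memory <- register (data transfer)
            memory[a[1]] = regs[a[0]]
        elif op == "add":       # operation on registers
            regs[a[0]] += regs[a[1]]
        elif op == "sub":
            regs[a[0]] -= regs[a[1]]
        elif op == "ble":       # conditional branch (execution control)
            if regs[a[0]] <= regs[a[1]]:
                pc = labels[a[2]]
        elif op == "jump":      # unconditional branch
            pc = labels[a[0]]
        elif op != "label":
            raise ValueError(f"unknown instruction: {op}")
    return memory


# sum = n + (n-1) + ... + 1, using abstract registers 1 to 4.
SUM_PROGRAM = [
    ("load", 1, "n"),       # loop counter
    ("load", 2, "one"),     # constant 1
    ("load", 3, "zero"),    # accumulator
    ("load", 4, "zero"),    # constant 0 for the comparison
    ("label", "loop"),
    ("ble", 1, 4, "done"),
    ("add", 3, 1),
    ("sub", 1, 2),
    ("jump", "loop"),
    ("label", "done"),
    ("store", 3, "sum"),
]
```

Because the program refers to "n", "sum" and the constants by name, nothing in it depends on the memory layout or the register file of any particular real machine.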
The reason why the instruction sequence of the abstract register machine or ARM including a plurality of registers is made use of as the basic part of the common object can be explained as follows:
(a) In order to speed up the translation of the object program into machine language programs appropriate to the individual target machines, respectively, it is desirable to reduce as far as possible semantic gaps between the object program and the machine language. In this connection, it is to be noted that the computer used widely at present is a register machine having a plurality of registers. Accordingly, by presuming the abstract register machine having as an instruction set a semantically common part of the instruction sets of the conventional register machines, overhead involved in the semantically meaningful translation can be reduced, whereby the translation to the machine language program can be speeded up.
(b) In the register machine, one of the keys for speeding up execution of the machine language program is effective utilization of the registers. Thus, by regarding the abstract register machine or ARM as a target machine for the compiler, the latter can generate an instruction sequence which can make use of the registers to a maximum possible extent.
The abstract object program is composed of:
(a) an instruction sequence for the ARM,
(b) pseudo-codes such as definitions of labels concerning branch, entry, variable and constant, embedded information for the optimalization, and embedded information for the debugging at the source program level,
(c) generation control specifiers (indicating allocation/deallocation of abstract registers and selection of an ARM instruction sequence in the state in which the abstract registers have been allocated), and
(d) dictionaries of variable names and index names for reference in the debugging at the level of the source program.
Although the type of the abstract register can be indicated by the generation control specifier for the abstract registers, it is impossible to designate to which of the registers in the real machine the indicated abstract register corresponds. In this manner, the registers can be made visible in the abstract object program independently of the type of the target machine.
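One possible in-memory representation of the four constituents of the abstract object program is sketched below. The class names, field names and item kinds are hypothetical choices made for this illustration. Note that an "alloc" specifier carries only the abstract register and its type, never a real register of any machine.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One entry of the abstract object program (hypothetical encoding)."""
    kind: str          # "instr" (ARM instruction), "pseudo" or "control"
    op: str
    args: tuple = ()


@dataclass
class AbstractObjectProgram:
    items: list                                          # (a), (b), (c) in order
    variable_names: dict = field(default_factory=dict)   # (d) debug dictionary


def unreleased_registers(program):
    """Return the abstract registers that are allocated but never freed."""
    live = set()
    for item in program.items:
        if item.kind == "control" and item.op == "alloc":
            live.add(item.args[0])
        elif item.kind == "control" and item.op == "free":
            live.discard(item.args[0])
    return live


EXAMPLE = AbstractObjectProgram(
    items=[
        Item("pseudo", "entry", ("main",)),
        Item("control", "alloc", ("ar1", "int")),   # type only, no real register
        Item("instr", "load", ("ar1", "x")),
        Item("instr", "store", ("ar1", "y")),
        Item("control", "free", ("ar1",)),
    ],
    variable_names={"x": "source variable x", "y": "source variable y"},
)
```

A check such as unreleased_registers(EXAMPLE) is one example of a consistency test that could be run on such a structure before it is handed to the installer.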
3. Specification of target machine
In order to allow the machine language programs for the target machine to be generated from the abstract object program, two types of information mentioned below are prepared for the installer:
(i) indication concerning the usage of the register in the target machine (types and number of the usable registers), and
(ii) translation rules for translation of the instruction sequence pattern of the ARM into an instruction sequence pattern for the target machine.
At this juncture, it should be mentioned that the ARM instruction pattern and that of the target machines are each composed of a plurality of instructions.
The instructions of the ARM and those of the target machine are not set in one-to-one correspondence relation, the reason for which is explained as follows.
(1) In a strict sense, the ARM instruction set can not constitute a common part of real machine instruction sets. Accordingly, there may arise a situation in which an instruction corresponding to that of the ARM is absent from the instructions executed by a target machine. In that case, it is necessary to realize one ARM instruction by several instructions of the target machine.
(2) There may arise a situation which is the reverse of that mentioned above. In other words, the instruction set of the target machine may include an instruction which corresponds to a sequence of several ARM instructions, as exemplified by a register-memory operation instruction and the like. In this case, the ARM instruction sequence is handled as one target machine instruction, because the processing speed can be enhanced by decreasing the number of the instructions to be executed by the target machine.
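Both situations, one ARM instruction realized by several target instructions and several ARM instructions realized by one target instruction, can be handled by pattern-to-pattern translation rules. The following is a minimal sketch under stated assumptions: the rule table, the "$" pattern variables and the target mnemonics (LD, ADD, ADDM, NOT, INC) are hypothetical, and the fused load-and-add rule assumes that the temporary register is not needed afterwards.

```python
# (ARM pattern, target pattern); operands beginning with "$" are variables.
# Longer patterns come first so that they take precedence.
RULES = [
    # Several ARM instructions -> one register-memory instruction.
    # Assumes the temporary $t is dead after the add.
    ([("load", "$t", "$m"), ("add", "$d", "$t")], [("ADDM", "$d", "$m")]),
    ([("load", "$r", "$m")], [("LD", "$r", "$m")]),
    ([("add", "$d", "$s")], [("ADD", "$d", "$s")]),
    # One ARM instruction -> several target instructions.
    ([("neg", "$r")], [("NOT", "$r"), ("INC", "$r")]),
]


def match(pattern, window):
    """Return variable bindings if the window matches the pattern, else None."""
    if len(window) < len(pattern):
        return None
    env = {}
    for pat, ins in zip(pattern, window):
        if len(pat) != len(ins) or pat[0] != ins[0]:
            return None
        for var, value in zip(pat[1:], ins[1:]):
            if env.setdefault(var, value) != value:   # same variable, same value
                return None
    return env


def translate(arm_code, rules=RULES):
    """Rewrite an ARM instruction list using the first matching rule."""
    out, i = [], 0
    while i < len(arm_code):
        for pattern, template in rules:
            env = match(pattern, arm_code[i:i + len(pattern)])
            if env is not None:
                out += [(t[0],) + tuple(env[v] for v in t[1:]) for t in template]
                i += len(pattern)
                break
        else:
            raise ValueError(f"no translation rule for {arm_code[i]}")
    return out
```

For the ARM sequence load r1,x; load r2,y; add r1,r2; neg r1, this sketch emits LD r1,x, then the single instruction ADDM r1,y in place of the second load and the add, and finally the pair NOT r1; INC r1 for the one neg instruction.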
4. Method of translating the abstract object program into a machine language program for a target machine.
The installer includes a table for managing correspondences between the abstract registers of the ARM and the real registers of the target machine (this table will hereinafter be referred to as the register management table) and performs operations mentioned below.
(1) In response to an allocation command for an abstract register in the abstract object program, the installer attempts to establish correspondence between the abstract register and a real register. In that case, when there exists a real register for which correspondence with another abstract register has not been established within the range described in the register usage indication of the target machine specifications, i.e. when an idle real register is found, correspondence is established between the aforementioned abstract register and the idle real register.
(2) In response to a register releasing or freeing command contained in the abstract object program, the installer clears the correspondence relation between the abstract register and the real register (i.e. deallocation is executed by the installer).
(3) With the aid of the generation control specifier contained in the abstract object program, the installer checks whether or not an abstract register is set in correspondence relation with a real register, whereupon an ARM instruction sequence is selected accordingly.
(4) For the ARM instruction sequence thus selected, the installer applies translation rules contained in the target machine specifications for translating the ARM instruction sequence pattern into a target machine instruction sequence pattern, to thereby generate a target machine instruction sequence corresponding to the selected ARM instruction sequence while replacing the abstract register identification number by that of the real register.
(5) For the target machine instruction sequence thus generated, the installer converts the symbol names representing the memory addresses into numeric addresses.
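The operations (1) to (5) above can be sketched with a small model of the installer. Everything in the sketch is an assumption made for illustration: the class and method names, the tuple encoding of the "alloc", "free" and "select" specifiers, the register names R0 and R1, and the numeric addresses. For brevity, operation (4) is reduced to substituting real registers for abstract ones; the opcode translation by pattern rules is omitted.

```python
class Installer:
    """Minimal model of the installer and its register management table."""

    def __init__(self, real_registers, symbol_addresses):
        self.idle = list(real_registers)    # from the register usage indication
        self.table = {}                     # abstract register -> real register
        self.addresses = symbol_addresses   # symbol name -> numeric address

    def allocate(self, abstract_reg):
        """(1) Map the abstract register to an idle real register, if any."""
        if self.idle:
            self.table[abstract_reg] = self.idle.pop(0)

    def free(self, abstract_reg):
        """(2) Clear the correspondence and return the real register."""
        real = self.table.pop(abstract_reg, None)
        if real is not None:
            self.idle.append(real)

    def emit(self, instr):
        """(4), (5) Substitute real registers and numeric addresses."""
        op, *operands = instr
        out = [op]
        for x in operands:
            if x in self.table:
                out.append(self.table[x])
            elif x in self.addresses:
                out.append(self.addresses[x])
            else:
                raise KeyError(f"operand {x!r} is neither mapped nor a symbol")
        return tuple(out)

    def install(self, program):
        code = []
        for item in program:
            if item[0] == "alloc":
                self.allocate(item[1])
            elif item[0] == "free":
                self.free(item[1])
            elif item[0] == "select":
                # (3) Pick a sequence depending on whether the register is mapped.
                _, reg, if_mapped, if_unmapped = item
                chosen = if_mapped if reg in self.table else if_unmapped
                code += [self.emit(i) for i in chosen]
            else:
                code.append(self.emit(item))
        return code


PROGRAM = [
    ("alloc", "ar1"),
    ("load", "ar1", "sum"),
    ("alloc", "ar2"),
    ("select", "ar2",
     [("load", "ar2", "i"), ("add", "ar1", "ar2")],   # ar2 obtained a register
     [("addmem", "ar1", "i")]),                       # it did not: memory operand
    ("free", "ar2"),
    ("store", "ar1", "sum"),
    ("free", "ar1"),
]
ADDRESSES = {"sum": 0x100, "i": 0x104}
```

Installing the same PROGRAM with Installer(["R0", "R1"], ADDRESSES) selects the sequence that keeps "i" in a register, whereas Installer(["R0"], ADDRESSES) falls back to the memory-operand sequence, so the one abstract object program adapts to the number of registers actually available.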
The compiler which may be implemented by applying compiler techniques known heretofore serves for translation of a source program into an abstract object program. Simultaneously, the compiler performs optimalization at the source program level as well as optimalization of the ARM instructions for which utilization of the registers is prerequisite.
In this way, the installer generates a machine language program for a target machine from an abstract object program in conformance with the target machine specifications.
Thus, the machine language program for the target machine as generated by the system according to the invention has features mentioned below.
(1) Optimalization at the source program level as well as optimalization of the register utilization is realized by the compiler.
(2) Owing to the real register allocation function of the installer, the registers incorporated in the target machine can be made use of to a maximum extent.
(3) Owing to the instruction sequence pattern replacing capability or function of the installer, it is possible to replace a succession of plural ARM instructions by one instruction for the target machine, to thereby make the most of a high performance of the target machine. As a result of this, an object program can be shared by a plurality of different machine type computers while maintaining the execution speed and performance which are equivalent to those involved in the execution of the machine language program generated by the prior art system (prior art compiler system).