The present invention relates to a compiler for translating a source program written in a high-level programming language into an object program written in a machine language.
In recent years, programmers have sought to improve development efficiency by writing programs in a high-level programming language such as C. A high-level programming language lets a programmer define, using variables, any desired number of steps for holding, computing or transferring numerical values in a program. That is to say, a programmer can write a program freely. A program written in such a high-level programming language (i.e., a source program, which is also often called a "source code file") must then be compiled, or translated, by a compiler into an object program written in a computer-executable machine language (which is often called an "object code file"). The steps in the machine-executable object program are represented by machine instructions, which require registers or memories as operands. Accordingly, variables should be allocated to these registers or memories. Such allocation processing is called "resource allocation". If optimum resource allocation has been performed successfully, then the code size of the object program can be minimized.
In general, allocating variables to registers is more advantageous in terms of code size and execution time than allocating them to memories. However, the number of available registers is relatively small. Thus, the degree of optimization achievable in resource allocation depends solely on how efficiently variables can be allocated to register resources so that machine instructions can be executed using the registers as operands. In accordance with a conventional technique of optimizing resource allocation, a plurality of variables, allocable to the same register, are identified based on the respective ranges where the stored values of these variables are alive (in this specification, such a range will be called a "variable life range"). Based on the results of this identification, the variables are allocated to the resources.
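The conventional technique described above can be illustrated with a minimal sketch. The representation of a variable life range as an inclusive (first step, last step) pair, the function names, and the greedy first-fit strategy are all assumptions made for illustration; the text does not say which allocation algorithm a conventional compiler uses.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a variable life range is an inclusive
# (first_step, last_step) pair of program step numbers.
LifeRange = Tuple[int, int]


def overlaps(a: LifeRange, b: LifeRange) -> bool:
    """Two life ranges conflict if they share at least one step."""
    return a[0] <= b[1] and b[0] <= a[1]


def allocate(ranges: Dict[str, LifeRange], num_registers: int) -> Dict[str, str]:
    """Greedy first-fit sketch: variables whose life ranges do not overlap
    may share a register; anything that does not fit goes to memory."""
    held: List[List[LifeRange]] = [[] for _ in range(num_registers)]
    placement: Dict[str, str] = {}
    # Visit variables in order of the step at which they become alive.
    for name, rng in sorted(ranges.items(), key=lambda kv: kv[1][0]):
        for reg, occupants in enumerate(held):
            if not any(overlaps(rng, other) for other in occupants):
                occupants.append(rng)
                placement[name] = f"R{reg}"
                break
        else:
            placement[name] = "memory"
    return placement
```

With a single register, the variables `a` (steps 0 to 3) and `b` (steps 4 to 7) can share it because their life ranges do not overlap, while `c` (steps 2 to 5) conflicts with `a` and is left in memory.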
The present inventors proposed a data processor using the following two types of instruction formats and register models for the execution of instructions in Japanese Patent Application No. 10-59680.
FIGS. 10 through 13 outline the first instruction format.
In the first instruction format, a variable-length instruction with a minimum instruction length of 1 byte is described. A 2-bit field is used as a register-addressing field. Accordingly, four registers can be specified with one register-addressing field. In this architecture, four address registers and four data registers are defined. By separately using the address registers or the data registers responsive to a specific instruction, eight registers can be used in total in executing an instruction.
FIG. 10 illustrates a bit assignment for the first instruction format (1) in which a first instruction field composed of 1 byte, equal to the minimum instruction length, consists of an operation-specifying field and an arbitrary number of register-addressing fields. Specific examples of this format will be described below.
In an exemplary first instruction format (1)-(a), the first instruction field includes two 2-bit register-addressing fields and is composed of 1 byte, which is the minimum instruction length. And two operands can be specified in accordance with this format.
In another exemplary first instruction format (1)-(b), the first instruction field includes two 2-bit register-addressing fields, and an additional information field is further provided. Thus, the instruction length in accordance with this format is 2 bytes or more in total.
In still another exemplary first instruction format (1)-(c), the first instruction field includes one 2-bit register-addressing field and is composed of 1 byte, which is the minimum instruction length. And one operand can be specified in accordance with this format.
In yet another exemplary first instruction format (1)-(d), the first instruction field includes one 2-bit register-addressing field, and an additional information field is further provided. Thus, the instruction length in accordance with this format is 2 bytes or more in total.
In yet another exemplary first instruction format (1)-(e), the first instruction field includes no register-addressing fields and is composed of 1 byte, which is the minimum instruction length. Accordingly, in accordance with this format, no operands can be specified using addresses.
In yet another exemplary first instruction format (1)-(f), the first instruction field includes no register-addressing fields but an additional information field is further provided. Thus, the instruction length in accordance with this format is 2 bytes or more in total.
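As a rough illustration of format (1)-(a), the sketch below splits a 1-byte first instruction field into an operation-specifying field and two 2-bit register-addressing fields. The bit positions used here (operation in the upper four bits, the two register fields in the lower four) are an assumption made for this example only; the actual bit assignment is the one shown in FIG. 10.

```python
from typing import Tuple


def decode_two_register_byte(byte: int) -> Tuple[int, int, int]:
    """Split a 1-byte instruction field into (operation, reg_m, reg_n).

    Assumed layout (hypothetical, not taken from FIG. 10):
      bits 7-4: operation-specifying field
      bits 3-2: first 2-bit register-addressing field
      bits 1-0: second 2-bit register-addressing field
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("instruction field must fit in one byte")
    operation = byte >> 4
    reg_m = (byte >> 2) & 0b11   # one of four registers
    reg_n = byte & 0b11          # one of four registers
    return operation, reg_m, reg_n
```

Because each register-addressing field is only 2 bits wide, each decoded register number is always in the range 0 through 3, which is why a single field can select among only four registers.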
FIG. 11 illustrates part of a list of specific instructions for respective types of bit assignment shown in FIG. 10. In FIG. 11, instruction mnemonics are shown on the left and the operations performed to execute these instructions are shown on the right.
FIG. 12 illustrates a bit assignment for a first instruction format (2) in which a first instruction field composed of 1 byte, i.e., the minimum instruction length, consists of an instruction-length-specifying field and a second instruction field consists of an operation-specifying field and an arbitrary number of register-addressing fields. Specific examples of this format will be described in detail below.
In an exemplary first instruction format (2)-(a), the second instruction field includes two 2-bit register-addressing fields and the first and second instruction fields are composed of 2 bytes. And two operands can be specified in accordance with this format.
In another exemplary first instruction format (2)-(b), the second instruction field includes two 2-bit register-addressing fields, and an additional information field is further provided. Thus, the instruction length in accordance with this format is 3 bytes or more in total.
In still another exemplary first instruction format (2)-(c), the second instruction field includes one 2-bit register-addressing field and the first and second instruction fields are composed of 2 bytes. And one operand can be specified in accordance with this format.
In yet another exemplary first instruction format (2)-(d), the second instruction field includes one 2-bit register-addressing field, and an additional information field is further provided. Thus, the instruction length in accordance with this format is 3 bytes or more in total.
In yet another exemplary first instruction format (2)-(e), the second instruction field includes no register-addressing fields and the first and second instruction fields are composed of 2 bytes. Accordingly, in accordance with this format, no operands can be specified using addresses.
In yet another exemplary first instruction format (2)-(f), the second instruction field includes no register-addressing fields but an additional information field is further provided. Thus, the instruction length in accordance with this format is 3 bytes or more in total.
FIG. 13 illustrates part of a list of specific instructions for respective types of bit assignment shown in FIG. 12. In FIG. 13, instruction mnemonics are shown on the left and the operations performed to execute these instructions are shown on the right.
Accordingly, in accordance with the first instruction format shown in FIGS. 10 through 13, the instruction length of the first instruction field is used as a basic instruction length to specify a variable-length instruction. And an instruction can be described in this format to have a length N times as large as the basic instruction length and equal to or less than the maximum instruction length, which is M times as large as the basic instruction length (where N and M are both positive integers and 1 ≤ N ≤ M). Since the minimum instruction length is 1 byte, this instruction format is suitable for downsizing a program.
FIG. 14 illustrates a first register file 220 included in the data processor proposed by the present inventors. The first register file 220 includes: four address registers A0 through A3; four data registers D0 through D3; a stack pointer (SP) 223; a processor status word (PSW) 224 for holding internal status information and control information; and a program counter (PC) 225.
FIG. 15 is a table illustrating in greater detail how the address and data registers A0 through A3 and D0 through D3 included in the first register file 220 are accessed. Specifically, this is a table of correspondence among the name of a register specified by an instruction, the bit assignment on the instruction code specified in a register-addressing field, and the number and name of the physical register to be accessed.
In the first instruction format, the set of instruction addressing fields specified by respective instructions to access the four address registers A0 through A3 is the same as the set of instruction addressing fields specified by respective instructions to access the four data registers D0 through D3 as shown in FIG. 15. That is to say, the same 2-bit instruction addressing field is used to address a desired register, and it is determined by the operation of the instruction itself whether an address register or a data register should be accessed.
Next, respective bit assignments for a second instruction format, which is added as an extension to the first instruction format shown in FIGS. 10 and 12, i.e., the basic instruction format of this architecture, will be described with reference to FIG. 16.
In each of the bit assignments shown in FIG. 16 for the second instruction format, a first instruction field, composed of 1 byte, which is the minimum instruction length, consists of an instruction-length-specifying field. And second and third instruction fields consist of an operation-specifying field and an arbitrary number of register-addressing fields. In accordance with the second instruction format, each register-addressing field is composed of 4 bits. Specific examples of this format will be described in detail below.
In an exemplary second instruction format (a), the third instruction field includes two 4-bit register-addressing fields and the first through third instruction fields are composed of 3 bytes in total. And two operands can be specified in accordance with this format.
In another exemplary second instruction format (b), the third instruction field also includes two 4-bit register-addressing fields, and an additional information field is further provided. Thus, the instruction length in accordance with this format is 4 bytes or more in total.
In still another exemplary second instruction format (c), the third instruction field includes one 4-bit register-addressing field and the first through third instruction fields are composed of 3 bytes in total. And one operand can be specified in accordance with this format.
In yet another exemplary second instruction format (d), the third instruction field includes one 4-bit register-addressing field, and an additional information field is further provided. Thus, the instruction length in accordance with this format is 4 bytes or more in total.
Thus, in accordance with the second instruction format, the instruction length of the first instruction field is also used as a basic instruction length. And an instruction can be described in this format to have a variable length N times as large as the basic instruction length and equal to or less than the maximum instruction length, which is M times as large as the basic instruction length (where N and M are both positive integers and 1 ≤ N ≤ M).
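The instruction lengths stated above for the three format families can be summarized in a short sketch. The minimum lengths (1, 2 and 3 bytes) come from the text; the dictionary keys, the function name, and in particular the default value chosen for M are placeholders assumed for illustration, since the text does not give the maximum instruction length.

```python
# Minimum instruction lengths in bytes, as stated in the text:
# first format (1): 1 byte, first format (2): 2 bytes, second format: 3 bytes.
MIN_LENGTH = {
    "first (1)": 1,
    "first (2)": 2,
    "second": 3,
}

BASIC_LENGTH = 1  # the first instruction field is 1 byte


def instruction_length(fmt: str, extra_bytes: int = 0, max_multiple: int = 8) -> int:
    """Return the total length in bytes of an instruction in the given format.

    extra_bytes is the size of the additional information field (0 if absent).
    max_multiple stands for M; the default of 8 is an arbitrary placeholder.
    """
    if extra_bytes < 0:
        raise ValueError("extra_bytes must not be negative")
    length = MIN_LENGTH[fmt] + extra_bytes
    n = length // BASIC_LENGTH          # N, the multiple of the basic length
    if not 1 <= n <= max_multiple:      # the constraint 1 <= N <= M
        raise ValueError("instruction exceeds the maximum instruction length")
    return length
```

For example, `instruction_length("second", extra_bytes=1)` gives 4 bytes, matching second instruction formats (b) and (d), whereas the shortest instruction in first instruction format (1) occupies a single byte.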
FIG. 17 illustrates part of a list of specific instructions for respective types of bit assignment shown in FIG. 16. In FIG. 17, instruction mnemonics are shown on the left and the operations performed to execute these instructions are shown on the right. The mnemonic Rm, Rn or Ri indicates the address of a specified register. In this case, a second register file shown in FIG. 18 is defined and any of sixteen general-purpose registers, namely, four address registers A0 through A3, four data registers D0 through D3 and eight extended registers E0 through E7, may be specified. The second register file 120 further includes: a stack pointer (SP) 122; a processor status word (PSW) 123 for holding internal status information and control information; and a program counter (PC) 124.
FIG. 19 is a table of correspondence among the name of a register specified during the execution of an instruction defined in the first instruction format, the bit assignment on the instruction code specified in a register-addressing field, and the number and name of the physical register to be accessed. In accordance with the first instruction format, each register-addressing field is composed of only 2 bits. However, in this case, there are sixteen general-purpose registers, each of which should be accessed using a 4-bit address. Accordingly, address conversion should be performed. For example, in accessing the address register A0 and the data register D1, "1000" and "1101" should be produced as the respective physical register numbers and then output to a file 121 of general-purpose registers.
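The address conversion can be sketched as follows. The two examples in the text (A0 gives "1000", D1 gives "1101") are consistent with a scheme in which the upper two bits of the physical register number select the register bank and the lower two bits are the 2-bit register-addressing field. That this pattern holds for the remaining address and data registers is an assumption of this sketch; the authoritative mapping is the table of FIG. 19.

```python
def physical_register_number(bank: str, field: int) -> str:
    """Convert a 2-bit register-addressing field into a 4-bit physical
    register number, returned as a binary string.

    bank is "A" for an address register or "D" for a data register; which
    bank applies is determined by the operation of the instruction.
    Assumed scheme (extrapolated from the two examples in the text):
      address registers -> 10xx, data registers -> 11xx
    """
    if bank not in ("A", "D"):
        raise ValueError("bank must be 'A' or 'D'")
    if not 0 <= field <= 0b11:
        raise ValueError("register-addressing field must be 2 bits")
    prefix = 0b10 if bank == "A" else 0b11
    return format((prefix << 2) | field, "04b")
```

Calling `physical_register_number("A", 0)` reproduces the "1000" given for A0, and `physical_register_number("D", 1)` reproduces the "1101" given for D1.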
FIG. 20 is a table of correspondence among the name of a register specified during the execution of an instruction defined in the second instruction format, the bit assignment on the instruction code specified in a register-addressing field, and the number and name of the physical register to be accessed. In accordance with the second instruction format, each register-addressing field is composed of 4 bits, which are used as a physical register number as they are.
If variables are simply allocated preferentially to registers rather than memories as is done in a conventional compiler, then the data processor proposed by the present inventors in Japanese Laid-Open Publication No. 10-59680 poses the following problems:
1) The total length of instructions differs depending on whether variables, allocated to the first register file (including register resources), are processed in the first instruction format or variables, allocated to the second register file, are processed in the second instruction format. Accordingly, if these two types of variables are processed equally without prioritizing their allocation at all, then the resulting code size of instructions cannot be minimized. That is to say, a conventional compiler does not take into consideration whether the variables should be preferentially allocated to the first or second register file. For example, if the variables are sequentially allocated to the second register file and processed in accordance with the second instruction format, then the resulting code size becomes larger. This is because the length of one instruction defined by the second instruction format is longer than that defined by the first instruction format.
2) In executing a set of instructions including a data transfer instruction from a memory to a register, the number of instructions where variables are processed in the first instruction format is larger than the number where the variables are processed in the second instruction format. But the total length of instructions in the first instruction format may be shorter than that in the second instruction format. Accordingly, even if variables are simply allocated preferentially to register resources rather than memories, the code size cannot be minimized.
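The effect described in problem 1) can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that every use of a variable costs one instruction of the minimum length for its format (1 byte in the first instruction format, 3 bytes in the second) and that the most frequently used variables are the ones worth placing in the first register file. The use counts, the function names and this cost model are hypothetical and are not taken from the text.

```python
from typing import Dict, Set

FIRST_FORMAT_BYTES = 1   # minimum length in the first instruction format
SECOND_FORMAT_BYTES = 3  # minimum length in the second instruction format


def total_code_bytes(use_counts: Dict[str, int], in_first_file: Set[str]) -> int:
    """Total code size under the simplified one-instruction-per-use model."""
    return sum(
        count * (FIRST_FORMAT_BYTES if name in in_first_file else SECOND_FORMAT_BYTES)
        for name, count in use_counts.items()
    )


def prefer_first_file(use_counts: Dict[str, int], slots: int) -> Set[str]:
    """Pick the most frequently used variables for the first register file."""
    ranked = sorted(use_counts, key=lambda name: (-use_counts[name], name))
    return set(ranked[:slots])


# Hypothetical use counts for three variables.
uses = {"i": 10, "sum": 6, "tmp": 2}
prioritized = total_code_bytes(uses, prefer_first_file(uses, slots=2))
unprioritized = total_code_bytes(uses, set())  # everything in the second file
```

Under these assumptions the prioritized allocation costs 10 + 6 + 2 × 3 = 22 bytes, whereas placing every variable in the second register file costs 18 × 3 = 54 bytes, which is the kind of difference a compiler that ignores the choice of register file leaves unexploited.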