1. Field of the Invention
The present invention relates to an instruction execution apparatus in information processing equipment and, more specifically, it relates to an instruction execution apparatus in which the number of entries of an instruction storage device, and a clock frequency, can be increased.
FIG. 1 is a diagram describing the background of the present invention. The figure shows a CPU core section in information processing equipment, in particular a computer. The CPU core section comprises an instruction control section 1, an arithmetic unit/result register 2, a first cache 3, and a second cache 4.
The present invention relates to an instruction execution apparatus included in the instruction control section 1.
2. Description of the Related Art
FIG. 2 is a block diagram showing a schematic configuration of the conventional instruction control section 1 shown in FIG. 1. In this figure, there are shown an instruction fetch effective address generator (IFEAG) 201, a branch history address storage section (BRANCH HISTORY) 202, a buffer storage (IFLBS) 203 for storing instructions in the first cache, an instruction buffer (IBUFFER) 204 for fetching and storing the instructions in the first buffer, a decoder 205 for accepting the instructions (four instructions IW0, IW1, IW2 and IW3 in the shown example) at the same time and then issuing them in order, a reservation station address register (RSA) 206 for storing a load instruction address, an effective address generator (EAG) 207, a buffer storage (OPLBS) 208 for storing operands in the first cache, a reservation station for fixed-point arithmetic (RSE) 209, an arithmetic unit for fixed-point arithmetic 210, a reservation station for floating-point arithmetic (RSF) 211, an arithmetic unit for floating-point arithmetic 212, a result register 213 for storing addresses of execution results of instructions such as loads, operations and branches, a general update buffer (GUB) 214 that is a result address buffer for fixed-point arithmetic, a floating address buffer (FUB) 215 that is a result address buffer for floating-point arithmetic, a general purpose register (GPR) 216 for fixed-point arithmetic, a floating purpose register (FRP) 217 for floating-point arithmetic, a reservation station for branch instructions (RSBR) 218, a commitment stack entry (CSE) 219 that will be discussed later in relation to the present invention, and updatable hardware resources 220 such as a next program counter (NPC) and a program counter (PC).
Next, the schematic operation of the above conventional instruction control section will be described.
In response to addresses from the IFEAG 201 or the BRANCH HISTORY 202 via the IFLBS 203, the instruction control section 1 mentioned above stores instructions in the IBUFFER 204, which, in turn, issues a plurality of instructions at the same time (four instructions IWR0 to IWR3 in the shown example). The decoder 205 outputs these instructions in order, for example, in the order of IWR0, IWR1, IWR2 and IWR3, and they are executed in the EAG 207, the arithmetic units 210 and 212 and the like using a superscalar method. Then, after the instructions such as operations, fetches and branches are completed, the entries in the CSE 219 are released in order.
Thus, the CSE 219 is an instruction storage device that stores instructions from the decoder 205 in order and then releases entries in order after the instructions have been executed out of order. Such an instruction storage device will be referred to as the CSE in the following description. Here, the “operation in order” refers to operation in which instructions are processed in the order of issue of the instructions, and the “operation out of order” refers to operation in which instructions are processed irrespective of the order of issue of the instructions.
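The behavior described above, in which instructions are stored in order, executed out of order, and released in order, can be illustrated with a minimal software sketch. This is a toy model under stated assumptions and not the circuit of the CSE 219: the class name CommitStack, its method names, and the use of a Python deque are all hypothetical and are introduced only for illustration.

```python
from collections import deque


class CommitStack:
    """Toy model of an in-order instruction storage device (hypothetical names)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque()  # oldest stored instruction sits at the left

    def store(self, name):
        """Store an issued instruction in order; return False if no entry is free."""
        if len(self.entries) >= self.capacity:
            return False
        self.entries.append({"name": name, "done": False})
        return True

    def mark_executed(self, name):
        """Record that an instruction finished executing (may happen out of order)."""
        for entry in self.entries:
            if entry["name"] == name:
                entry["done"] = True
                return
        raise KeyError(name)

    def release(self):
        """Release entries strictly in order, stopping at the first unfinished one."""
        released = []
        while self.entries and self.entries[0]["done"]:
            released.append(self.entries.popleft()["name"])
        return released


cse = CommitStack(capacity=4)
for name in ["IWR0", "IWR1", "IWR2", "IWR3"]:
    cse.store(name)

cse.mark_executed("IWR2")   # finishes first, out of order
first = cse.release()       # nothing can be released: IWR0 is still pending
cse.mark_executed("IWR0")
second = cse.release()      # only IWR0 leaves; IWR1 blocks IWR2
cse.mark_executed("IWR1")
third = cse.release()       # IWR1 and the already finished IWR2 leave together
```

In this sketch, IWR2 finishes executing first but its entry is not released until the older instructions IWR0 and IWR1 have also finished, which is the in-order release property described above.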
FIG. 3 is a block diagram showing a schematic configuration of a conventional instruction execution controller. In this figure, the conventional instruction execution controller comprises a decoder that issues instructions in order, a CSE 32 that stores the instructions in order and outputs them in order after the instructions have been executed out of order, a CSE selection section 33 that selects, from among all entries in the CSE 32, the entries for which instructions should be completed, a completion condition determination section 34 that determines conditions for actually completing the instruction in each selected entry, and a resource and entry release section 35 that updates CPU resources and releases entries upon completion of the instruction.
The CSE selection section 33 and the completion condition determination section 34 operate within the period of one clock signal cycle.
In this conventional information processing equipment, the following process is performed within the period of one cycle of the information processing equipment: the entries in the CSE for which instructions should be completed are extracted, in the order of execution, from among all entries; the completion conditions are then determined, that is, it is determined whether the instructions stored in the extracted entries are completed; and, if it is determined that the instructions are completed, the entries are released in order. Conventionally, for example, a process in which three entries are selected from 24 entries in the CSE by the CSE selection section 33 and the completion conditions are then determined is performed in one cycle. However, if the number of entries in the CSE 32 is further increased, the selection of the entries and the determination of the completion conditions may not be completed within one cycle. Moreover, since the clock frequency tends to become increasingly higher, there is a need for a device that allows the selection of the entries and the determination of the completion conditions to be completed within one cycle.
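The per-cycle process described above (selection of the entries, determination of the completion conditions, and release in order) can be sketched in software as follows. This is only an illustration under assumptions: the figures 24 and 3 come from the example in the text, while the circular-buffer layout with a head index, the valid and executed flags, and all function names are hypothetical and do not describe the actual circuits of sections 33 to 35.

```python
NUM_ENTRIES = 24   # CSE size in the example from the text
COMMIT_WIDTH = 3   # entries selected per cycle in the example from the text


def select_candidates(valid, head):
    """Selection step: up to COMMIT_WIDTH oldest valid entries, oldest first."""
    picked = []
    for offset in range(COMMIT_WIDTH):
        index = (head + offset) % NUM_ENTRIES  # assumed circular layout
        if not valid[index]:
            break
        picked.append(index)
    return picked


def determine_completion(candidates, executed):
    """Determination step: keep the leading run of candidates that have finished."""
    completed = []
    for index in candidates:
        if not executed[index]:
            break  # an unfinished older instruction blocks all younger ones
        completed.append(index)
    return completed


def commit_cycle(valid, executed, head):
    """One modelled cycle: select, determine, release in order. Returns the new head."""
    for index in determine_completion(select_candidates(valid, head), executed):
        valid[index] = False
        executed[index] = False
        head = (head + 1) % NUM_ENTRIES
    return head


valid = [False] * NUM_ENTRIES
executed = [False] * NUM_ENTRIES
head = 22
for i in (22, 23, 0, 1):     # four stored instructions, wrapping past entry 23
    valid[i] = True
for i in (22, 23, 1):        # entry 0 has not finished executing yet
    executed[i] = True

candidates = select_candidates(valid, head)
head = commit_cycle(valid, executed, head)
```

In the sketch, the two steps run one after the other within the same modelled cycle, which corresponds to the conventional arrangement in which both the selection and the determination must fit in a single clock period.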
In order to improve the performance of information processing equipment, it is necessary to increase the number of entries in a CSE and the number of entries that can be released at the same time in one cycle, as well as the clock frequency.
However, when the total number of entries in the CSE, the number of entries released simultaneously in one cycle, and the clock frequency are all increased, it will be very difficult to extract the entries for which instructions should be completed, determine the completion conditions, and release the entries, all within one cycle.
Thus, there is a problem in that the scale of the circuit for extracting entries for which instructions should be completed in one cycle will become larger as the number of entries in the CSE is increased.
Similarly, there is another problem in that, as the number of entries that should be released simultaneously in one cycle is increased, the quantity of circuits that should be controlled simultaneously and the number of circuit stages will become larger.
Moreover, because the clock frequency of the information processing equipment must be higher than that of conventional equipment, there is still another problem in that it will be very difficult to perform, within one cycle, the operation for determining the conditions to complete instructions, which was performed in only one cycle in the conventional equipment.
In particular, the instructions are stored in the CSE in order, the instructions are then executed out of order, and the entries in the CSE are then released in order, with the completion conditions being determined for all entries in the CSE. Consequently, if the completion of the instructions is delayed, the entries are not released smoothly. As a result, there is a problem in that the operational speed of the computer is reduced, since all entries in the CSE 219 are occupied with instructions and the decoder 205 cannot issue instructions.
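The effect described above, in which delayed completion keeps the entries occupied and prevents the decoder from issuing, can be illustrated with a small cycle-count sketch. All parameters below (a four-entry queue, an issue width of four, a release width of three, and the latency values) are hypothetical values chosen only to make the effect visible; they are not taken from the conventional equipment.

```python
from collections import deque


def count_decoder_stalls(capacity, issue_width, commit_width, latencies):
    """Count cycles in which nothing can be issued because the queue is full.

    latencies[i] is a hypothetical number of cycles that instruction i needs
    after issue before it is complete. capacity must be at least 1.
    """
    queue = deque()   # cycle at which each stored instruction completes, oldest first
    next_instr = 0
    cycle = 0
    stalls = 0
    while next_instr < len(latencies) or queue:
        # Release in order: only from the oldest entry, and only if it is complete.
        released = 0
        while queue and released < commit_width and queue[0] <= cycle:
            queue.popleft()
            released += 1
        # Issue in order while free entries remain.
        issued = 0
        while (issued < issue_width and next_instr < len(latencies)
               and len(queue) < capacity):
            queue.append(cycle + latencies[next_instr])
            next_instr += 1
            issued += 1
        if issued == 0 and next_instr < len(latencies):
            stalls += 1   # instructions are waiting but no entry is free
        cycle += 1
    return stalls


# Twelve instructions; in the first run the oldest one is slow to complete.
slow = count_decoder_stalls(4, 4, 3, [20] + [1] * 11)
fast = count_decoder_stalls(4, 4, 3, [1] * 12)
```

With these assumed values, the run with one slow oldest instruction accumulates stall cycles even though the younger instructions finished long before, while the run with uniformly short latencies does not stall.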