1. Field of the Invention
The present invention generally relates to a superscalar processor (i.e., a processor capable of executing two or more instructions at a time) design for a computing apparatus, and specifically to a superscalar processor for following and processing multiple instructions simultaneously.
More specifically, the invention is directed to a register renaming device which resolves inter-instruction operand dependencies within a superscalar processor. To achieve such an object, the invention provides a "source identifier" with a rename tag which reduces the number of compare operations needed in such a system and attendantly decreases chip size and increases chip speed and efficiency.
2. Description of the Related Art
Register renaming is a method, well known in the art, by which inter-instruction operand dependencies can be resolved in a superscalar processor. Other well-known methods include register scoreboarding and the like.
Renaming registers are commonly used in pipeline processing systems that operate with superscalar processors so that dependencies between instructions, which would otherwise slow processing, can be resolved. Oftentimes, the execution of an instruction must be postponed until the result operand of a previous instruction is derived. Therefore, many instructions are held in reservation stations awaiting the results of execution of other instructions.
The conventional renaming register checks the instruction operands for possible matches between instructions that require information and executed instructions that have the needed information. Each target operand of an instruction receives a rename tag that identifies the rename register that will contain the instruction's result.
To efficiently correlate a result with the instructions requiring that result, a structure must be established that keeps track of the renamed target and the architectural register represented by the renamed operand. Each source operand must be compared to this structure to determine whether an instruction in the pipeline will modify the architectural resource, such as a general purpose register (GPR), required by the source operand. If so, a "dependency" is said to exist.
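By way of illustration only, the tracking structure described above may be sketched in software as a map from architectural registers to rename tags. The class name `RenameMap`, its method names, and the use of sequential integer tags are assumptions made for this sketch and are not taken from any particular conventional design.

```python
from typing import Dict, Optional


class RenameMap:
    """Hypothetical model of the structure that pairs each renamed target
    with the architectural register it represents."""

    def __init__(self) -> None:
        # architectural register name -> rename tag of the newest pending writer
        self._pending: Dict[str, int] = {}
        self._next_tag = 0  # assumption: tags are handed out sequentially

    def rename_target(self, arch_reg: str) -> int:
        """Assign a rename tag to an instruction's target operand."""
        tag = self._next_tag
        self._next_tag += 1
        self._pending[arch_reg] = tag
        return tag

    def lookup_source(self, arch_reg: str) -> Optional[int]:
        """Return the rename tag a source operand depends on, or None
        when no instruction in the pipeline will modify the register."""
        return self._pending.get(arch_reg)


rename_map = RenameMap()
tag = rename_map.rename_target("r3")           # an instruction writes r3
dependency = rename_map.lookup_source("r3")    # a later instruction reads r3
no_dependency = rename_map.lookup_source("r5") # r5 has no pending writer
```

In this sketch a non-`None` lookup corresponds to the "dependency" described above.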
If there is a dependency, there are two possibilities. First, the instruction that modifies the resource may have executed. Then, the result would be stored in the rename register and could be provided immediately to the instruction needing the information.
Second, the instruction that is to modify this architectural resource may not have executed. In this case, the source operand receives the rename tag for the resource that it requires. This type of structure/method requires each instruction to broadcast the rename tag for its result as it executes, so that each waiting instruction can compare the broadcast rename tag with the rename tags associated with its own source operands and determine whether it requires this result. As mentioned below, depending on the architecture of the processor, this structure/method results in a relatively large number of compare operations (and attendant comparator circuits) which must be performed on each operand, and for each execution unit's result bus. Such a conventional system is disclosed in U.S. Pat. No. 5,345,569 to Tran, in which the system resolves dependencies in a reorder buffer and identifies whether an instruction is ready to be issued.
Thus, conventional systems must perform a relatively large number of compare operations and must provide a correspondingly large number of comparators to perform them. This increases chip size and decreases processing speed and efficiency.
Also, means are required for keeping track of which results are needed by an instruction (e.g., an instruction having a dependent operand). Conventionally, a plurality of comparator circuits associated with each instruction in a reservation station has been employed. The comparators have been used to interrogate results from each of the execution units. Thus, in an exemplary design having twelve (12) instructions in a reservation station, with five (5) rewritable fields per instruction, and six (6) execution units, a total of 360 comparators (12 × 5 × 6) is needed. Such a system has vastly reduced operating speed and dramatically increased chip size.
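The two possibilities above, and the broadcast of result tags, may be sketched as follows. This is a minimal software model offered for illustration only; the names `SourceOperand`, `dispatch_source`, and `broadcast`, and the representation of rename registers as a dictionary, are assumptions of the sketch rather than features of any cited system.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SourceOperand:
    rename_tag: Optional[int] = None  # tag awaited, if the producer has not executed
    data: Optional[int] = None        # operand value, once available

    @property
    def ready(self) -> bool:
        return self.data is not None


def dispatch_source(tag: int, rename_registers: Dict[int, int]) -> SourceOperand:
    """First possibility: the producer has executed, so the value is read
    from the rename register. Second possibility: the operand holds the tag."""
    if tag in rename_registers:
        return SourceOperand(data=rename_registers[tag])
    return SourceOperand(rename_tag=tag)


def broadcast(result_tag: int, result_data: int,
              waiting: List[SourceOperand]) -> int:
    """Compare a broadcast result tag with every operand still waiting.
    Returns the number of compare operations performed."""
    compares = 0
    for operand in waiting:
        if operand.ready:
            continue
        compares += 1
        if operand.rename_tag == result_tag:
            operand.data = result_data
            operand.rename_tag = None
    return compares


rename_registers = {7: 99}                   # the producer of tag 7 has executed
first = dispatch_source(7, rename_registers)  # value is available immediately
second = dispatch_source(8, rename_registers) # must wait for tag 8
compare_ops = broadcast(8, 5, [first, second])
```

Every broadcast is compared with every operand that is still waiting, which is the source of the large compare count discussed in this section.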
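The comparator count in the exemplary design follows from a simple product, which may be checked with the short sketch below. The function name `comparator_count` is hypothetical, and the sketch assumes, as the example does, one comparator per rewritable field per execution unit.

```python
def comparator_count(instructions: int,
                     fields_per_instruction: int,
                     execution_units: int) -> int:
    """Comparators needed when every rewritable field of every instruction
    in the reservation station watches every execution unit's result."""
    return instructions * fields_per_instruction * execution_units


# Exemplary design from the text: 12 instructions, 5 fields, 6 execution units.
total = comparator_count(12, 5, 6)
print(total)  # 360
```

Because the count is a product, adding reservation station entries or execution units increases the number of comparators multiplicatively.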
FIG. 4 illustrates an exemplary conventional system which utilizes a comparator for each result bus at every operand position in the reservation station. FIG. 4 illustrates a source operand compare circuit 40. The reservation station must include one source operand compare circuit 40 for each source operand. Therefore, if there are nine (9) instructions and each instruction has two source operands, the reservation station requires 18 source operand compare circuits 40.
In FIG. 4, execution units 43 produce result data and a result tag 42. A comparator 41 is required for each execution unit 43 if each execution unit produces one result tag 42. Each of the result tags 42 is compared to a rename tag 44 contained in a rename tag field 45 of the source operand. The result of each of these compare operations is input to an OR function unit 46 (e.g., an eight-way OR function unit), to direct the associated execution unit result data to a source operand data field 47.
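The behavior of one source operand compare circuit 40 may be approximated in software as shown below. This is an illustrative sketch only; the function name `source_operand_compare` and the representation of each execution unit's output as a (result tag, result data) pair are assumptions, and the reference numerals in the comments merely map the sketch onto the elements of FIG. 4.

```python
from typing import List, Optional, Tuple


def source_operand_compare(
    rename_tag: int,
    results: List[Tuple[int, int]],
) -> Tuple[bool, Optional[int]]:
    """Model one compare circuit 40 for a single source operand.

    `results` holds one (result tag 42, result data) pair per execution
    unit 43. Returns (match found, data for source operand data field 47).
    """
    # One comparator 41 per execution unit: result tag 42 vs. rename tag 44.
    match_lines = [result_tag == rename_tag for result_tag, _ in results]

    # OR function unit 46 combines the comparator outputs.
    any_match = any(match_lines)

    # Direct the matching unit's result data to the operand data field 47.
    data = None
    for hit, (_, result_data) in zip(match_lines, results):
        if hit:
            data = result_data
            break
    return any_match, data


unit_results = [(3, 30), (5, 50), (9, 90)]  # three hypothetical execution units
hit = source_operand_compare(5, unit_results)
miss = source_operand_compare(4, unit_results)
```

The list `match_lines` has one entry per execution unit, mirroring the requirement of one comparator 41 per execution unit 43 for every source operand.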
In sum, in a superscalar processor design with register renaming, each source operand of an instruction in the reservation station must have a comparator circuit for each of the execution units that can provide results that the instruction may depend upon. Thus, the conventional design typically requires a large number of comparators.