1. Technical Field
The present invention relates in general to comparator circuitry, and in particular to a high speed greater than or equal to compare circuit. Still more particularly, the present invention relates to a high speed greater than or equal to compare circuit utilized in a microprocessor dynamic branch prediction system.
2. Description of the Related Art
There are significant uses for high speed greater than or equal to compare circuits in high speed microprocessors. In particular, the greater than or equal to compare circuit is a key element of dynamic branch prediction. In high speed microprocessors, a branch target buffer stores possible branch addresses based on recent instructions executed by the processor. The data read from the branch target buffer predicts whether the next instruction is a branch and, if it is a branch operation, where the branch will probably jump to. This allows instructions to be executed continuously, rather than stopping the flow of instructions to find the branch address data when a branch is reached.
A fundamental component of the dynamic branch prediction is determining whether the potential branch address stored in the branch target buffer is greater than or equal to the current address. If it is determined that the predicted branch address is less than the current address, this potential rapidly, since this calculation is in series with the process of finding the next instruction to be read.
Historically, the prior art has performed greater than or equal to comparisons utilizing a static logic implementation. More recently, the prior art has used dynamic logic greater than or equal to compare circuits because of their enhanced speed. An example of a conventional differential cascode voltage switch (DCVS) dynamic logic circuit is shown in FIG. 1. This DCVS circuit performs a greater than or equal to compare using CMOS transistors in a "series mode" that takes about one nanosecond to produce its output and has seven logic gate delays. This series compare operation is also referred to as a domino compare, and it is well known to those skilled in the art.
As seen in FIG. 1, 5 bits of an X address (X(0:4)) and 5 bits of a Y address (Y(0:4)) are compared, and the result shows whether the X address is greater than or equal to the Y address. With 4 being the high order bit, the X4 and Y4 bits are XNORed (XNOR gate 10) and then ANDed (AND gates 12-18) with the XNOR (XNOR gates 20-26) of each succeeding pair. If the result from AND gate 18 is a logic high (1), then the X and Y addresses are equal.
To perform the greater than compare, the high order bit X4 and the complement of the high order bit Y4 are input into AND gate 28, which results in an output of 1 only when X4 is greater than Y4. The remaining lower order bits are compared in 3-input AND gates 30-36. The inputs to AND gate 30 are the output of XNOR gate 10, X3, and the complement of Y3, resulting in a high output only when X4 and Y4 are equal and X3 is greater than Y3. Each of the AND gates 32-36 receives a corresponding X bit, a corresponding Y bit complement, and the output of AND gates 12-16, respectively, resulting in an output of 1 from one of the AND gates 32-36 if the corresponding X bit is greater than the corresponding Y bit and all the preceding bits have been equal. The outputs from AND gates 28-36 are input into OR gates 38-44, with a high output from OR gate 44 indicating that the X address is greater than the Y address. The outputs of AND gate 18 and OR gate 44 are ORed together at OR gate 46 to produce a greater than or equal to result.
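The gate network described above can be modeled in software to check its logic. The following is a minimal Python sketch, not the circuit itself: the function names (`xnor`, `ge_compare_serial`, `to_bits`, `check_all`) are hypothetical, the bit lists are indexed so that index 4 is the high order bit, and the mapping of loop steps to the numbered gates of FIG. 1 is inferred from the description rather than taken from the figure.

```python
from itertools import product


def xnor(a, b):
    """Return 1 when the two bits are equal, else 0."""
    return 1 if a == b else 0


def ge_compare_serial(x, y, width=5):
    """Bit-serial model of the compare described for FIG. 1 (assumed topology).

    x and y are lists of 0/1 values indexed by bit position; index
    width - 1 is the high order bit.
    """
    hi = width - 1
    eq = xnor(x[hi], y[hi])       # XNOR gate 10: high order bits equal
    gt = x[hi] & (1 - y[hi])      # AND gate 28: X4 AND (NOT Y4)
    for i in range(hi - 1, -1, -1):
        # 3-input AND (gates 30-36): all higher bits equal, X bit set,
        # Y bit clear; then ORed into the running result (gates 38-44).
        gt = gt | (eq & x[i] & (1 - y[i]))
        # Extend the equality chain (XNOR gates 20-26, AND gates 12-18).
        eq = eq & xnor(x[i], y[i])
    return gt | eq                # OR gate 46: greater than OR equal


def to_bits(value, width=5):
    """Split an integer into a list of bits, index 0 = low order bit."""
    return [(value >> i) & 1 for i in range(width)]


def check_all(width=5):
    """Compare the model against integer >= for every pair of inputs."""
    for a, b in product(range(1 << width), repeat=2):
        got = ge_compare_serial(to_bits(a, width), to_bits(b, width), width)
        if got != int(a >= b):
            return False
    return True
```

Calling `check_all()` exercises all 1,024 pairs of 5-bit addresses against ordinary integer comparison, which is a convenient way to confirm that the AND/OR chain as described yields a greater than or equal to result.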
It will be appreciated by those skilled in the art that the greater than or equal to compare of FIG. 1 is highly serial in nature. The worst case path has seven logic gate delays. Even after Boolean reduction is performed and the DCVS dynamic circuit is optimized to a standard load, it will have a delay of 1.0 nanoseconds and be approximately 12,000 square microns in size. A static implementation of the circuit in FIG. 1 would probably take up a slightly larger area and be at least 20% slower. Therefore, it would be desirable to provide a greater than or equal to compare circuit that is of substantially greater speed than the prior art designs, while consuming the same amount of chip area as the dynamic or static implementations used in the prior art.
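The seven gate delay figure can be reproduced with a simple depth count. The Python sketch below is illustrative only and rests on several assumptions that are not stated in the text: every gate contributes one unit of delay, the inverters that form the Y bit complements are not counted, and the OR gates 38-44 are chained one after another. The function name `worst_case_depth` is hypothetical.

```python
def worst_case_depth(width=5):
    """Count gate levels on the longest path of the serial compare.

    Assumes unit delay per gate, uncounted inverters, and a chained
    arrangement of the OR gates (assumptions, not taken from FIG. 1).
    """
    eq_depth = 1    # XNOR gate on the high order bit pair
    gt_depth = 1    # AND gate on the high order bit pair
    for _ in range(width - 1):
        term_depth = eq_depth + 1                  # 3-input AND fed by the equality chain
        gt_depth = max(gt_depth, term_depth) + 1   # next OR gate in the chain
        eq_depth = eq_depth + 1                    # next AND gate in the equality chain
    return max(gt_depth, eq_depth) + 1             # final OR of "greater" and "equal"


print(worst_case_depth(5))  # prints 7
```

Under these assumptions the depth grows by one gate level for each additional address bit, which illustrates why the serial arrangement becomes a bottleneck on the path that selects the next instruction.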