The present disclosure relates generally to a method and apparatus for mirroring units within a processor, and particularly to a method and apparatus for mirroring instruction and execution units within a processor for implementing error detection hardware and for preserving valuable real estate at the processor core level.
Errors that occur in computer hardware may be transient errors, which occur once at random and do not recur, or they may be “hard” errors, such as when a hardware component breaks and stays broken. Given that hardware can have errors, it is necessary that these errors be detectable. The duplication of instruction and execution units (I-units and E-units, respectively) within the core of a processor chip of a computer system to provide fault detection is well known, where the duplicated units include duplicate instances referred to as base-units and mirrored-units. The outputs of each of these units are sent to a recovery unit (R-unit), where the values of both are compared. A mismatch indicates a hardware fault, and the appropriate error recovery action is taken. The outputs of the base and mirror units are also compared in a buffer control element (BCE), with detected errors being forwarded to the R-unit to initiate the appropriate recovery action.
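By way of illustration only, the duplicate-and-compare arrangement described above may be modeled in software as follows. This is a minimal sketch and not an actual hardware implementation: the names `HardwareFault`, `compare_outputs`, `run_duplicated`, and the `mirror_fault_mask` parameter are hypothetical and are introduced solely to show how a mismatch between a base result and a mirrored result may be flagged.

```python
from typing import Callable, Tuple


class HardwareFault(Exception):
    """Hypothetical signal raised when base and mirrored results disagree."""


def compare_outputs(base_result: int, mirror_result: int) -> int:
    """Model of the comparison step: pass the value through only on a match."""
    if base_result != mirror_result:
        raise HardwareFault(
            f"mismatch: base={base_result:#x}, mirror={mirror_result:#x}"
        )
    return base_result


def run_duplicated(
    operation: Callable[..., int],
    operands: Tuple[int, ...],
    mirror_fault_mask: int = 0,
) -> int:
    """Run the same operation in a modeled base unit and mirrored unit.

    mirror_fault_mask is an assumption used only to inject a bit flip into
    the mirrored result so that the detection path can be exercised.
    """
    base_result = operation(*operands)
    mirror_result = operation(*operands) ^ mirror_fault_mask
    return compare_outputs(base_result, mirror_result)


def add(a: int, b: int) -> int:
    return a + b


# Fault-free case: both copies agree, so the result is passed through.
result = run_duplicated(add, (2, 3))

# Faulty case: a single flipped bit in the mirrored copy is detected.
try:
    run_duplicated(add, (2, 3), mirror_fault_mask=0x1)
    fault_detected = False
except HardwareFault:
    fault_detected = True
```

In this sketch the comparison identifies only that the two copies disagree, not which copy is in error, which is why a separate recovery action is needed after a mismatch.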
In a processor that implements error detection, the first goal should be protecting the integrity of the data. That is to say, the processor should not allow a “wrong” answer to propagate undetected. At the very least, the processor should checkstop, or present a machine check to the operating system to inform it that an error has been detected. More sophisticated processors will implement some type of recovery scheme, such that when an error is detected, the processor will back up to the last known good instruction and retry the failing operation. The hardware constructs required to provide this level of detection come at a cost in terms of extra circuits, which impacts wireability and cycle time. Some processors intersperse the error detection logic with the functional logic. An undesirable result of this implementation is that the required silicon area increases with the amount of error detection. Also, some of the error detection logic can be quite complex, which greatly adds to the development time and cost. To overcome these disadvantages, some processors duplicate sections of logic, and even duplicate entire functional units. In a duplicate implementation, the surrounding units look for discrepancies in the results generated by the duplicated units. This duplicate implementation is desirable in that it decreases complexity and thereby decreases development time, but it comes at the cost of increased silicon area, where full duplication will double the silicon area required. Since the duplicated units each need to communicate with the other functional units, they must all be floorplanned close together. This increases wire congestion in the core of the processor, increases wire length, and decreases processor frequency.
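The recovery scheme described above, in which the processor backs up to the last known good instruction and retries the failing operation, may be sketched in software as follows. This is an illustrative model under stated assumptions, not the recovery logic of any particular processor: the names `Checkstop`, `run_with_recovery`, `make_transient_mirror`, and `hard_fault_mirror`, the dictionary used for register state, and the `max_retries` limit are all hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, int]
Instruction = Callable[[State], State]
MirrorUnit = Callable[[Instruction, State], State]


class Checkstop(Exception):
    """Hypothetical signal raised when an error persists after retry."""


def run_with_recovery(
    instructions: List[Instruction],
    state: State,
    mirror_unit: MirrorUnit,
    max_retries: int = 1,
) -> State:
    """Execute each instruction in a base and a mirrored unit.

    The checkpoint holds the last known good state. It advances only when
    both results match; on a mismatch the instruction is retried from it.
    """
    checkpoint = dict(state)
    for index, instruction in enumerate(instructions):
        attempts = 0
        while True:
            base_result = instruction(dict(checkpoint))
            mirror_result = mirror_unit(instruction, dict(checkpoint))
            if base_result == mirror_result:
                checkpoint = base_result  # commit: new last known good state
                break
            attempts += 1
            if attempts > max_retries:
                raise Checkstop(f"instruction {index} still failing after retry")
    return checkpoint


def make_transient_mirror() -> MirrorUnit:
    """Mirror model that corrupts register r1 once, then behaves correctly."""
    fired = {"done": False}

    def mirror(instruction: Instruction, state: State) -> State:
        result = instruction(state)
        if not fired["done"]:
            fired["done"] = True
            result["r1"] ^= 1  # injected single-bit transient error
        return result

    return mirror


def hard_fault_mirror(instruction: Instruction, state: State) -> State:
    """Mirror model that corrupts register r1 on every execution."""
    result = instruction(state)
    result["r1"] ^= 1
    return result


def increment(state: State) -> State:
    """Example instruction: r1 = r0 + 1."""
    return {**state, "r1": state["r0"] + 1}


# A transient error is detected, retried, and recovered from.
recovered = run_with_recovery([increment], {"r0": 1, "r1": 0}, make_transient_mirror())
```

With `hard_fault_mirror` in place of the transient model, the mismatch persists on retry and the sketch raises `Checkstop`, corresponding to the minimum response of stopping rather than allowing a wrong answer to propagate.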
As cycle time requirements of the processor become more and more aggressive, with clock frequencies in excess of 1 gigahertz (GHz), the connecting wires between the mirror units, which are used only for error checking, and other units must be short, thereby requiring that the mirror units be floorplanned at the core level close to the base units, R-unit, and BCE. Also, the mirror units along with the base units must be floorplanned in the middle of the processor core. As a result, it is becoming more and more difficult to manage the resulting wire congestion at the core level. Accordingly, there is a need in the art for an improved method and apparatus for mirroring units within a processor.