Error-detection and error-recovery features may improve the reliability of conventional microprocessors. However, these features are typically designed to catch permanent faults, such as stuck-at faults and electromigration issues, or soft errors caused by cosmic rays. Delay faults, on the other hand, are typically handled by imposing performance and power guardbands on microprocessor operation. Error-detection and error-recovery features designed to address delay faults may therefore allow for significantly less restrictive guardbands.
Specialized circuits for detecting delay faults may be placed in the critical paths of a microprocessor. The “Razor” technique, for example, samples the path data twice per cycle to detect late-arriving data. More particularly, the Razor technique samples the incoming data on the rising clock edge using a standard datapath flip-flop, and samples the data again on the falling clock edge using a shadow latch. If the two samples differ, a delay error has occurred; the pipeline is flushed and the instruction is repeated. This technique allows a microprocessor to run at a frequency very close to its maximum frequency. See “Razor: A Low-Power Pipeline Based on Circuit-Level Timing Speculation”, MICRO-36, December 2003.
However, operating the microprocessor at its maximum frequency may cause the datapath flip-flops to become metastable, because the data may still be changing within a flip-flop's setup and hold window when it is sampled. Metastable datapath flip-flops may in turn cause an error that the double sampling does not detect. These undetected errors are unacceptable. Addressing such errors via a metastability detector or a metastable-hardened flip-flop is prohibitively expensive in terms of area and power.
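The double-sampling check described above can be illustrated with a small behavioral sketch in Python. It is not taken from the Razor paper or from any real design flow: the names `razor_sample` and `RazorResult`, the 50% duty cycle, and the single `arrival_time` parameter are assumptions made for illustration. The model is purely digital, so it deliberately cannot represent the metastable case discussed above, and it also reports no error when the data arrives after the falling edge.

```python
from dataclasses import dataclass


@dataclass
class RazorResult:
    main_value: int    # value captured by the datapath flip-flop (rising edge)
    shadow_value: int  # value captured by the shadow latch (falling edge)
    error: bool        # True when the two samples disagree


def razor_sample(old_value, new_value, arrival_time, clock_period):
    """Hypothetical one-cycle model of double sampling.

    arrival_time is the time, measured from the launching clock edge, at
    which the path output switches from old_value to new_value.
    Assumption: 50% duty cycle, so the falling edge used by the shadow
    latch occurs half a period after the capturing rising edge.
    """
    rising_edge = clock_period
    falling_edge = 1.5 * clock_period

    # Each storage element sees the new value only if it arrived in time.
    main_value = new_value if arrival_time <= rising_edge else old_value
    shadow_value = new_value if arrival_time <= falling_edge else old_value

    # A mismatch between the two samples flags a delay error.
    return RazorResult(main_value, shadow_value, main_value != shadow_value)


on_time = razor_sample(old_value=0, new_value=1, arrival_time=0.8, clock_period=1.0)
late = razor_sample(old_value=0, new_value=1, arrival_time=1.2, clock_period=1.0)
print(on_time.error, late.error)  # prints: False True
```

In the late case the flip-flop holds the stale value while the shadow latch holds the correct one, which is the condition that would trigger the pipeline flush and instruction replay.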