Increased Reliability, Availability and Serviceability (RAS) requirements, or other ratings, for servers, desktops and other computers may increase the need for low-cost error detection schemes. RAS requirements may result from intrinsic needs of certain market segments (e.g., mission-critical application domains such as aviation, medical and financial transaction processing) as well as from projections that future complex designs will need to be more reliable.
The ability to integrate complex cores that may be both homogeneous (e.g., multi-core and/or many-core) and heterogeneous (e.g., system-on-a-chip (SOC)) may result in increased complexity and cost in circuit design verification, validation and/or testing. In conjunction with possibly less reliable manufacturing processes of the future (e.g., due to higher device sensitivity to process variations) and an inability to test for and validate against all manufacturing defects and design errors prior to shipment (e.g., at time-0), field failures may increase. As a result, field failures in circuits may be inevitable and may need to be detected and corrected in the field (e.g., while the system is running) in a user-transparent fashion. Concurrent Error Detection (CED) mechanisms may use a monitor to detect a malfunction of a system while the system is running. When an error is detected, several steps may be taken to correct the error.
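By way of illustration only, the monitoring idea behind CED may be sketched in software as a wrapper that checks every result as it is produced. The sketch below assumes duplication-and-compare as the checking method, which is one well-known CED approach and is not necessarily the mechanism contemplated here. The names `monitored`, `good_and` and `faulty_and`, and the injected defect, are hypothetical.

```python
from typing import Callable, Tuple

BinaryOp = Callable[[int, int], int]


def monitored(unit: BinaryOp, reference: BinaryOp) -> Callable[[int, int], Tuple[int, bool]]:
    """Wrap `unit` so each call is compared against `reference` while it runs."""
    def run(a: int, b: int) -> Tuple[int, bool]:
        out = unit(a, b)
        error = out != reference(a, b)  # the monitor: compare the two results
        return out, error
    return run


def good_and(a: int, b: int) -> int:
    """Fault-free bitwise AND, used here as the redundant reference copy."""
    return a & b


def faulty_and(a: int, b: int) -> int:
    """Hypothetical defective copy: output bit 0 behaves as if stuck at 1."""
    return (a & b) | 0b1


checked = monitored(faulty_and, good_and)
print(checked(0b1100, 0b1010))  # fault is activated, so the error flag is raised
print(checked(0b0011, 0b0001))  # correct bit 0 is already 1, so no error is visible
```

The second call shows that a defect may only be observable for inputs that activate it, which is one reason detection may need to continue in the field rather than end at time-0.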
A fault model may be used to analyze (e.g., using simulation) the effect of defects (e.g., physical and/or silicon defects) in a circuit. Techniques exist for detecting faults and/or errors in a datapath of a circuit (e.g., using residue codes) at a relatively low cost. However, some techniques for detecting faults and/or errors in random control logic are inefficient and costly. Much research effort in the past three decades has focused on finding CED techniques for random control logic that guarantee 100% detection of single stuck-at faults. Typically, these techniques require very high area overhead. The single stuck-at fault model is widely used for evaluating the effectiveness of an error detection technique. The model may assume that one signal in a circuit is “stuck” at 1 or 0 and that the signal value does not change with time. Partial protection of hardware is a paradigm that is gaining importance in the industry. A partial protection scheme attempts to protect the most important parts of a design at low cost.
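As a minimal sketch of how a single stuck-at fault model may be used to evaluate a datapath check, the following example forces one sum bit of a small adder to 0 or 1 and counts how many of the resulting erroneous sums a residue check flags. The 4-bit operand width, the modulus of 3, the restriction of faults to the adder's output bits, and all function names are assumptions made for illustration. The sketch says nothing about random control logic, where such checks are harder to apply.

```python
from itertools import product

WIDTH = 4    # operand width in bits (assumed for the example)
MODULUS = 3  # residue check modulus (assumed for the example)


def stuck_at(value: int, bit: int, level: int) -> int:
    """Force one bit of `value` to `level` (0 or 1), modelling a stuck-at fault."""
    mask = 1 << bit
    return value | mask if level else value & ~mask


def residue_check_fails(a: int, b: int, observed_sum: int) -> bool:
    """True when the residue of the observed sum disagrees with the prediction."""
    predicted = (a % MODULUS + b % MODULUS) % MODULUS
    return observed_sum % MODULUS != predicted


def coverage() -> dict:
    """Map each (bit, level) fault to (erroneous outputs, outputs flagged by the check)."""
    results = {}
    # The sum of two WIDTH-bit operands has WIDTH + 1 bits.
    for bit, level in product(range(WIDTH + 1), (0, 1)):
        errors = detected = 0
        for a, b in product(range(1 << WIDTH), repeat=2):
            good = a + b
            bad = stuck_at(good, bit, level)
            if bad != good:  # the fault only matters when it changes the output
                errors += 1
                detected += residue_check_fails(a, b, bad)
        results[(bit, level)] = (errors, detected)
    return results


for fault, (errors, detected) in sorted(coverage().items()):
    print(fault, errors, detected)
```

Under these assumptions every erroneous sum is flagged, because a single faulty bit changes the sum by a power of two, and no power of two is divisible by 3.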
It will be appreciated that for simplicity and clarity of illustration, elements shown in the drawings have not necessarily been drawn accurately or to scale. For example, the dimensions of some of the elements may be exaggerated relative to other elements for clarity, or several physical components may be included in one functional block or element. Further, where considered appropriate, reference numerals may be repeated among the drawings to indicate corresponding or analogous elements. Moreover, some of the blocks depicted in the drawings may be combined into a single function.