Most general-purpose digital computers provide a system for detecting and handling single-bit or multiple-bit parity errors. The occurrence of parity errors is not uncommon when data signals are being read from storage devices such as static random access memories (SRAMs) and dynamic random access memories (DRAMs). This is especially true when high-density memories are employed, as is generally the case in large data processing systems.
Many factors contribute to the occurrence of parity errors. Contaminants such as dust become proportionately larger relative to the dimensions of the individual transistors employed within high-density SRAMs and DRAMs, and are therefore more likely to cause latent defects that result in parity errors. Alpha particles can also cause parity errors. Alpha particles are randomly generated, positively charged nuclear particles that originate from several sources, including cosmic rays that come from outer space and constantly bombard the earth, and the decay of naturally occurring radioisotopes such as radon, thorium, and uranium. Concrete buildings and lead-based products such as solder, paint, ceramics, and some plastics are all well-known alpha emitters. Smaller-geometry storage devices can be adversely affected by the emission of alpha particles, resulting in a higher incidence of parity errors.
In addition to the problems associated with alpha particles and other environmental contaminants, shrinking technology sizes contribute to the occurrence of parity errors. Manufacturing tolerances decrease as geometries decrease, making latent defects more likely to occur. This is particularly true when minimum feature sizes decrease below 0.5 microns.
RAMs of any type are susceptible to the error conditions discussed above. This includes control store RAMs of the type often employed to control logic sequencers. It is common, for example, to utilize one or more control store RAMs to control various logic sections of an instruction processor. For instance, consider an instruction decode circuit that is designed to decode an instruction opcode in preparation for instruction execution. The decode circuit may include a control store RAM that stores control signals that may be employed as decoded instruction signals. Specifically, the opcode is presented as an address to the control store RAM. Data read from the RAM may then be used as the decoded instruction to control instruction execution.
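The opcode-as-address lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the name `CONTROL_STORE`, the function `decode`, the four-bit control words, and the opcode values are all hypothetical and do not come from any particular instruction processor.

```python
# Hypothetical control store RAM, modeled as a list indexed by opcode.
# Each entry is a control word whose bits stand in for decoded
# instruction signals. All contents here are made up for illustration.
CONTROL_STORE = [
    0b0001,  # address 0x0: hypothetical control word
    0b0010,  # address 0x1
    0b0100,  # address 0x2
    0b1000,  # address 0x3
]

def decode(opcode: int) -> int:
    """Present the opcode as an address and return the stored control word."""
    return CONTROL_STORE[opcode]

# Decoding opcode 0x2 simply reads the word stored at address 2.
control_word = decode(0x2)

# Rewriting the stored data changes how the same opcode is decoded,
# without any change to the decode logic itself.
CONTROL_STORE[0x2] = 0b0110
patched_word = decode(0x2)
```

Because the decoded signals are just stored data, a corrupted bit in a control word flows directly onto the control lines, which is why errors in this kind of RAM are especially disruptive.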
Using control store RAMs in the foregoing manner adds flexibility to a logic design. Control over the hardware can be altered by modifying the data stored within the RAMs. As is known in the art, this can be accomplished using a serial scan-set interface, for example. This allows a logic designer to readily accommodate unforeseen changes and correct design mistakes. However, as discussed above, these types of devices are prone to parity errors.
One way to detect parity errors is through the use of parity bits, as is known in the art. A detected error may be reported to a maintenance processor, operating system, or other error-handling system, which then initiates some type of recovery action.
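As a minimal sketch of parity-based detection, the following Python fragment stores one even-parity bit alongside a data word and rechecks it on read. The function names and the eight-bit sample word are assumptions chosen for illustration. The example also shows a well-known limitation: a single parity bit detects any odd number of flipped bits but misses an even number.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word contains an odd number of 1 bits."""
    return bin(word).count("1") % 2

def has_parity_error(word: int, stored_parity: int) -> bool:
    """Recompute parity on read and compare it with the stored bit."""
    return parity_bit(word) != stored_parity

data = 0b10110010                 # sample word with four 1 bits
stored = parity_bit(data)         # parity bit written with the data

single_flip = data ^ 0b00000100   # one bit corrupted
double_flip = data ^ 0b00010100   # two bits corrupted

clean_read = has_parity_error(data, stored)          # False
single_read = has_parity_error(single_flip, stored)  # True: detected
double_read = has_parity_error(double_flip, stored)  # False: missed
```

Note that detection alone identifies that the word is bad, not which bit is wrong, so the word cannot be repaired from the parity bit.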
Although using parity bits to detect errors provides a relatively straightforward approach to the foregoing problems, this mechanism is not considered optimal for many control systems that employ control store RAMs. This is because by the time an error is detected in the data word, that error has generally propagated to one or more control lines. As such, operation must often be halted almost immediately so that the error condition can be analyzed and recovery actions can be initiated. This degrades performance and decreases system resiliency.
Another approach to detecting parity errors involves using an Error Correction Code (ECC). According to this mechanism, check bits provide a code that can be used to detect, and subsequently correct, a parity error. This is desirable where control store RAMs are concerned, since corrected RAM data is then available to control logic sequences, allowing execution to temporarily continue unaffected. The error can be addressed at a later time when the system is configured for error analysis and recovery.
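To illustrate how check bits can locate and correct an error, the following sketch uses a Hamming(7,4) code, a much smaller relative of the codes used on wide memory words. It is offered only as an example of the general technique, not as the code any particular control store employs; the function names are hypothetical. Check bits sit at positions 1, 2, and 4 of the codeword, and the syndrome computed on read gives the position of a single flipped bit.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    code = [0] * 8                      # index 0 unused
    code[3], code[5], code[6], code[7] = data
    for p in (1, 2, 4):                 # check-bit positions
        # Each check bit covers the positions whose index has bit p set.
        code[p] = sum(code[i] for i in range(1, 8) if i & p) % 2
    return code[1:]

def hamming74_correct(codeword):
    """Return (corrected data bits, syndrome); syndrome 0 means no error."""
    code = [0] + list(codeword)
    syndrome = 0
    for p in (1, 2, 4):
        if sum(code[i] for i in range(1, 8) if i & p) % 2:
            syndrome |= p
    if syndrome:
        code[syndrome] ^= 1             # flip the bit the syndrome points to
    return [code[3], code[5], code[6], code[7]], syndrome

original = [1, 0, 1, 1]
stored = hamming74_encode(original)

corrupted = list(stored)
corrupted[4] ^= 1                       # flip position 5 (list index 4)

recovered, syndrome = hamming74_correct(corrupted)
```

Here `recovered` equals `original` and `syndrome` is 5, the position of the flipped bit, so the corrected data remains usable while the error is reported for later analysis.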
One problem with using an ECC mechanism to detect a parity error is that, in general, a relatively large number of check bits is required to detect and correct an error. For example, a typical ECC scheme that is applied to computer memories is the Single Error Correcting/Double Error Detecting (SEC/DED) type of code, which requires eight check bits to correct an error in a sixty-four bit word. This type of code is referred to as a “[64,72]” ECC code. These check bits must be stored along with the data word. However, because control store RAMs may be hundreds of bits wide, storing the check bits required to perform error correction substantially increases the RAM width. For example, a RAM that is three hundred twenty bits wide and employs five [64,72] SEC/DED codes to provide ECC coverage requires forty check bits to be stored. This may increase the size of the RAM beyond what is acceptable for the particular control store device application.
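The check-bit arithmetic above can be reproduced with a short calculation. The sketch below assumes the SEC/DED code is an extended Hamming code, for which the number of check bits is the smallest r satisfying 2^r >= k + r + 1 for k data bits, plus one additional overall parity bit for double-error detection. The function and variable names are hypothetical.

```python
def sec_ded_check_bits(data_bits: int) -> int:
    """Check bits for an extended Hamming SEC/DED code over data_bits."""
    r = 1
    # Smallest r such that r check bits can point at any single-bit
    # error position among the data and check bits, or signal no error.
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1  # one extra overall parity bit adds double-error detection

WORD_BITS = 64     # data bits covered by each code word
RAM_WIDTH = 320    # control store width from the example above

per_word = sec_ded_check_bits(WORD_BITS)   # 8 check bits per 64-bit word
segments = RAM_WIDTH // WORD_BITS          # 5 code words across the RAM
total_check_bits = per_word * segments     # 40 check bits in all
overhead = total_check_bits / RAM_WIDTH    # 0.125, i.e. 12.5% wider
```

Under these assumptions the 320-bit RAM grows to 360 bits, which matches the forty check bits cited in the example.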
What is needed, therefore, is an improved system and method for detecting, then correcting, errors in a control store RAM that addresses the foregoing problems.