1. Field of the Present Invention
The present invention generally relates to the field of error detection schemes and more particularly to a method and circuit for testing error correction circuitry (ECC) associated with a computer system.
2. History of Related Art
The rapid pace at which the price to performance ratio of microprocessor based computer systems such as personal computers has improved since the 1980s has made such systems a viable choice in a variety of higher end consumer, business, and scientific applications that were previously served exclusively by more costly mainframe computers and workstations. As this smaller class of computers is increasingly used for enterprise systems (i.e., installed in application intensive environments or used as the backbone of local area networks), the reliability of these machines has become an increasingly important market concern. Indeed, enterprise system consumers demand essentially zero down time.
To achieve the level of reliability required to compete in the microprocessor based computer system market, enterprise systems manufacturers have devoted greater consideration to techniques for improving reliability. While fundamental techniques for improving reliability by detecting randomly generated data errors, such as the use of parity bits or error correction circuitry, have been well known for some time, the use of these techniques in the price sensitive market for microprocessor based systems was until recently thought not to be cost effective. Manufacturers assumed, probably correctly, that the relatively infrequent occurrence of a single bit or multiple bit error in personal computers would be tolerated by the consumer, especially if the alternative was a higher priced system and the originating cause of the error could not be determined with precision, thereby permitting speculation that the application or operating system software caused the error. Such lack of concern about system reliability has, however, essentially vanished with the advent of a huge market for low cost, high performance, and highly reliable machines. For example, error correction circuitry is now thought to be a checklist item for all but the lowest end of network servers.
The basic operation of ECC in a computer system is widely known. When data is written to a memory location, the computer system generates additional information known as check bits, which are stored along with the data. The check bits are generated based on a Hamming code or other suitable algorithm to be indicative of the data stored in the memory location. When the contents of the memory location are subsequently read by the computer system, the ECC regenerates the check bits and compares the check bits generated during the read operation with the check bits that were generated during the write operation. Any variation between the check bits generated during the read operation (the expected check bits) and the check bits generated during the write operation (the actual check bits) indicates an error in the data. In a typical implementation of ECC, single bit errors are detected and corrected while double bit failures are detected, but not corrected. The ability of ECC to correct single bit errors represents an advantage of ECC over parity based systems, which are capable of detecting but not correcting single bit errors and are entirely unable to detect certain double bit errors. Until the emergence of 64 bit data paths, however, parity based error checking systems were frequently preferred, primarily because typical implementations of parity checking in 32 bit data bus systems required only 4 parity bits, whereas ECC required 7 check bits for the same 32 bit systems. Thus ECC required 75% more error detection memory than parity based systems. In addition, the parity system's inability to detect double bit failures was not considered significant because of the widely held belief that double bit errors were so rare that they could be treated as essentially non-existent. With the arrival of 64 bit and wider data busses, however, coupled with the increased demand for reliability, the assumption that double bit failures do not exist is no longer acceptable.
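By way of illustration only, the write and read operations described above can be sketched in software. The sketch below assumes an 8 bit data word, a Hamming code, and one additional overall parity bit used for double bit detection; the function names (data_positions, make_check_bits, read_word) are hypothetical and do not describe any particular circuit.

```python
def data_positions(n):
    """Codeword positions (1-based) for n data bits, skipping powers of two,
    which a Hamming code reserves for the check bits."""
    positions, p = [], 1
    while len(positions) < n:
        if p & (p - 1):  # nonzero only when p is not a power of two
            positions.append(p)
        p += 1
    return positions


def ones(value):
    """Number of 1 bits in a non-negative integer."""
    return bin(value).count("1")


def make_check_bits(data, n=8):
    """Write side: return (check, overall) for an n-bit data word.
    Bit j of `check` is the XOR of the data bits whose position has bit j set;
    `overall` is an extra parity bit over the data and check bits (assumed)."""
    check = 0
    for i, pos in enumerate(data_positions(n)):
        if (data >> i) & 1:
            check ^= pos
    overall = (ones(data) + ones(check)) & 1
    return check, overall


def read_word(data, check, overall, n=8):
    """Read side: regenerate the check bits and compare with the stored ones.
    Returns (status, data), status being 'ok', 'corrected' or 'uncorrectable'."""
    regenerated, _ = make_check_bits(data, n)
    syndrome = regenerated ^ check            # zero when the check bits agree
    parity_error = (ones(data) + ones(check) + overall) & 1
    if syndrome == 0 and not parity_error:
        return "ok", data
    if parity_error:                          # odd number of flipped bits
        positions = data_positions(n)
        if syndrome in positions:             # a data bit flipped: repair it
            return "corrected", data ^ (1 << positions.index(syndrome))
        if syndrome & (syndrome - 1) == 0:    # a check or parity bit flipped
            return "corrected", data
    return "uncorrectable", data              # e.g. a double bit error
```

In this sketch a single flipped data bit, such as read_word(data ^ 0b100, check, overall), is reported as corrected and the original word is returned, whereas two flipped bits yield a non-zero syndrome with consistent overall parity and are reported as uncorrectable.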
Moreover, the cost differential between implementing parity versus ECC largely vanishes in 64 bit systems because 8 bits of error detection memory are required regardless of whether parity or ECC is utilized. Accordingly, ECC is rapidly being accepted as the preferred error detection scheme for microprocessor based computer systems.
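The check bit counts cited above can be reproduced with a short calculation. The sketch below assumes one parity bit per 8 bit byte for the parity scheme and, for ECC, a Hamming code satisfying 2^r >= m + r + 1 for m data bits plus one overall parity bit for double bit detection; both function names are hypothetical.

```python
def parity_bits(width, bits_per_parity=8):
    """Parity bits for a data bus, assuming one parity bit per byte."""
    return width // bits_per_parity


def ecc_check_bits(width):
    """Check bits for single error correction, double error detection:
    the smallest r with 2**r >= width + r + 1, plus one overall parity bit."""
    r = 0
    while (1 << r) < width + r + 1:
        r += 1
    return r + 1


for width in (32, 64):
    print(width, parity_bits(width), ecc_check_bits(width))
```

Under these assumptions a 32 bit bus needs 4 parity bits versus 7 ECC check bits (75% more), while a 64 bit bus needs 8 bits in either case.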
The error correction circuitry employed in computer systems is typically one of the cornerstones of improved system reliability. The functionality of the ECC is, therefore, critically important if the ultimate goal of zero down time is to be achieved. Unfortunately, however, due to size and cost constraints, the ECC itself is typically not implemented with the significant amount of logic that would be required to perform an adequate self check or diagnostic routine. It would therefore be highly desirable to provide a practical, low cost apparatus for verifying the functionality of the ECC that consumes a relatively small amount of silicon.