1. Field
The present invention relates generally to electrical and electronic devices and circuits, and more particularly relates to data integrity in processor-memory interfaces.
2. Related Art
Conventional sub-micron system-on-a-chip (SoC) product designs typically include a large number of integrated components. These designs also generally include one or more processors or central processing units (CPUs) with associated memory, in addition to several other components that may be product- and application-specific. However, memory integrated using sub-micron technology is susceptible to various noise-injection events, such as alpha particle strikes and the like, that can corrupt the contents of the memory, resulting in a loss of data integrity and/or system failure. Accordingly, schemes such as parity and error correcting code (ECC) protection have been developed to address these failures.
Data integrity schemes such as parity error protection are simple and require low overhead, but a single parity bit can reliably detect only single-bit failures (more generally, an odd number of flipped bits) and cannot correct detected errors. In comparison to parity error protection, ECC schemes can detect a greater number of errors and can even correct such errors to some extent, depending on the ECC scheme utilized. For example, a single-bit error correction double-bit error detection (SECDED) ECC scheme can detect double-bit errors and correct single-bit errors. However, implementing ECC requires the insertion of additional ECC check bits into each data word, thereby increasing system complexity, overhead, and cost.
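By way of illustration only, the limitation of parity protection noted above can be sketched in a few lines of Python. The function names `parity_bit` and `parity_ok` and the use of even parity over an 8-bit example word are assumptions made for this sketch and are not taken from any particular implementation.

```python
def parity_bit(word: int) -> int:
    """Return the even-parity bit for a word: 1 if the word has an odd
    number of 1 bits, otherwise 0 (hypothetical helper)."""
    return bin(word).count("1") & 1


def parity_ok(word: int, stored_parity: int) -> bool:
    """Recompute the parity of the word as read and compare it with the
    parity bit that was stored alongside it."""
    return parity_bit(word) == stored_parity


# Store an example word together with its parity bit.
original = 0b10110010
stored_parity = parity_bit(original)

# A single flipped bit changes the parity, so the error is detected.
single_flip = original ^ 0b00000001
single_detected = not parity_ok(single_flip, stored_parity)

# Two flipped bits restore the parity, so the error goes unnoticed.
double_flip = original ^ 0b00000011
double_detected = not parity_ok(double_flip, stored_parity)
```

In this sketch the single-bit corruption is flagged while the double-bit corruption passes the check, and in neither case does the parity bit indicate which bit is wrong, which is why parity alone cannot correct an error.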
When ECC is used and a unit of data, or word, is stored in memory, such as random access memory (RAM), or in peripheral storage, such as a hard disk drive (HDD), a code that describes the bit sequence in the word is typically calculated and stored along with the word. For example, for each 64-bit word, an additional eight bits are stored to represent the ECC code. When the word is accessed during a read operation, a code for the stored word is again calculated using the ECC algorithm. The newly generated code is compared with the code generated when the word was initially stored and, if the codes match, the data is considered to be valid and free of errors. When the codes do not match, the erroneous bit or bits are corrected, provided the number of detected bit errors does not exceed the error correction capability of the ECC scheme employed. With a SECDED code, for instance, single-bit errors can be corrected, whereas double-bit errors can only be detected.
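The store, recompute, and compare sequence described above can be sketched as follows for a 64-bit word with eight stored code bits. The sketch assumes an extended Hamming construction (seven check bits plus one overall parity bit), which is one common way to obtain SECDED behavior. The names `compute_code` and `read_word` and the status strings are hypothetical and chosen only for illustration.

```python
DATA_BITS = 64
CHECK_BITS = 7                       # 2**7 >= 64 + 7 + 1
TOTAL_POSITIONS = DATA_BITS + CHECK_BITS

# Codeword positions 1..71; power-of-two positions are reserved for the
# check bits, and the remaining 64 positions carry the data bits.
DATA_POSITIONS = [p for p in range(1, TOTAL_POSITIONS + 1) if p & (p - 1)]
POSITION_TO_INDEX = {pos: i for i, pos in enumerate(DATA_POSITIONS)}
CHECK_MASK = (1 << CHECK_BITS) - 1


def _parity(x: int) -> int:
    return bin(x).count("1") & 1


def compute_code(word: int) -> int:
    """Return the 8-bit code stored with a 64-bit word: seven Hamming
    check bits in bits 0..6 and an overall parity bit in bit 7."""
    check = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (word >> i) & 1:
            check ^= pos             # XOR of the positions of all set data bits
    overall = _parity(word) ^ _parity(check)
    return (overall << CHECK_BITS) | check


def read_word(word: int, stored_code: int):
    """Recompute the code for the word as read, compare it with the stored
    code, and return (status, word), correcting a single-bit error."""
    syndrome = (compute_code(word) ^ stored_code) & CHECK_MASK
    odd_flips = _parity(word) ^ _parity(stored_code)
    if syndrome == 0 and not odd_flips:
        return "ok", word                        # codes match
    if not odd_flips:
        return "double_error", word              # detected, not correctable
    if syndrome in POSITION_TO_INDEX:            # single flipped data bit
        return "corrected", word ^ (1 << POSITION_TO_INDEX[syndrome])
    if syndrome & (syndrome - 1) == 0:           # flipped check or parity bit
        return "corrected", word
    return "uncorrectable", word                 # more than two bits in error


word = 0x0123456789ABCDEF
code = compute_code(word)
clean = read_word(word, code)
fixed = read_word(word ^ (1 << 17), code)              # one data bit flipped
double = read_word(word ^ (1 << 3) ^ (1 << 40), code)  # two data bits flipped
```

In this sketch the syndrome, that is, the difference between the recomputed and stored check bits, points at the position of a single flipped bit, while the overall parity bit is what allows a double-bit error to be distinguished from a single-bit error.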
Conventionally, an attempt to correct data stored in memory is not made until that data is to be accessed. Eventually, the stored data will be overwritten by new data and, assuming the errors are transient, the incorrect bits will not be an issue. Errors that recur at the same address in memory after the system has been restarted likely indicate a permanent hardware error, which may trigger a message, sent to a log or to a system administrator, that provides the address associated with the recurrent error.
The implementation of ECC has become increasingly popular in higher reliability applications, such as data storage and transmission, particularly as data rates, and thus error rates, increase. Consequently, depending on the particular application and prevailing operating conditions, a SoC may require different schemes to ensure data integrity on the same memory interface port.