In many computer systems, error checking and correction (ECC) is used to detect and correct errors in data stored in a memory of the computer system. To protect data using error checking and correction, an algorithm is applied to the data before the data is stored in the memory, with the algorithm generating a corresponding error correcting code. Depending upon the type of error checking and correction being utilized, the code may allow the detection of one or more erroneous bits in the data and may also allow for the correction of one or more such erroneous bits. For example, a common type of error checking and correction is known as single error correction double error detection (SECDED). With this type of error checking and correction, an SECDED code is calculated for data and may be utilized to detect single or double bit errors in the data while allowing for the correction of single bit errors.
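By way of illustration only, the SECDED behavior described above may be sketched with an extended Hamming code that protects four data bits using four check bits. This is a minimal sketch under stated assumptions: the function names `secded_encode` and `secded_decode`, the 4-bit data width, and the bit layout are hypothetical choices made for the example and are not taken from any particular memory system.

```python
def secded_encode(nibble):
    """Encode 4 data bits (int 0..15) as an 8-bit extended Hamming codeword.

    Hypothetical layout: index 0 holds an overall parity bit, indices 1, 2, 4
    hold Hamming check bits, and indices 3, 5, 6, 7 hold the data bits.
    """
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    for p in (1, 2, 4):
        # Each check bit makes the parity of the positions it covers even.
        code[p] = sum(code[i] for i in range(1, 8) if i & p) % 2
    code[0] = sum(code[1:]) % 2  # overall parity over positions 1..7
    return code


def secded_decode(code):
    """Return (status, nibble); status is 'ok', 'corrected', or 'double_error'."""
    code = list(code)
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i  # XOR of the positions holding a 1
    overall = sum(code) % 2
    if overall == 1:
        # Odd overall parity: a single-bit error at the position given by the
        # syndrome (0 means the overall parity bit itself was flipped).
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even overall parity with a nonzero syndrome: two bits are wrong,
        # which can be detected but not corrected.
        return "double_error", None
    else:
        status = "ok"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

In this sketch, flipping any one bit of a codeword yields the status `corrected` together with the original data, while flipping any two bits yields `double_error` with no data returned.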
A simple parity bit may be viewed as the simplest type of error checking and correction, although technically parity bits allow only for the detection and not the correction of bit errors in stored data. A parity bit is a bit appended to a group of data bits and having a value such that the overall word formed by the data and parity bits contains either an even or an odd number of binary 1's. In the present description, the term error checking and correction (ECC) is used generally to refer to any type of error detection alone or to any type of error checking and correction. Also, the terms check bits, check byte, ECC bits, and check word may be used interchangeably in the present description to refer to the bits or groups of bits generated by the ECC algorithm or process being utilized.
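The parity scheme just described may be sketched as follows. The helper names `parity_bit` and `has_parity_error` are hypothetical and are introduced only for this example, which assumes the data bits are held in a list of 0 and 1 values.

```python
def parity_bit(data_bits, even=True):
    """Return the parity bit to append to a list of 0/1 data bits."""
    bit = sum(data_bits) % 2        # makes the total number of 1's even
    return bit if even else bit ^ 1  # invert for odd parity


def has_parity_error(word_bits, even=True):
    """Return True if the word (data bits plus parity bit) violates the parity."""
    expected = 0 if even else 1
    return sum(word_bits) % 2 != expected


data = [1, 0, 1, 1, 0, 0, 1, 0]     # contains four 1's
word = data + [parity_bit(data)]    # even parity, so the appended bit is 0
```

A single flipped bit in `word` changes the count of 1's from even to odd and is detected. Two flipped bits restore even parity and go undetected, and in neither case does the parity bit identify which bit is wrong, which is why parity alone cannot correct errors.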
FIG. 1 is a block diagram of a conventional memory module 100 including a number of random access memory (RAM) chips 102a-n. The memory module 100 further includes an ECC RAM chip 104 for storing check bits that allow erroneous bits in the RAM chips 102a-n to be detected and possibly corrected. As the data is stored in the RAM chips 102a-n, circuitry (not shown) on the memory module 100 calculates the corresponding check bits for that data and stores these check bits in the ECC RAM chip 104. The operation of the RAM chips 102a-n and ECC RAM chip 104 will be understood by those skilled in the art, and thus will now be explained only very briefly. Each of the RAM chips 102a-n has a large number of memory cells (not shown) arranged in rows and columns within the chip. Each memory cell has an associated address, and to access that cell the corresponding address is applied to the memory module, after which data is either written to or read from the addressed memory cell. Rows of memory cells are typically referred to as pages, with the address applied to the memory module 100 including a row address component corresponding to a respective row or page within the RAM chips 102a-n. In response to a given row address, the corresponding page within the RAM chips 102a-n is accessed and thereafter particular memory cells in the page are accessed as determined by a column address component of the applied address.
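The division of an applied address into row and column components may be sketched as follows. The field width `COLUMN_BITS` and the placement of the column component in the low-order bits are assumptions made purely for illustration; actual RAM chips define their own address mappings.

```python
COLUMN_BITS = 10  # hypothetical: 1024 column locations per row (page)


def split_address(address):
    """Split a flat address into (row, column) components."""
    row = address >> COLUMN_BITS                 # selects the page
    column = address & ((1 << COLUMN_BITS) - 1)  # selects cells within the page
    return row, column


# Two addresses that share a row component fall within the same page.
row_a, col_a = split_address(0x12345)
row_b, col_b = split_address(0x12000)
```

Under these assumptions, both example addresses map to the same row and differ only in their column component, so the second access would fall within the page already accessed by the first.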
In operation, to write data into the memory module 100 an address is first applied to the memory module. In response to the applied address, corresponding memory cells in the RAM chips 102a-n are accessed and the data to be stored in the addressed cells is thereafter written into and stored in these memory cells. From the data being written into the addressed memory cells, circuitry on the memory module 100 calculates corresponding check bits and stores these check bits in the ECC RAM chip 104.
When data is read from the memory module 100, once again an address is first applied to the module. The corresponding memory cells in the RAM chips 102a-n are then accessed and the data is read out of these memory cells. At the same time, circuitry on the memory module 100 accesses the check bits associated with the addressed memory cells. The circuitry then utilizes the data read out of the addressed memory cells to calculate new check bits for this data and compares these new check bits to the check bits read from the ECC RAM chip 104. If the new check bits are the same as the check bits read from the ECC RAM chip 104, then no error is detected in the read data. If the new check bits are different from the check bits read from the ECC RAM chip 104, however, then this means the data stored in the RAM chips 102a-n is now different from the data originally stored in the cells and thus an error in the data exists. Depending on the type of check bits stored in the ECC RAM chip 104, at this point the circuitry on the memory module 100 may generate an error flag indicating that an error in data stored in the memory module has been detected, or the circuitry may correct the detected error if possible.
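The write and read sequences described above may be modeled in software as follows. This is only a sketch: the class `EccModuleModel` and its methods are hypothetical, dictionaries stand in for the RAM chips 102a-n and the ECC RAM chip 104, and a single even-parity bit is assumed as the check-bit algorithm, so the model can flag an error but cannot correct it.

```python
class EccModuleModel:
    """Toy model of a memory module with separate data and check-bit storage."""

    def __init__(self):
        self.data_ram = {}  # stands in for RAM chips 102a-n
        self.ecc_ram = {}   # stands in for ECC RAM chip 104

    @staticmethod
    def check_bits(word):
        # Assumed check-bit algorithm: one even-parity bit over the word.
        return bin(word).count("1") % 2

    def write(self, address, word):
        # Store the data, then calculate and store its check bits.
        self.data_ram[address] = word
        self.ecc_ram[address] = self.check_bits(word)

    def read(self, address):
        # Read the data and stored check bits, recalculate the check bits
        # from the data read, and compare the two.
        word = self.data_ram[address]
        stored = self.ecc_ram[address]
        error_flag = self.check_bits(word) != stored
        return word, error_flag


module = EccModuleModel()
module.write(0x10, 0b10110010)
clean_word, clean_flag = module.read(0x10)

module.data_ram[0x10] ^= 0b00000100  # simulate a single-bit upset in storage
bad_word, bad_flag = module.read(0x10)
```

In this model the first read returns the stored word with the error flag clear, and the read following the simulated upset returns the altered word with the error flag set.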
While this type of error checking and correction is satisfactory in many applications, the inclusion of this error checking and correction functionality on the memory module 100 increases the cost of the memory module. This is due to the requirement for the additional ECC RAM chip 104 and also due to the additional circuitry (not shown) contained on the memory module for calculating the check bits and utilizing the calculated check bits to detect and possibly correct errors in the data stored in the RAM chips 102a-n. As shown in FIG. 1, the RAM chips 102a-n collectively form the actual data storage 106 portion of the memory module 100, with the extra ECC RAM chip 104 merely storing check bits to detect and possibly correct errors in the stored data and not being available for use by programs running on a computer system (not shown) including the module 100.
In addition, the inclusion of error checking and correction on the memory module 100 may result in reduced performance of the memory module, particularly during some types of data transfer operations, such as read-modify-write (RMW) operations, which can result in consecutive read and write operations to a given page in the RAM chips 102a-n. Such RMW operations take an undesirably long time to complete due to the delay in calculating the check bits for each such consecutive data transfer operation, lowering the overall performance of the memory module 100, as will be appreciated by those skilled in the art.
There is a need for an improved system and method for providing error checking and correction in the memory of a computer or other type of electronic system.