A vital component of virtually all computer systems is a semiconductor, or solid-state, memory system. Such memory often holds both the program instructions for a processor of the computer system and the data on which those instructions operate. In one example, the memory system may include one or more dual in-line memory modules (DIMMs), with each DIMM carrying multiple dynamic random access memory (DRAM) integrated circuits (ICs). In addition, one or more processors may be coupled with the memory modules through a memory controller, which translates data requests from the processor into accesses to the data held in the memory modules.
Computer systems have benefited from the ongoing advances made in both the speed and capacity of memory devices, such as DRAMs, employed in memory systems today. However, increasing memory data error rates often accompany these advancements. More specifically, both “hard errors” (permanent defects in a memory device, such as one or more defective memory cells) and “soft errors” (data errors of a temporary nature, such as inversion of data held within one or more memory cells) tend to become more prevalent with each new technology generation.
To combat these errors, memory controllers in commercial computer systems often support an error detection and correction (EDC) scheme in which redundant EDC data is stored along with the customer, or “payload,” data. When these data are read from the memory, the memory controller processes the EDC data and the payload data in an effort to detect and correct at least one error in the data. The number of errors that may be detected or corrected depends in part on the nature of the EDC scheme utilized, as well as on the amount of EDC data employed relative to the amount of payload data being protected. Typically, the more EDC data utilized, the greater the number of errors that may be detected and corrected, but also the greater the memory capacity overhead incurred.
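By way of illustration only, the relationship between payload data and redundant EDC data may be sketched with a Hamming(7,4) code, in which three check bits protect four payload bits and any single-bit error in the stored word can be located and corrected. The choice of this particular code, and the function names `hamming74_encode` and `hamming74_decode`, are assumptions made for the example; the text above does not identify a specific EDC scheme, and practical memory controllers typically use wider codes.

```python
def hamming74_encode(nibble):
    """Encode 4 payload bits into a 7-bit word (bit i holds position i + 1)."""
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    # Check bits sit at positions 1, 2 and 4; payload bits at 3, 5, 6 and 7.
    p1 = d[0] ^ d[1] ^ d[3]  # covers positions 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # covers positions 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # covers positions 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(b << i for i, b in enumerate(bits))


def hamming74_decode(word):
    """Return (payload, syndrome); a nonzero syndrome is the corrected position."""
    bits = [(word >> i) & 1 for i in range(7)]
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos - 1]:
            syndrome ^= pos  # XOR of the positions of all set bits
    if syndrome:
        bits[syndrome - 1] ^= 1  # flip the single bit the syndrome points at
    payload = bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)
    return payload, syndrome


if __name__ == "__main__":
    stored = hamming74_encode(0b1011)
    corrupted = stored ^ (1 << 4)  # simulate a soft error at position 5
    print(hamming74_decode(corrupted))
```

In this sketch the overhead is three check bits for every four payload bits, and a second error in the same word would be miscorrected rather than detected, which reflects the trade-off between the amount of EDC data and the number of errors that can be handled.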
Due to the extra cost involved, some memory systems do not employ an error detection or correction capability. Further, even when an EDC scheme is used, the error rates of the memory devices may overwhelm the capability of the memory system to detect and correct the errors. To address these errors, some memory systems may provide a spare DIMM to be used as a data “mirror” that stores a second copy of the data, protecting the system against the failure of an in-use DIMM. However, like the use of EDC, the employment of one or more spare DIMMs increases the cost and memory overhead associated with the memory system. In addition, memory systems employing a DIMM as a data mirror for an in-use DIMM typically are configured such that the memory controller must write the same data to both the in-use DIMM and the mirror DIMM as two separate write operations, thus essentially halving the memory system bandwidth available for writes.
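The write cost of such a mirrored arrangement may be sketched as follows. The `MirroredMemory` class, its attributes, and the use of byte arrays to stand in for DIMMs are all hypothetical and serve only to count operations; the sketch does not model any particular memory controller.

```python
class MirroredMemory:
    """Toy model of an in-use DIMM paired with a mirror DIMM (hypothetical)."""

    def __init__(self, size):
        self.primary = bytearray(size)  # stands in for the in-use DIMM
        self.mirror = bytearray(size)   # stands in for the mirror DIMM
        self.primary_failed = False
        self.write_ops = 0              # device-level write operations issued

    def write(self, addr, value):
        # The same data is written twice, as two separate operations.
        if not self.primary_failed:
            self.primary[addr] = value
            self.write_ops += 1
        self.mirror[addr] = value
        self.write_ops += 1

    def read(self, addr):
        # Fall back to the mirror copy once the in-use DIMM has failed.
        source = self.mirror if self.primary_failed else self.primary
        return source[addr]


if __name__ == "__main__":
    mem = MirroredMemory(16)
    for addr, value in enumerate([0x11, 0x22, 0x33]):
        mem.write(addr, value)
    mem.primary_failed = True  # simulate failure of the in-use DIMM
    print(mem.write_ops, [mem.read(a) for a in range(3)])
```

Under these assumptions, three logical writes cost six device writes, while the data remain readable from the mirror after the in-use DIMM fails.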