Computing devices require storage for data and code to be executed. Temporary storage traditionally provides faster access to data for execution, and traditional temporary storage is implemented with volatile memory resources. Volatile memory finds use in current computing platforms, whether servers, desktop or laptop computers, mobile devices, or consumer and business electronics. DRAM (dynamic random access memory) devices are the most common type of volatile memory device in use. As the manufacturing processes used to produce DRAMs continue to scale to smaller geometries, DRAM errors are projected to increase. One technique for addressing the increase in DRAM errors is to employ on-die ECC (error checking and correction). On-die ECC refers to error detection and correction logic that resides on the memory device itself. With on-die ECC logic, a DRAM can correct single bit failures, such as through single error correction (SEC). On-die ECC can be used in addition to system level ECC, but the system level ECC has no insight into what error correction has been performed at the memory device level. Thus, while on-die ECC can handle errors inside a memory device, errors can accumulate undetected by the host system.
In general, error detection and/or correction can vary from the lowest levels of protection (such as parity) to more complex algorithmic solutions (such as double-bit error correction). Parity generation and checking is fast, and a single parity bit can indicate an error in a long string, but parity provides no correction capability. Double-bit error correction requires more resources (time and code store) to implement, which may not be feasible for on-die ECC in memory devices in high-speed, high-bandwidth applications. While stronger codes provide better error detection and correction, the cost in computation time and resources creates a tradeoff that favors weaker codes in on-die ECC implementations.
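As a minimal sketch of the parity behavior described above, the following Python fragment protects a 64-bit string with one even-parity bit. The function names, the 64-bit width, and the choice of flipped bit positions are assumptions made only for illustration and are not taken from any particular memory device. It shows that a single flipped bit is detected but not located, and that a second flipped bit restores the parity and goes unnoticed.

```python
def parity_bit(bits):
    """Return the even-parity bit (XOR of all bits) for a bit string."""
    p = 0
    for b in bits:
        p ^= b
    return p


def parity_check(bits, stored_parity):
    """Return True if the stored parity bit still matches the data."""
    return parity_bit(bits) == stored_parity


# Hypothetical 64-bit data string and its single parity bit.
data = [1, 0, 1, 1, 0, 0, 1, 0] * 8
stored = parity_bit(data)

# Flip one bit: the check fails, but nothing identifies which bit flipped.
single = list(data)
single[5] ^= 1
single_error_detected = not parity_check(single, stored)

# Flip a second bit: the parity matches again, so the errors are missed.
double = list(single)
double[40] ^= 1
double_error_detected = not parity_check(double, stored)
```

The check yields only a pass or fail result for the whole string, which is why parity alone cannot correct anything.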
In systems that employ SEC, such as DRAMs implementing on-die SEC, the ECC can correct a single bit error (SBE). However, a double bit error can be interpreted and “corrected” as an SBE. Because the SEC code misinterprets the double bit error as an error at the single bit position that the code indicates, the miscorrection toggles a third bit and can thereby create a triple bit error in a code word half. Nevertheless, given that more complex ECC requires more computation time and resources, it may not be practical to implement stronger on-die ECC.
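The miscorrection described above can be illustrated with a small SEC code. The sketch below uses a Hamming(7,4) code as a stand-in, which is an assumption: the text does not specify the code used by on-die ECC, and practical implementations operate on far wider code words. The helper names (encode, syndrome, sec_correct, bit_errors) are likewise hypothetical. In this code the syndrome of an SBE equals the position of the flipped bit, so the corrector simply toggles the bit the syndrome points at.

```python
def syndrome(code):
    """XOR of the positions (1..7) of all set bits; 0 means no error seen."""
    s = 0
    for pos in range(1, 8):
        if code[pos]:
            s ^= pos
    return s


def encode(data4):
    """Hamming(7,4): data at positions 3, 5, 6, 7; parity at 1, 2, 4."""
    code = [0] * 8  # index 0 is unused so that index == bit position
    for pos, bit in zip((3, 5, 6, 7), data4):
        code[pos] = bit
    s = syndrome(code)
    for p in (1, 2, 4):  # set parity bits so the final syndrome is 0
        if s & p:
            code[p] = 1
    return code


def sec_correct(code):
    """Toggle the bit the syndrome points at, as an SEC decoder would."""
    fixed = list(code)
    s = syndrome(fixed)
    if s:
        fixed[s] ^= 1
    return fixed


def bit_errors(a, b):
    """Count the positions at which two code words differ."""
    return sum(x != y for x, y in zip(a, b))


word = encode([1, 0, 1, 1])

single = list(word)
single[5] ^= 1                      # one flipped bit
after_single = bit_errors(word, sec_correct(single))  # 0: corrected

double = list(word)
double[3] ^= 1
double[5] ^= 1                      # two flipped bits, syndrome 3 ^ 5 = 6
after_double = bit_errors(word, sec_correct(double))  # 3: bit 6 toggled too
```

With two flipped bits the syndrome is nonzero and matches neither of the flipped positions, so the decoder toggles a third, previously correct bit and leaves the word with three errors instead of two.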
Descriptions of certain details and implementations follow, including a description of the figures, which may depict some or all of the embodiments described below, as well as discussing other potential embodiments or implementations of the inventive concepts presented herein.