The present embodiments relate generally to memory in a computing system, and more specifically, to handling errors in memory.
Computer systems often have a considerable amount of cache and high-speed random access memory (RAM) to hold information, such as data and programs, temporarily while the computer is powered and operational. This information is normally binary, composed of patterns of 1's and 0's known as bits of data. The bits of data are often grouped and organized at a higher level. A byte, for example, is typically composed of 8 bits; more generally, these groups or bytes are called symbols and may consist of any number of bits or sub-symbols.
Memory device densities have continued to grow as computer systems have become more powerful. Unfortunately, the failure of just a portion of a memory device, such as a cache or RAM, can lead to significantly reduced performance. Memory errors may be “hard” (repeating) or “soft” (one-time or intermittent) failures. They may occur as single-cell, multi-bit, wordline, or bitline failures and may cause all or part of the memory device to be unusable until it is repaired.
In the case of failures in a cache, a failure of a bitline is a hard error that causes errors each time a line that includes the failed bitline is accessed. In some cases, the failed bitline in a line may cause an uncorrectable error (UE) when there is a second error (e.g., a soft error or a second bitline error), as the error correction code (ECC) is only able to correct one error in the line at a time. Thus, cache performance is adversely affected by a failed bitline, as it may cause correctable errors (CEs), which would ordinarily be quickly corrected by ECC, to require additional processes to access the correct data. Therefore, it is important to clean up all errors associated with a first bitline error (e.g., using line delete or array repair) long before a second bitline error or a soft error occurs.
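The interaction between a failed bitline and a second error can be illustrated with a small sketch. The text above does not specify which ECC is used; the example below assumes, purely for illustration, a toy single-error-correcting, double-error-detecting (SEC-DED) extended Hamming code over 4 data bits. The names encode, decode, flip, and DATA_POSITIONS are hypothetical, and the failed bitline is modeled simply as a flipped bit at a fixed position. One error in the line is reported as a correctable error (CE) and repaired; the same error combined with a second one is reported as an uncorrectable error (UE).

```python
# Toy SEC-DED extended Hamming(8,4) code. This is an assumed, illustrative ECC,
# not the code used by any particular cache.

DATA_POSITIONS = (3, 5, 6, 7)  # codeword positions that hold the 4 data bits


def encode(nibble):
    """Encode 4 data bits into an 8-bit codeword, returned as a list of bits."""
    code = [0] * 8  # index 0: overall parity; indices 1..7: Hamming positions
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> i) & 1
    code[1] = code[3] ^ code[5] ^ code[7]  # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # parity over positions 4, 5, 6, 7
    for bit in code[1:]:
        code[0] ^= bit                     # overall parity over all 8 bits
    return code


def decode(code):
    """Return (status, data): status is 'OK', 'CE', or 'UE'; data is None on UE."""
    code = list(code)  # work on a copy so the caller's line is not modified
    syndrome = 0
    overall = 0
    for pos, bit in enumerate(code):
        overall ^= bit
        if bit:
            syndrome ^= pos  # XOR of the positions of all set bits
    if overall == 1:
        # Odd number of flips: assume one, located at `syndrome`
        # (a syndrome of 0 means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "CE"
    elif syndrome != 0:
        # Even number of flips with a nonzero syndrome: detected, not correctable.
        return "UE", None
    else:
        status = "OK"
    data = sum(code[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return status, data


def flip(code, pos):
    """Return a copy of the codeword with the bit at `pos` inverted."""
    out = list(code)
    out[pos] ^= 1
    return out


line = encode(0b1011)
hard = flip(line, 5)           # failed bitline modeled as a flip at position 5
print(decode(line))            # ('OK', 11)
print(decode(hard))            # ('CE', 11)
print(decode(flip(hard, 2)))   # ('UE', None): hard error plus one soft error
```

In this sketch, every access to a line crossing the failed bitline pays the correction cost, and any additional flip in the same line exceeds what the code can correct, which is why repairing the first bitline error early matters.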