The use of solid-state memory in computers and electronic devices has become widespread. Technological improvements have made it possible to continually increase the amount of memory, particularly dynamic random access memory (DRAM), in these devices. The increased memory improves the usefulness of these devices by making them faster (slower magnetic memory is replaced with faster solid-state memory), less power-hungry (solid-state memory tends to consume less power than the mechanical devices used in magnetic memory), and more powerful (increased memory permits larger applications to be run on the devices).
However, as memories get larger, the probability of an error occurring within the memory increases. Errors in a memory may be classified into two categories: hard and soft. Hard errors may be the result of defects in the actual physical structure of the memory, such as a faulty capacitor or transistor. Soft errors, on the other hand, may be the result of glitches induced by error sources such as alpha particles, gamma rays, radiation, electrical noise, and so forth. Soft errors may be thought of as being transient in nature.
There are several different ways to detect and correct errors. A first method involves the use of a parity bit per group of memory bits. For example, a single parity bit may be used to cover an eight-bit group (a byte) of memory values. One way to use the parity bit is to count the number of bits in the byte with values equal to one. If the count is odd, the parity bit may be set to one, and if the count is even, the parity bit may be set to zero. Parity bits, however, can only be used to detect the presence of an error; they cannot correct errors.
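The parity scheme described above can be sketched in a few lines of Python. This is only an illustration: the function names `parity_bit` and `has_error` and the sample byte are assumptions made for the example and do not come from any particular memory design. The sketch also shows a well-known limit of a single parity bit: it flags an odd number of flipped bits but misses an even number.

```python
def parity_bit(byte: int) -> int:
    """Return 1 if the byte holds an odd number of one bits, otherwise 0."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an eight-bit value")
    return bin(byte).count("1") % 2


def has_error(byte: int, stored_parity: int) -> bool:
    """Report whether the byte no longer matches the parity bit stored with it."""
    return parity_bit(byte) != stored_parity


# Hypothetical sample byte containing four one bits, so its parity bit is 0.
data = 0b10110010
stored = parity_bit(data)

single_flip = data ^ 0b00000100  # one bit changed: the mismatch is detected
double_flip = data ^ 0b00000110  # two bits changed: the parity still matches

print(has_error(single_flip, stored))
print(has_error(double_flip, stored))
```

Note that even when `has_error` reports a mismatch, nothing in the parity bit indicates which of the eight bits changed, which is why parity alone cannot correct the error.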
More sophisticated methods are available that can detect and correct errors. These methods typically use more than one bit per group of memory bits that they are protecting and involve a two-step process wherein an error is first detected and then the error is corrected. For example, a natural place to detect and correct errors in memory is when information is read from a memory location. When the information is read from the memory location, error detecting hardware can be used to determine if an error has occurred. If an error has occurred, then there are generally two places where the error needs to be fixed: the memory location itself and the information read from the memory location. The information as read from the memory location can be readily corrected (in many instances, the circuitry that detected the error can automatically correct the error); however, correcting the contents of the actual memory location requires that the corrected information be written back to the memory location.
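As one concrete illustration of a code that uses several check bits per group, the sketch below uses a Hamming(7,4) code, which protects four data bits with three check bits and can correct any single flipped bit. The text does not say which code is used, so the choice of Hamming(7,4) and the names `encode`, `syndrome`, and `read_and_correct` are assumptions made for the example. The sketch follows the two-step process described above: the syndrome first detects the error, and then the corrected codeword is produced both for the reader and for writing back to the memory location.

```python
from typing import List, Tuple


def encode(nibble: int) -> List[int]:
    """Encode four data bits as a seven-bit Hamming codeword (positions 1..7)."""
    if not 0 <= nibble <= 0xF:
        raise ValueError("expected a four-bit value")
    code = [0] * 8  # index 0 is unused so that indices match bit positions
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]  # check bit for positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # check bit for positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # check bit for positions 4, 5, 6, 7
    return code[1:]


def syndrome(code: List[int]) -> int:
    """Return 0 for a valid codeword, else the position of a single flipped bit."""
    result = 0
    for position, bit in enumerate(code, start=1):
        if bit:
            result ^= position
    return result


def read_and_correct(memory: List[List[int]], address: int) -> Tuple[int, bool]:
    """Read one location, correct a single-bit error, and write the fix back."""
    code = list(memory[address])
    bad_position = syndrome(code)        # step 1: detect
    if bad_position:
        code[bad_position - 1] ^= 1      # step 2: correct the information read
        memory[address] = code           # ...and the memory location itself
    data_bits = [code[2], code[4], code[5], code[6]]
    nibble = sum(bit << i for i, bit in enumerate(data_bits))
    return nibble, bad_position != 0


# Hypothetical two-location memory; a soft error flips one bit at address 1.
memory = [encode(0b1010), encode(0b0111)]
memory[1][4] ^= 1
value, was_corrected = read_and_correct(memory, 1)
```

In this sketch the write-back happens inside the read itself. When to perform that write-back is a separate design decision.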
A commonly used technique involves immediately writing the corrected information back to the memory location. This technique uses additional time inserted into the memory read cycle to write the corrected information back to the memory location. It has an advantage in that the corrected information is immediately written back as soon as an error has been detected. This can prevent consistency problems that may arise when the information is corrected at one location (the storage location) and not at another (the memory location).
Another commonly used technique involves the use of idle cycles in the memory access period to write the corrected information back to the memory location. This has an advantage in that the read cycles do not need to be extended to support the write operation.
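The difference between the two write-back techniques can be sketched with a toy model. Everything in it is an assumption made for illustration: the class name `EccMemory`, the one-cycle costs, and the use of a shadow copy (`good`) as a stand-in for whatever error-correcting code recovers the right value. The immediate policy reserves a write-back slot on every read, while the deferred policy queues the correction until an idle cycle arrives.

```python
from collections import deque

READ_CYCLES = 1       # hypothetical cost of a plain read
WRITEBACK_CYCLES = 1  # hypothetical cost of the write-back slot


class EccMemory:
    """Toy memory that corrects errors on read under one of two policies."""

    def __init__(self, values, policy):
        if policy not in ("immediate", "deferred"):
            raise ValueError("policy must be 'immediate' or 'deferred'")
        self.cells = list(values)  # contents of the memory locations
        self.good = list(values)   # stand-in for what the ECC would recover
        self.policy = policy
        self.pending = deque()     # corrections waiting for an idle cycle
        self.cycles = 0

    def corrupt(self, address, value):
        """Simulate a soft error by overwriting a cell behind the ECC's back."""
        self.cells[address] = value

    def read(self, address):
        self.cycles += READ_CYCLES
        corrected = self.good[address]
        error = self.cells[address] != corrected
        if self.policy == "immediate":
            # The extended read cycle is paid whether or not an error exists.
            self.cycles += WRITEBACK_CYCLES
            if error:
                self.cells[address] = corrected
        elif error and all(a != address for a, _ in self.pending):
            self.pending.append((address, corrected))
        return corrected

    def idle(self):
        """Spend one idle cycle; use it for a pending write-back if any."""
        self.cycles += 1
        if self.pending:
            address, value = self.pending.popleft()
            self.cells[address] = value


def run(policy):
    memory = EccMemory([10, 20, 30, 40], policy)
    memory.corrupt(2, 99)
    for address in range(4):
        memory.read(address)
    return memory


immediate = run("immediate")  # 8 cycles; cell 2 already repaired
deferred = run("deferred")    # 4 cycles; cell 2 still holds the bad value
deferred.idle()               # the repair lands only once an idle cycle occurs
```

In this model both policies return the corrected value to the reader; they differ in how many cycles the reads consume and in how long the memory location itself stays uncorrected.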
One disadvantage of the prior art is that the insertion of additional time in the read cycle to support the write operation lengthens the duration of the read cycle. This hurts performance in general because, for the majority of read cycles, no error needs to be corrected and the extra time is wasted. The result is a slowdown in row access time or random access speed.
A second disadvantage of the prior art is that the wait for an idle cycle may be a wait for an event that does not come for a long time. If the memory location or a memory bank containing the memory location is continually accessed, the wait for an idle cycle may be an extended wait. This extended wait may cause consistency problems, especially if an additional error is detected prior to the correction of the previous error.
A third disadvantage of the prior art is that the error detection performed during a read cycle only detects errors in memory locations that are read. If a memory location is not read, then no error detection occurs for that particular memory location. Therefore, large sections of the memory space may not be checked. With memory locations going unchecked, errors may accumulate (perhaps due to soft errors) to a point where the error detecting/correcting mechanism can no longer detect and/or correct the errors.