A DRAM (dynamic RAM) is a form of semiconductor random access memory (RAM) that typically stores information in integrated circuits that contain capacitors. The dynamic RAM is called dynamic because data are stored only temporarily and must be continually rewritten or refreshed. The temporary nature of DRAM storage is due in part to leakage currents in the capacitive elements of the integrated circuits of which the memory is constructed. Although DRAMs must be continually refreshed, their high density and low cost make them advantageous memory components.
In addition to hard errors, such as physical defects in the integrated circuits forming the memory, DRAMs are susceptible to soft errors. Soft errors are believed to be caused by alpha particles emitted from within the DRAM packaging that hit the memory cells or the bit lines. These alpha particles alter the voltage of the cells, or of the bit lines while data is being accessed from the DRAM. When a memory cell is hit by an alpha particle, the logic level of the cell may change. When a bit line is hit while it is accessing data from a memory cell, the change in voltage may be enough to alter the logic level as the voltage on the bit line is fed back through a transistor or other logic circuit to the memory cell.
Reports of soft errors in such DRAMs indicate that 90% of all soft errors are due to bit line hits, i.e., alpha particles hitting the bit lines while a row of memory is being read out onto the bit lines. The more frequently a particular row is accessed, the higher the probability that bits in that row will be corrupted.
Soft errors in DRAMs are particularly significant in diskless CPU systems where the code is stored in DRAM. Typically, the code is loaded into the DRAM when the system is booted. Soft errors, if uncorrected, can lead to failure, necessitating a subsequent reboot of the code. If this reboot must occur over the network, an indeterminate amount of time passes between the failure and the point at which the software has been loaded again and the system is back up and running.
The present invention uses a combination of vertical and horizontal parity error detection and correction. Such a combination could present a problem because vertical parity generation and checking are time-consuming relative to a DRAM access time. The scheme is therefore preferably applied to a smaller block of memory in the DRAM where the data is changed infrequently or never, e.g., where code is stored. The scheme of the present invention provides an environment comprising software, hardware, and a particular DRAM architecture in which a high percentage of soft errors in DRAM memory may be detected and corrected while avoiding the necessity of having to reboot the entire system.
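The general idea of combining horizontal and vertical parity can be illustrated with a minimal software sketch. It is not the hardware or DRAM architecture of the invention, which this section does not detail; it assumes one horizontal parity bit per word and one vertical parity word per block, and the function names (protect, check_and_correct) are hypothetical. A failing horizontal parity bit identifies the word containing the error, the mismatching vertical parity bit identifies the bit position, and together they allow a single-bit error to be corrected.

```python
from functools import reduce
from operator import xor


def horizontal_parity(word):
    """Parity bit of one word: 1 if the word holds an odd number of 1 bits."""
    return bin(word).count("1") & 1


def vertical_parity(block):
    """Column-wise parity word: XOR of every word in the block."""
    return reduce(xor, block, 0)


def protect(block):
    """Compute the parity to store: one bit per word, one word per block."""
    return [horizontal_parity(w) for w in block], vertical_parity(block)


def check_and_correct(block, row_parity, col_parity):
    """Detect and correct a single-bit error in place.

    Returns (word_index, bit_index) of the corrected bit, or None if the
    block is clean. Raises ValueError for patterns this simple scheme
    cannot attribute to exactly one flipped bit.
    """
    # Horizontal check: which words no longer match their stored parity bit?
    bad_rows = [i for i, w in enumerate(block)
                if horizontal_parity(w) != row_parity[i]]
    # Vertical check: which bit columns no longer match the stored parity word?
    syndrome = vertical_parity(block) ^ col_parity

    if not bad_rows and syndrome == 0:
        return None
    if len(bad_rows) == 1 and bin(syndrome).count("1") == 1:
        row = bad_rows[0]
        block[row] ^= syndrome  # flip the single faulty bit back
        return row, syndrome.bit_length() - 1
    raise ValueError("error pattern is not a single-bit error")
```

In this sketch the vertical check touches every word in the block, which reflects why vertical parity is costly relative to a single memory access and why it suits a block whose contents rarely change: the stored parity word would have to be regenerated whenever any word in the block is written.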