Computer systems generally include one or more processors and a memory system. The memory system often includes multiple levels of memory devices that range from relatively fast and expensive memory to relatively slow and inexpensive memory. One of the first levels of a memory system is referred to as main memory and usually comprises some form of Random Access Memory (RAM). In operation, a computer system loads an operating system and one or more applications into the main memory so that they may be executed by the processor(s).
Because the main memory contains the operating system and applications, it can be a critical component of the computer system. Failures that occur in the main memory can cause broader failures to occur in the system and possibly cause the system to crash. As a result, it is generally desirable to detect errors in the main memory before they cause failures.
Memory errors may be detected by writing known information to a memory and then reading the information back to determine whether it is correct. Some memory errors, however, may be pattern sensitive and may only appear in response to selected information patterns being written to the memory. Some diagnostic testing of a memory may occur in response to a computer system being turned on or reset. This type of testing, however, may not detect errors in computer systems that are left on and not reset for extended periods of time.
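By way of illustration only, the write-and-read-back approach described above may be sketched as follows. The sketch is not taken from any particular system: the `SimulatedMemory` class, the `pattern_test` function, the stuck-bit fault model, and the four test patterns are all hypothetical and are chosen only to show how a fault may be exposed by some written values and not by others.

```python
# Hypothetical test patterns; an actual diagnostic would select its own.
PATTERNS = (0x00, 0xFF, 0x55, 0xAA)


class SimulatedMemory:
    """Byte-addressable memory model (hypothetical) with an optional stuck-at-0 bit."""

    def __init__(self, size, stuck_addr=None, stuck_mask=0):
        self._cells = bytearray(size)
        self._stuck_addr = stuck_addr  # address of the faulty cell, if any
        self._stuck_mask = stuck_mask  # bits that always read back as 0

    def __len__(self):
        return len(self._cells)

    def write(self, addr, value):
        self._cells[addr] = value & 0xFF

    def read(self, addr):
        value = self._cells[addr]
        if addr == self._stuck_addr:
            # Simulate the fault: the masked bits are forced to 0 on read.
            value &= ~self._stuck_mask & 0xFF
        return value


def pattern_test(mem, patterns=PATTERNS):
    """Write each pattern to every cell, read it back, and collect mismatches.

    Returns a list of (address, expected, actual) tuples.
    """
    errors = []
    for pattern in patterns:
        for addr in range(len(mem)):
            mem.write(addr, pattern)
        for addr in range(len(mem)):
            actual = mem.read(addr)
            if actual != pattern:
                errors.append((addr, pattern, actual))
    return errors


if __name__ == "__main__":
    faulty = SimulatedMemory(16, stuck_addr=5, stuck_mask=0x02)
    for addr, expected, actual in pattern_test(faulty):
        print(f"addr {addr}: wrote {expected:#04x}, read {actual:#04x}")
```

In this sketch the simulated fault is visible only for patterns that set the affected bit, so a test restricted to the patterns 0x00 and 0x55 would report no errors for the same faulty cell.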
Although some memory devices include error correction features that work during operation of a computer system, these features typically detect errors only in response to a specific memory location being read. Because many areas of a memory may not be read with regularity, errors that occur in these areas may go undetected until an access to a faulty memory location takes place.
Accordingly, it would be desirable to be able to detect errors in all areas of a main memory of a computer system before the errors cause failures to occur during operation of the system.