The present invention relates generally to the field of data processing systems, and more specifically to a method, system, and computer program product for testing a memory system. Still more particularly, the present invention relates to a method, system, and computer program product for, and in particular to validation of processor memory operations (i.e., load operations for copying data from main memory to a processor register).
Ensuring the integrity of data processed by a data processing system such as a computer or like electronic device is critical for the reliable operation of such a system. Data integrity is of particular concern, for example, in fault tolerant applications such as servers, databases, scientific computers, and the like, where any errors whatsoever could jeopardize the accuracy of complex operations and/or cause system crashes that affect large numbers of users.
Data integrity issues are a concern, for example, for many solid state memory arrays such as those used as the main working storage repository for a data processing system. Solid state memory arrays are typically implemented using multiple integrated circuit memory devices such as static or dynamic random access memory (SRAM or DRAM) devices, and are controlled via memory controllers typically disposed on separate integrated circuit devices and coupled thereto via a memory bus. Solid state memory arrays may also be used in embedded applications, e.g., as cache memories or buffers on logic circuitry such as a processor chip.
A significant amount of effort has been directed toward detecting and correcting errors in memory devices during power up of a data processing system, as well as during the normal operation of such a system. It is desirable, for example, to enable a data processing system to, whenever possible, detect and correct any errors automatically, without requiring a system administrator or other user to manually perform any repairs. It is also desirable for any such corrections to be performed in such a fashion that the system remains up and running. Often such characteristics are expensive and only available on complex, high performance data processing systems. Furthermore, in many instances, many types of errors go beyond the ability of a conventional system to do anything other than “crash” and require a physical repair before normal device operation can be restored.
Conventional error detection and correction mechanisms for solid state memory devices typically rely on parity bits or checksums to detect inconsistencies in data as it is retrieved from memory. Furthermore, through the use of Error Correcting Codes (ECC's) or other correction algorithms, it is possible to correct some errors, e.g., single-bit errors up to single-device errors, and recreate the proper data.
Despite the advances made in terms of error detection and correction, however, one significant limitation of the aforementioned techniques is that such techniques are not configured to directly verify, immediately after a load operation, whether correct data is stored in a register as a result of that operation. A memory load operation reads a datum from memory, loads it in a processor register, and frequently starts a sequence of operations that depend on the datum loaded. Incorrect data may be written to the processor register if, for example, the processor executes the load operation at the same memory location more than once for the same data value and data at the memory location is not flushed, improperly flushed, overwritten, and/or erroneously modified between the load operations.