1. Field of the Present Invention
The present invention is in the field of data processing systems and, more particularly, data processing systems employing error correction in their memory subsystems.
2. History of Related Art
Error correction code (ECC) circuitry is used to detect and correct single bit errors that occur within a data processing system. ECC is most widely implemented in conjunction with main memory subsystems. In systems that employ ECC circuitry, the processor(s) may include dedicated hardware for counting the number of correctable errors detected and for initiating an interrupt procedure in response to the contents of an error correction status register. In other processors, however, ECC may be implemented without these dedicated resources. The Opteron® processor from Advanced Micro Devices, for example, integrates a memory controller that uses ECC into the processor but does not incorporate an ECC count register or an ECC status register capable of initiating an interrupt.
For purposes of predictive failure analysis, which seeks to anticipate and prevent significant system failures such as those involving data loss, it is highly desirable to monitor the number of correctable errors and to take action when the number or pattern of such errors is symptomatic of a more serious condition such as a hard failure or a persistent source of error. It would therefore be desirable to provide a mechanism and method that would enable a processor or system that uses ECC to issue an alert or take other appropriate action based upon the number and source of correctable errors. It would be further desirable if the implemented solution provided a method for reliably determining when a particular location in memory is exhibiting error behavior warranting additional consideration, without compromising system performance by flooding the processor with error correction status queries.
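By way of illustration only, the kind of bookkeeping described above (counting correctable errors by source and acting when a count becomes symptomatic) might be sketched in software as follows. This is a minimal sketch under stated assumptions and is not the mechanism of the present invention: the class name `CorrectableErrorMonitor`, the threshold of three, and the sample addresses are all hypothetical and do not come from this document.

```python
from collections import Counter


class CorrectableErrorMonitor:
    """Hypothetical software tally of correctable ECC errors per memory location."""

    def __init__(self, threshold):
        # Assumed alert threshold; the value is illustrative, not from the source.
        self.threshold = threshold
        # Maps a memory location to the number of correctable errors seen there.
        self.counts = Counter()

    def record(self, location):
        """Record one correctable error; return True once the location's count reaches the threshold."""
        self.counts[location] += 1
        return self.counts[location] >= self.threshold

    def suspects(self):
        """Return, in ascending order, the locations whose count has reached the threshold."""
        return sorted(loc for loc, n in self.counts.items() if n >= self.threshold)


# Made-up sequence of addresses at which correctable errors were observed.
monitor = CorrectableErrorMonitor(threshold=3)
events = [0x1000, 0x2000, 0x1000, 0x1000, 0x3000]
alerts = [monitor.record(addr) for addr in events]
```

In this sketch only the third error at address 0x1000 returns an alert, so isolated errors at other addresses do not trigger action. How and how often the error events are gathered from the hardware is deliberately left out, since that is the performance concern noted above.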