In computer systems, Memory Poisoning refers to the process of storing a special signature in memory to identify bad or corrupted memory data and to warn the system when this bad data is eventually read, thereby enabling Enhanced Error Containment and Recovery (EECR). Several conditions can give rise to bad memory data, for example:
PCI Express packets with corrupted data received from a PCI Express endpoint performing a direct memory access write operation; or
Cache lines with corrupted data received from the last-level cache, e.g., data corrupted during a write-back operation.
In current implementations, memory poisoning involves storing a special poison signature to identify the poisoned memory data. For example, an implementation could set the data bits to all 0's and the parity bits to all 1's. In such an implementation, the poisoned data itself conveys no further meaning to the system. The main function of memory poisoning in current implementations is therefore restricted to allowing the memory controller to store corrupted data in memory as an unusable poison, so that the memory controller can recognize the corrupted data on a subsequent access, reject the request, and raise an alert so that the caller can likewise reject the data and/or take appropriate corrective action.
Because such poisoned data provides no information about the source of the poison or about how the error should be handled, the system must rely on other means, such as special logging registers (which are expensive to implement in hardware), to track the source of the error whenever the poisoned memory is eventually read (consumed). This may also involve costly and time-consuming procedures, such as scanning the entire system hardware to trace the source of the error.
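The conventional scheme described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not a description of any real memory controller: the 64-bit data word, the one-even-parity-bit-per-byte layout, and the names `ToyMemoryController`, `PoisonedReadError`, and `byte_parity` are all hypothetical and chosen for illustration. Only the signature itself (data bits all 0's, parity bits all 1's) comes from the text.

```python
DATA_BITS = 64
PARITY_BITS = 8  # assumption: one even-parity bit per data byte

# Poison signature from the text: data bits all 0's, parity bits all 1's.
# Under the assumed even-parity layout this pair is never produced by a
# normal write, because all-zero data has all-zero parity.
POISON_DATA = 0
POISON_PARITY = (1 << PARITY_BITS) - 1


class PoisonedReadError(Exception):
    """Raised when a consumer reads a location holding the poison signature."""


def byte_parity(data: int) -> int:
    """Return 8 even-parity bits, one for each byte of a 64-bit word."""
    parity = 0
    for i in range(PARITY_BITS):
        byte = (data >> (8 * i)) & 0xFF
        parity |= (bin(byte).count("1") & 1) << i
    return parity


class ToyMemoryController:
    """Hypothetical model: stores (data, parity) pairs keyed by address."""

    def __init__(self):
        self.cells = {}

    def write(self, addr: int, data: int, corrupted: bool = False) -> None:
        if corrupted:
            # Incoming data is known to be bad: store the poison signature
            # in its place. Nothing about the error's source is recorded.
            self.cells[addr] = (POISON_DATA, POISON_PARITY)
        else:
            data &= (1 << DATA_BITS) - 1
            self.cells[addr] = (data, byte_parity(data))

    def read(self, addr: int) -> int:
        data, parity = self.cells[addr]
        if (data, parity) == (POISON_DATA, POISON_PARITY):
            # Reject the request and alert the caller. The alert can only
            # report where the poison was consumed, not where it came from.
            raise PoisonedReadError(f"poisoned data consumed at {addr:#x}")
        return data
```

In this model, a write flagged as corrupted followed by a read of the same address raises `PoisonedReadError`, and the exception carries only the consuming address. That mirrors the limitation noted above: the signature alone says nothing about the source of the error or about how it should be handled.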