In modern computer systems, when data is determined to be erroneous, the error status can be identified by a bit or other indication associated with the data. In some systems this indicator is referred to as a “poison” bit. If a memory controller receives write data with the poison indication set, it stores that data in memory together with a set poison status indicator. Such data may originate from various locations in a system, such as an agent or a writeback from a processor core or last-level cache (LLC). If the memory controller detects an uncorrectable error correction code (ECC) error on a read, it may write a poison signature back into that memory location and set the poison indicator before forwarding the read data, and it may log an uncorrected error in the machine check banks.
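The write and read paths described above can be illustrated with a minimal software model. This is only a sketch under stated assumptions, not a description of any particular controller: the names `MemoryController`, `Line`, and `POISON_SIGNATURE`, the signature value, and the `ecc_uncorrectable` flag (which stands in for a real ECC check) are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder pattern; real poison signatures are implementation specific.
POISON_SIGNATURE = 0xDEADBEEF


@dataclass
class Line:
    """One memory location: its data plus the stored poison status indicator."""
    data: int = 0
    poisoned: bool = False


@dataclass
class MemoryController:
    memory: dict = field(default_factory=dict)
    machine_check_log: list = field(default_factory=list)

    def write(self, addr, data, poisoned=False):
        # Write data is stored together with whatever poison indication it arrived with.
        self.memory[addr] = Line(data, poisoned)

    def read(self, addr, ecc_uncorrectable=False):
        # ecc_uncorrectable simulates the outcome of the ECC check (an assumption
        # of this sketch; no actual ECC is computed here).
        line = self.memory.setdefault(addr, Line())
        if ecc_uncorrectable:
            # Write the poison signature back, set the poison indicator,
            # and log an uncorrected error before forwarding the data.
            line.data = POISON_SIGNATURE
            line.poisoned = True
            self.machine_check_log.append(("uncorrected_error", addr))
        return line.data, line.poisoned
```

In this model, a location that takes an uncorrectable error stays poisoned on later reads, so a consumer can see the poison status without the error having to be detected again.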
In some processors, if an uncorrectable ECC error or a poison status is detected when reading a memory location, a fatal machine check error is signaled, the operating system (OS) bug-checks, and the system resets. This behavior is undesirable for high-availability consolidated servers running multiple virtual machines, because a single hardware fault can bring down the entire system.