Machine check architecture is a technique that may be used in modern computing systems to detect failures, such as failures in memory. In many computing systems, memory is protected by an error correcting code (ECC). The error correcting codes typically in use can detect and correct single-bit errors. In the case of a multi-bit error, the ECC may be able to detect the error but is unable to correct it. In a system utilizing machine check architecture, when a memory location containing a non-correctable error is discovered, a special value, called a poison signature or simply poison, is placed in the memory location. The poison signature is typically generated by manipulating the ECC bits. The non-correctable error may be identified by the memory controller, a CPU cache, an I/O device, or any number of other elements, and any of those elements can generate a poison signature in the data.
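The difference between a correctable single-bit error and a detectable but non-correctable multi-bit error can be illustrated with a small sketch. The code below is an illustration only: it assumes a toy Hamming(7,4) code extended with an overall parity bit, which is far narrower than the codes used by real memory controllers, and the names `encode` and `decode` are hypothetical. It does not model how any particular hardware forms a poison signature; the "uncorrectable" result merely marks the point at which poison would be written.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit codeword (list of bits).

    Index 0 holds an overall parity bit; indices 1..7 follow the
    Hamming(7,4) layout with check bits at positions 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of 1 bits even
    return code


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    # XOR of the positions of all set bits is 0 for a valid codeword
    # and equals the flipped position after a single-bit error.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    parity_odd = sum(code) % 2 == 1

    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        # Odd number of flips: treat as a single-bit error and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two bits flipped.
        # The error is detected but cannot be located, so it cannot be fixed.
        return "uncorrectable", None

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

In this sketch, flipping one bit of `encode(0b1011)` still decodes to the original value with status `corrected`, while flipping two bits yields `uncorrectable`, which corresponds to the case in which a poison signature would be placed in the memory location.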
A system may be able to run for a long period of time even in the presence of poison, because poison has no effect until the poisoned location is used. However, when the central processing unit (CPU) attempts to use a memory location containing poison, a machine check exception may be generated by the processor. The machine check exception may be intercepted by the firmware or operating system of the computing device. In some cases, no corrective action can be taken and the entire system may crash. In other cases, the operating system may be able to take recovery action. For example, the operating system may kill the specific process that was using the memory location found to contain poison.
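The recovery path described above can be sketched as a small simulation. This is a minimal model under stated assumptions, not a description of any real firmware or operating system: `Memory`, `MachineCheckError`, and `run_processes` are hypothetical names, poison is represented by a sentinel object rather than by manipulated ECC bits, and a Python exception stands in for the machine check exception.

```python
POISON = object()  # stand-in for a poison signature (assumption for this sketch)


class MachineCheckError(Exception):
    """Simulated machine check exception raised when poison is consumed."""

    def __init__(self, address):
        super().__init__(f"poison consumed at address {address:#x}")
        self.address = address


class Memory:
    """Toy memory in which any location may hold data or poison."""

    def __init__(self):
        self.cells = {}

    def write(self, address, value):
        self.cells[address] = value

    def poison(self, address):
        # Called when a non-correctable error is found at this location.
        self.cells[address] = POISON

    def load(self, address):
        value = self.cells[address]
        if value is POISON:
            # Poison is harmless until a load actually consumes it.
            raise MachineCheckError(address)
        return value


def run_processes(memory, processes):
    """Run each process (a name mapped to the addresses it reads).

    A process that consumes poison is killed; the others complete,
    which models recovery without crashing the whole system.
    """
    outcome = {}
    for name, addresses in processes.items():
        try:
            for address in addresses:
                memory.load(address)
            outcome[name] = "completed"
        except MachineCheckError:
            outcome[name] = "killed"
    return outcome
```

For example, if address `0x20` is poisoned and only process `B` reads it, `run_processes` reports `B` as killed while a process `A` that reads only `0x10` completes. The poisoned location causes no failure at all if no process ever loads it.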