Many single-processor computers undergo a "boot" routine during start-up. The boot routine is controlled by a computer program and performs tasks such as (1) loading an operating system, (2) checking memory for defects, and (3) loading software drivers needed for equipment associated with the computer.
If the processor in such a computer is defective, that fact soon becomes apparent: the processor is required to run the boot program, so the boot process fails. In multi-processor computers, however, a faulty processor is not so readily detected, and the presence of faulty processors can create complications.
For example, in a multi-processor computer, the processor assigned to handle booting may be fully operative while another processor is defective. Thus, unlike in the single-processor case, a successful boot does not indicate that all processors within the system are operable.
Further, at least two failure modes are possible in the defective processor. In one mode, the failed processor becomes completely dead: it simply behaves as an open circuit to all its inputs and outputs. In this mode, the processor behaves as if nonexistent, and is not necessarily a danger to the operation of the computer. In the other mode, the processor may act as a short circuit, or, worse, it may actively undertake unwanted processing steps, thereby interfering with the operation of other processors in the computer.
As another example, some computers are equipped with multiple processors of the same type. If the booting routine is assigned exclusively to a specific processor, such as processor number 1 for every boot-up, then a paradox can arise. If processor number 1 fails to complete boot-up, the computer becomes unusable, even though other, similar processors are available that could handle the boot routine.
These examples illustrate a need for detecting failed processors in multi-processor computers.