1. Field of the Invention
The present invention relates to computer processors and, in particular, to methods for transferring state information from an active processor to a spare processor.
2. Description of the Related Art
Computer processors are used for a wide variety of purposes. In a “sparing scheme,” at least two identical processors and related components such as memory are employed. In such a scheme, one processor is active, and the spare (or spares) can take over in the event of failure of the active processor; the spare may also be switched over to periodically to minimize the chance of dual processor failure. Sparing schemes are typically used in systems that require a high degree of reliability, such as telephony systems. For example, telephone switching or multiplexing components require a high degree of reliability (ability to tolerate faults, or fault tolerance) and thus may employ a sparing scheme. A sparing scheme with one spare processor is said to be 1:1 protected. When more than one (in general, N) spare processors are used, the active processor is N:1 protected.
In some embodiments, the spare processor is switched over to only when necessary, for example when the active processor fails. In that case, however, the system is at risk of a dual processor failure: the spare may already have failed, or may fail at the same time as the active processor, and this goes undetected until the switch is attempted. In other implementations, a spare processor is therefore switched over to periodically. A failed spare is then detected at the next scheduled switch, and the active processor can remain active until the spare is repaired or replaced.
When an active processor transfers the active state to a spare, the active processor becomes the spare and vice-versa, until the next such processor switch. Each processor typically includes internal memory (registers) and external memory such as random access memory (RAM). The external RAM and other devices such as hardware are part of the external memory mapping. An active processor is in a particular state at any given moment in time (clock cycle), depending upon the content of all of the memory associated therewith, including the internal registers, and the state of external RAM and other memory mapped devices. The state may be considered to be the combination of the “internal” state and the “external” state, where the internal state is that associated with the contents of internal registers, and the external state is that associated with the contents of external RAM and other memory mapped devices. The internal state includes general purpose registers, the program counter, the stack pointer, and related items not addressable by the external memory mapping. The external state includes the content or states of all units mapped by memory, which includes the external RAM as well as various memory-mapped devices of the processor which are written to with data or instructions.
Therefore, when a spare processor is activated, to pick up processing where the previous active processor left off, it must assume or have the identical state which the previous active processor was in just before the switch, or at the moment of the switch. This means that the internal registers and the external RAM of the spare processor need to have the same information as that stored in the corresponding RAM and registers of the previous or current active processor.
So-called “loosely-coupled” sparing schemes are sometimes employed. In such schemes, the spare processor is fed the same inputs (e.g., commands and outside stimulus) and thus actually executes the same processing so that its state is equivalent to that of the active processor, in some cases with some delay. In this scheme, it is not necessary to copy the state of the active processor's internal registers, because the internal state of the spare presumably mimics that of the active processor (with some delay), because it is running the same program(s) on the same stimulus. However, in a loosely-coupled sparing scheme, the spare processors are not always exactly synchronized with the state of the active processor due to time delays and other data mismatching problems. Moreover, additional code must often be written for the spare processor to account for the fact that it does not have actual control of some external devices. In many places in the code, therefore, the spare processor must know that it is not active, which adds complexity to code design.
“Tightly-coupled” sparing schemes are also sometimes used. In a tightly-coupled sparing scheme, the external RAM of every spare processor is updated each time there is any write to the external RAM of the active processor. Thus, the external RAMs of the spare processors are always in synchronization with the external RAM of the active processor. However, unlike in a loosely-coupled sparing scheme, the spare processors in a tightly-coupled sparing scheme are typically powered off (“asleep”), for example to save power. A sleeping spare processor is not operating on the data, so its internal state is not the same as that of the active processor. The sleep state is sometimes referred to as the low power stopped state.
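The write duplication that keeps the external RAMs synchronized in a tightly-coupled scheme can be illustrated with a small software model. This is only a sketch under stated assumptions: in a real system the duplication is typically performed by hardware, and the class `MirroredRam` and its methods are hypothetical names introduced here for illustration.

```python
class MirroredRam:
    """Toy model (hypothetical): every write to the active processor's
    external RAM is duplicated into the external RAM of each spare."""

    def __init__(self, size, num_spares=1):
        self.active = bytearray(size)
        self.spares = [bytearray(size) for _ in range(num_spares)]

    def write(self, addr, value):
        # The write lands in the active RAM and in every spare RAM.
        self.active[addr] = value
        for ram in self.spares:
            ram[addr] = value

    def read(self, addr):
        # Reads are served from the active processor's RAM only.
        return self.active[addr]

    def in_sync(self):
        # True when every spare RAM holds the same bytes as the active RAM.
        return all(ram == self.active for ram in self.spares)


ram = MirroredRam(size=16, num_spares=2)
ram.write(3, 0x7F)
ram.write(10, 0x01)
```

Note that the model covers only the external state. Nothing in it touches the registers of a sleeping spare, which is why the internal state of the spare still differs from that of the active processor.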
Therefore, in such a scheme, in order to switch from the current active processor to the spare processor, the internal registers of the spare processor need to be loaded with the identical information stored in the corresponding internal registers of the current active processor. This transfer of memory contents may be referred to as a state transfer, since the (internal) state of a currently active processor is transferred to the spare processor to be activated. Each processor switch in a tightly-coupled sparing scheme thus requires a state transfer.
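A state transfer of this kind can be pictured as copying a register set from one processor to another and then swapping roles. The following is a simplified sketch, not an implementation for any particular architecture: the `Processor` class, the register names, and the `state_transfer` function are assumptions made for illustration.

```python
from dataclasses import dataclass, field


def _blank_registers():
    # Hypothetical register file: program counter, stack pointer, two GPRs.
    return {"pc": 0, "sp": 0, "r0": 0, "r1": 0}


@dataclass
class Processor:
    name: str
    awake: bool = False
    registers: dict = field(default_factory=_blank_registers)


def state_transfer(active, spare):
    """Load the spare's internal registers with the active processor's
    values, then wake the spare and put the old active processor to sleep."""
    spare.registers = dict(active.registers)  # copy, not a shared reference
    spare.awake = True
    active.awake = False
    return spare, active  # (new active, new spare)


a = Processor("A", awake=True,
              registers={"pc": 0x4000, "sp": 0x8FF0, "r0": 7, "r1": 9})
b = Processor("B")
new_active, new_spare = state_transfer(a, b)
```

The sketch hides the practical difficulty: on real hardware the copying code itself uses registers while it runs, so it cannot take a clean snapshot as simply as `dict(active.registers)` does here.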
Unfortunately, it can be difficult to configure a tightly-coupled sparing scheme to copy the state of the internal registers. The sparing scheme must be encoded as part of whatever overall programs are programmed into the processors. The code developer must take into account several factors to ensure that the state transfer (especially of internal register data) is done properly. For example, depending upon the processor architecture, operating system (OS) characteristics, and nature of other programs run on the processor which are part of the code, registers may have to be saved in a particular order to ensure an accurate state transfer, in part because the registers themselves are used during the state transfer, thus complicating the transfer.
Accordingly, it can be difficult to configure the internal state transfer aspect of the code. Further, this difficult and time-consuming programming task may need to be repeated for each new processor, OS, or program employed.
A computer system comprises an active processor and one or more spare processors. A dummy thread is created on the active processor while the active processor is running a current thread, causing the current thread to become dormant. The dummy thread is a special disposable thread whose only purpose is to hand over control to the spare processor which is to become active. The operating system of the active processor saves a set of internal state information for the current thread into a first memory coupled to the active processor. Each write to the first memory is duplicated in an equivalent spare memory coupled to each of the one or more spare processors. The dummy thread then activates the spare processor and de-activates the active processor. The internal state information of the dummy thread need not be saved, because the dummy thread is discarded after it has performed its sole function of displacing the current thread (causing the current thread's internal state information to be saved) and handing over control to the spare processor.
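The sequence summarized above can be walked through with a toy simulation. It is a sketch under stated assumptions rather than the claimed implementation: the names `Cpu`, `System`, `context_switch_to_dummy`, and `dummy_thread_body` are hypothetical, memory is modeled as a dictionary, and the step in which the newly active processor resumes the dormant thread from its own memory copy is an assumption about how processing continues after the switch.

```python
import copy


class Cpu:
    def __init__(self, name):
        self.name = name
        self.active = False
        self.registers = {}   # internal state of the running thread
        self.memory = {}      # external memory, holds saved thread contexts
        self.running = None   # name of the running thread


class System:
    def __init__(self):
        self.active_cpu = Cpu("A")
        self.spare_cpu = Cpu("B")
        self.active_cpu.active = True

    def write(self, key, value):
        # Each write to the active memory is duplicated in the spare memory.
        self.active_cpu.memory[key] = copy.deepcopy(value)
        self.spare_cpu.memory[key] = copy.deepcopy(value)

    def context_switch_to_dummy(self):
        cpu = self.active_cpu
        # Ordinary context switch: the OS saves the current thread's
        # registers to memory, which mirrors them to the spare.
        self.write(("context", cpu.running), cpu.registers)
        self.write("dormant_thread", cpu.running)
        cpu.running = "dummy"
        cpu.registers = {"pc": 0}  # the dummy thread's own, disposable state

    def dummy_thread_body(self):
        old, new = self.active_cpu, self.spare_cpu
        new.active = True          # activate the spare
        old.active = False         # de-activate the old active processor
        old.running = None
        old.registers = {}         # dummy state is discarded, never saved
        # Assumption: the new active processor restores the dormant thread
        # from its own, already synchronized, memory.
        thread = new.memory["dormant_thread"]
        new.registers = dict(new.memory[("context", thread)])
        new.running = thread
        self.active_cpu, self.spare_cpu = new, old


system = System()
system.active_cpu.running = "worker"
system.active_cpu.registers = {"pc": 0x4000, "sp": 0x8FF0, "r0": 42}
system.context_switch_to_dummy()
system.dummy_thread_body()
```

The point of the design is that no register-copying code is written specifically for the switch. The internal state reaches the spare as a side effect of the context save that the operating system already performs whenever one thread displaces another.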