Radiation can affect the operation of electronic devices. Digital circuits operating in a high-radiation environment, such as space, are susceptible to single event effects caused by radiation exposure. For example, a high-energy particle can cause a digital storage register, or an input to a digital gate, to change state. This change in state can result in a single event failure of the design, but is not usually destructive. Many systems and processes have been designed and used to overcome the problem of radiation exposure. The different approaches can be grouped into two areas. In the first, the semiconductor process is modified with radiation-hardening process modifications so that single event failures are minimized. This solution is known as radiation-hardened by process. The use of radiation-hardened processes is costly and time-consuming to implement, and radiation-hardened processes can lag state-of-the-art commercial processes by almost a decade. In the second, a system is designed so that single event failures are first detected and their effects are then corrected under program control. This program control solution is known as radiation-hardened by design.
Previous radiation-hardened by design systems that detect and recover from single event failures have relied on redundancy. For example, three identical working processors are integrated on a single board, and each processor executes identical code. An external circuit, often itself radiation-hardened by process, monitors the relevant outputs of the working processors as they operate synchronously. When one of the working processors undergoes a single event failure, its outputs will differ from those of the two remaining working processors, which are functioning normally. In that case, the monitor circuit can reset the faulty processor and restart it so that it enters the same state as the two remaining working processors. The probability that two or all three working processors will undergo a single event failure simultaneously is very low. This radiation-hardened by design approach is known as triple majority voting. Majority voting redundancy has been used and demonstrated to mitigate single event failures on single-board computers.
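The voting step described above can be illustrated with a minimal software sketch. It is only an illustration under stated assumptions: the monitor in the text is an external hardware circuit, and the function name `majority_vote`, along with the representation of each processor output as a single comparable value, is hypothetical and not taken from any described design.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence, Tuple


def majority_vote(outputs: Sequence[Hashable]) -> Tuple[Optional[Hashable], List[int]]:
    """Vote over the outputs of redundant processors.

    Returns (agreed value, indices of dissenting processors).
    If no strict majority exists, returns (None, all indices).
    Hypothetical helper: the outputs are assumed to be directly comparable values.
    """
    if not outputs:
        raise ValueError("at least one processor output is required")

    # Count how many processors produced each distinct output.
    value, votes = Counter(outputs).most_common(1)[0]

    # A strict majority is required; with three processors that means two votes.
    if votes * 2 <= len(outputs):
        return None, list(range(len(outputs)))

    # Any processor that disagrees with the majority is treated as faulty.
    faulty = [i for i, out in enumerate(outputs) if out != value]
    return value, faulty
```

With three processors, a call such as `majority_vote([0x2A, 0x2A, 0x2B])` identifies the third processor as the dissenter, which is the one a monitor would then reset. If all three outputs differ, no majority exists and the sketch reports every processor as suspect.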
There are a number of difficulties with the triple majority voting technique. Only the external outputs of the working processors are observable, and a single event failure may take a long time to manifest itself externally and thus become observable. Also, the monitor circuit has to be redesigned for every new working processor used. This redesign incurs expense both during the design phase and, more importantly, in the fabrication of a new set of chips. The monitor circuit typically lacks the ability to access data registers within the working processors to determine their internal states, making it difficult to quickly reset a working processor to a good working state. Furthermore, additional pins often have to be brought out of the working processor to simplify the monitoring operation. This increase in output pins increases the power consumption.
Conventional single-fault detection systems have used a separate processor for each working program. To detect a failure, a monitoring processor checks the result of each of the working processors to determine whether each has the same current state, which indicates that all of the working processors are functioning correctly. When any one of the working processors has a state different from that of the remaining working processors, the monitoring processor determines which one of the working processors has failed, typically through the voting process. The failed working processor can be restarted and reset to the state of the remaining working processors so as to keep all of the working processors in the same state, while identifying and recovering from single event failures. The monitoring processor is typically a radiation-hardened monitoring processor. In this way, the working processors can recover from single event failures such as those that occur randomly through radiation exposure. As such, a single-fault detection and correction system is a radiation-hardened system.
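The detect-and-restore cycle performed by the monitoring processor can be sketched as a small simulation. Everything in it is an assumption made for illustration: the class `WorkingProcessor`, the reduction of a processor's state to one 16-bit integer, the toy program in `step`, and the function `check_and_recover` are hypothetical, and the single event failure is modeled simply as one flipped bit.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class WorkingProcessor:
    """Hypothetical stand-in for a working processor; its state is one integer."""
    state: int = 0

    def step(self) -> None:
        # All processors run the same deterministic toy program.
        self.state = (self.state * 5 + 1) & 0xFFFF

    def upset(self, bit: int) -> None:
        # Model a single event failure as a single flipped state bit.
        self.state ^= 1 << bit


def check_and_recover(processors: List[WorkingProcessor]) -> List[int]:
    """Compare states, vote, and reset any failed processor to the majority state.

    Returns the indices of the processors that were restored.
    Raises RuntimeError when no strict majority state exists.
    """
    good_state, votes = Counter(p.state for p in processors).most_common(1)[0]
    if votes * 2 <= len(processors):
        raise RuntimeError("no majority state; voting cannot identify the failure")

    failed = [i for i, p in enumerate(processors) if p.state != good_state]
    for i in failed:
        # Reset the failed processor to the state of the remaining processors.
        processors[i].state = good_state
    return failed


# Usage: three processors run in lockstep, one suffers an upset, the monitor repairs it.
procs = [WorkingProcessor() for _ in range(3)]
for p in procs:
    p.step()
procs[1].upset(bit=3)
restored = check_and_recover(procs)
```

The sketch copies the majority state directly into the failed processor. That is the step a conventional external monitor finds difficult in practice when it has no access to the internal registers of the working processors.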
Multithreaded processors are a new class of processors that have been used for transient fault recovery. Simultaneous multithreaded processors have been applied to fault recovery in terrestrial, highly reliable servers. Operational processors are now equipped with multithreading. Hardware multithreading increases the performance of a processor, such as a system processor, a microprocessor, or a digital signal processor, without increasing the operating frequency. Code compiled for multithreaded processors can explicitly schedule different instructions for different program threads. Multithreaded processors are now commercially available. They offer very high performance at a low price and leverage leading-edge commercial fabrication processes. A typical example of a hardware multithreaded digital signal processor is one made by Sandbridge Technologies. Multithreaded processors have been used to execute multiple different program threads for improved performance of a single operational processor. However, the advantages of increased performance and the ability to concurrently execute many different program threads render the multithreaded processor susceptible to single event failures. These and other disadvantages are solved or reduced using the invention.