Digital integrated circuits may be exposed to ionizing particles or radiation that generate transient errors in the circuit and disrupt its nominal operation.
In the case of airborne or aerospace applications, digital integrated circuits are exposed to the effects of ionizing radiation originating from the exterior environment. To a lesser extent, even in applications integrated into ground systems, the packages in which the circuits are encapsulated emit alpha radiation that creates errors at a rate that increases with the complexity of the circuits and the number of logic gates that they contain.
The effects of radiation on a circuit may be cumulative or single-event. In the first case, the defects generated by the radiation gradually accumulate until, above a certain total dose threshold, logic errors are generated. In the second case, a single ionizing particle or a single event may generate an immediate defect that may affect the memories, flip-flops or logic gates that the circuit contains. If the defect inverts a single bit, it is referred to as a single event upset (SEU), whereas if the defect affects a plurality of bits, it is referred to as a multiple-bit upset (MBU). The effects of radiation may also take the form of pulses or parasitic signals that propagate through the circuit and affect combinational logic via their presence on an electric wire or a logic gate; this is referred to as a single event transient (SET). These errors (SEU, MBU, SET) are reversible and, in both combinational and sequential logic, affect only the interpretation of electrical signals, the net result being logic errors that affect the circuit.
The problem addressed by the invention is that of protecting a digital circuit from reversible errors produced by radiation, without excessively penalizing the logic complexity or the power dissipation of the circuit, and in a way that is transparent for the service provided or the mission accomplished by the circuit, in particular as regards the execution rate of processing operations, and in particular without service interruption.
Several solutions for mitigating the effects of radiation on a digital integrated circuit are known.
A first solution consists in using metal shielding to limit the interaction of ionizing radiation or particles with the substrate of the chip of the integrated circuit. For reasons of bulk and weight, however, the thickness of the shielding must remain small, which prevents the circuit from being completely protected from transient errors. This first solution therefore proves to be unsatisfactory.
A second solution consists in using a silicon-on-insulator (SOI) technology instead of bulk silicon to produce the integrated circuit. This technology allows transient errors to be decreased by virtue of the use of highly resistive substrates; however, it results in a higher manufacturing cost without completely preventing the effects of radiation.
A third known solution is based on the use of a specific library of logic gates that intrinsically incorporates a certain level of redundancy by virtue of a specific design of the logic gates. This solution depends on the integrated-circuit manufacturer and corresponds to one particular technology. In order to make the logic tolerant to parasitic pulses, the design of the logic gates provides larger margins in the design of the transistors (higher capacitances and slower rise times), cells for filtering parasitic pulses, and redundancy in looped structures. One drawback of this solution is that it is specific to a given type of technology, so that its range of application is limited, and that its performance is lower. Specifically, integrating redundancy into the logic structures increases the amount of space occupied, increases power consumption and lowers execution speed.
Another solution consists in implementing redundancy at the functional level of the circuit, that is, in its architecture. For example, it is known to protect the content of memories using error-detecting and error-correcting codes such as the extended Hamming code.
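By way of purely illustrative example, the principle of an extended Hamming code may be sketched as follows for a 4-bit data word protected by 4 check bits. The word width, the bit layout and the function names `encode` and `decode` are assumptions made for the illustration and are not taken from any particular memory design. A single inverted bit is corrected, whereas two inverted bits are detected but not corrected.

```python
from typing import List, Optional, Tuple


def encode(nibble: int) -> List[int]:
    """Encode 4 data bits into an 8-bit extended Hamming codeword.

    Index 0 holds the overall parity bit; indices 1..7 are the usual
    Hamming positions (parity at 1, 2, 4 and data at 3, 5, 6, 7).
    """
    code = [0] * 8
    code[3] = (nibble >> 0) & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the parity of the whole word even
    return code


def decode(code: List[int]) -> Tuple[Optional[int], str]:
    """Return (data, status); data is None when the word is uncorrectable."""
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position
    overall_parity = sum(code) % 2
    word = list(code)
    if syndrome == 0 and overall_parity == 0:
        status = "ok"
    elif overall_parity == 1:
        # Odd overall parity: a single bit is wrong, located at `syndrome`
        # (syndrome 0 designates the overall parity bit itself).
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits are wrong.
        return None, "double error detected"
    data = word[3] | (word[5] << 1) | (word[6] << 2) | (word[7] << 3)
    return data, status
```

In this sketch, inverting any one element of `encode(n)` still yields `n` after decoding, while inverting any two elements yields the status "double error detected".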
With regard to protecting the logic gates of a circuit, the triple-modular-redundancy (TMR) technique is also known, which allows an error on one instance of a function among a set of three instances of the same function operating in parallel to be corrected. This principle may be applied to a flip-flop or to a combinational logic array or even to a function. It allows sequential logic and combinational logic to be protected from an error occurring on one of the three instances. In contrast, if two errors occur simultaneously on two instances, they cannot be corrected. Moreover, one drawback of this solution is that it is very costly in terms of logic complexity and of power dissipation.
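By way of illustration, the voting principle of triple modular redundancy may be sketched as a bitwise two-out-of-three majority function. The names `majority_vote` and `flip_bit` and the 8-bit example value are assumptions made for the illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of the outputs of three redundant instances."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model an upset by inverting one bit of an instance's output."""
    return word ^ (1 << bit)


expected = 0b10110010

# An upset on one instance is masked by the two others.
single_fault = majority_vote(expected, flip_bit(expected, 3), expected)

# The same bit upset on two instances outvotes the correct instance.
double_fault = majority_vote(flip_bit(expected, 3), flip_bit(expected, 3), expected)
```

Here `single_fault` equals `expected`, whereas `double_fault` carries the erroneous bit, which illustrates why two simultaneous errors on two instances cannot be corrected.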
The techniques for detecting and correcting transient errors described in the article “Power consumption improvement with residue code for fault tolerance on SRAM FPGA, Frédéric Amiel et al., ISEP” are also known. This article presents methods for detecting and correcting transient errors via replication of the function and comparison of the results, and methods for detecting errors via modulo projection of the function and comparison of the results.
The article “Designing fault-tolerant techniques for SRAM-based FPGAs, F. Gusmao de Lima Kastensmidt, IEEE design & test of computers”, which presents the effect of radiation on SRAM-based FPGA integrated circuits and a conventional mode of protection via triple modular redundancy (TMR), is also known. This article also proposes a technique for protecting combinational logic that is less costly than the TMR method, with:
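By way of illustration, error detection via modulo projection may be sketched as follows for a multiplication. The choice of modulus 3, the function name `checked_multiply` and the `fault_bit` parameter used to inject an error are assumptions made for the illustration; the sketch does not reproduce the specific circuits of the cited article.

```python
MODULUS = 3  # assumed check modulus; no power of two is divisible by 3


def checked_multiply(a: int, b: int, fault_bit=None):
    """Multiply two non-negative integers and check the result by residue.

    The full-width product is compared, modulo MODULUS, with the product of
    the operand residues, which is computed by a much smaller function.
    `fault_bit`, when given, inverts one bit of the full-width product to
    model a transient error.
    """
    product = a * b
    if fault_bit is not None:
        product ^= 1 << fault_bit  # injected upset on the main data path
    expected_residue = ((a % MODULUS) * (b % MODULUS)) % MODULUS
    is_consistent = (product % MODULUS) == expected_residue
    return product, is_consistent
```

Inverting a single bit changes the product by plus or minus a power of two, which is never a multiple of 3, so any single-bit error on the product is detected in this sketch. The check only detects the error; it does not indicate which bit is wrong.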
replication of combinational circuits for the purposes of error detection;
modification of these circuits in order to allow, in case of error, the calculation to be replayed in an additional cycle using encoded operands, the result being decoded and then compared with the first result in order to identify which of the two instances is erroneous.
The sequential logic remains protected by the TMR method.
This protection technique has the drawback of being too costly in terms of logic resources because of the profound modification of the entire combinational logic and of the use of the TMR method to protect the flip-flops, and of significantly decreasing the speed of the circuits.
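By way of illustration, the detect-then-replay principle described above (duplication for detection, then replay with encoded operands to identify the erroneous instance) may be sketched as follows for an adder. The fault model (one output bit stuck at 1), the encoding (left shift of the operands by one bit, decoded by a right shift) and the names `make_adder` and `protected_add` are assumptions made for the illustration and do not reproduce the circuits of the cited article.

```python
def make_adder(stuck_bit=None):
    """Return a hypothetical adder instance.

    When `stuck_bit` is given, that bit of the adder output is forced to 1,
    which models a fault that persists across the replay cycle.
    """
    def add(a: int, b: int) -> int:
        total = a + b
        if stuck_bit is not None:
            total |= 1 << stuck_bit
        return total
    return add


def protected_add(adders, a: int, b: int):
    """Add with two instances; on mismatch, replay to find the faulty one.

    Returns (result, faulty_index), where faulty_index is None when the two
    instances agree.
    """
    first = [add(a, b) for add in adders]
    if first[0] == first[1]:
        return first[0], None
    # Additional cycle: encode the operands, compute, then decode the result.
    replay = [add(a << 1, b << 1) >> 1 for add in adders]
    # An instance whose replayed result differs from its first result is
    # considered erroneous.
    suspects = [i for i in range(2) if replay[i] != first[i]]
    if len(suspects) != 1:
        raise RuntimeError("erroneous instance could not be identified")
    healthy = 1 - suspects[0]
    return first[healthy], suspects[0]
```

For example, with `adders = [make_adder(), make_adder(stuck_bit=2)]`, calling `protected_add(adders, 1, 2)` detects the mismatch, replays the addition, and returns the result of instance 0 together with the index 1 of the erroneous instance. The sketch also makes the cost of the approach apparent: every protected function must be duplicated and extended with encoding, decoding and comparison logic, and an additional cycle is consumed whenever an error is detected.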
The invention aims to overcome the limitations of the aforementioned prior-art solutions by providing a solution for protecting a digital integrated circuit that is of low complexity and that allows all the logic resources of a function implemented by a circuit to be protected without interrupting the service and without any impact on the execution rate of the function as observable at the input and output interfaces.