The correct operation of electronic components, typically integrated circuits, can be disturbed by the environment in which they operate, for example a natural or artificial radiation environment or an electromagnetic environment. External aggressions create parasitic currents by interacting with the component's constituent material. These currents can cause a transient or permanent malfunction of the component and of the application that uses it.
For a natural radiation environment, these effects, generically known as single event effects, are created by particles. For example, heavy ions and protons in space affect the electronic equipment of satellites and launch vehicles. At lower altitudes, where aircraft fly, neutrons are present and also create single event effects. On the ground, such aggressions can also occur and affect electronic components, whether they are due to particles from the natural environment, to radioactive particles present in the packages, or to problems related to immunity, signal integrity, thermal instabilities and methods. In the following paragraphs, the effects originating from particles are considered in more detail; however, the invention also applies to the same types of effects created by other, varied environments.
In general, different types of single event effects can be distinguished:
    transient faults: a parasitic current created by ionisation causes either a transient current which propagates through the circuit, or a change of one or several electrical states (in the case of a memory or register). In the latter case, the effect is described as transient because, if the content of the memory or register is rewritten, the error disappears;
    permanent faults, which require an intervention not provided for within the normal operating conditions of the application, for example reconfiguring the software or acting on the power supply (shutdown then restart). Following this intervention, the component operates correctly;
    destructive effects, which lead to the definitive shutdown of the component.
Not all of the faults produced in the component have an immediate or delayed effect on the application, as the different resources of the component are not necessarily used or solicited at the same time. There is therefore a problem in determining whether a fault produced in the component has a harmful effect on a software application run by this component, mounted on an electronic board, or whether the application is able to tolerate it.
In addition, the equipment or system architecture can offer a certain level of protection. Embedded applications therefore have a certain level of fault tolerance, which should be quantified. No such quantification is currently available.
A certain number of methods and techniques at the level of the equipment, the operating system and the application software enable an embedded application to be protected against transient and permanent faults. These are called mitigation techniques. The invention relates more specifically to a method for assessing and validating the fault tolerance of an application, and the mitigation techniques applied to it, with regard to the transient and permanent effects which affect logic and analogue electronic components.
An electronic component can comprise, among other things, a user memory area, a memory area required for its configuration, software resources enabling operations to be performed, resources required for communication between the different logic blocks, and resources required for communication between the component and its environment.
Applications based on logic or analogue components have a certain level of fault tolerance, that is to say that some faults created in the silicon have no visible consequence on the application. For example, if the state of a memory cell changes and this cell is not used by the application before being rewritten, no error is produced in the application. There is therefore an important difference between testing a component, which reveals a malfunction, and testing an application, which, under the same conditions, does not malfunction.
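The memory cell example above can be illustrated by a minimal sketch. It assumes that the application is represented by a simple trace of read and write accesses; the names `fault_is_masked` and `masking_ratio`, and the trace format, are hypothetical and introduced only for illustration. A bit flip is considered masked if the affected cell is rewritten, or never accessed again, before being read.

```python
from typing import List, Tuple

# Hypothetical trace format: ("r" or "w", cell address), in program order.
Access = Tuple[str, int]


def fault_is_masked(trace: List[Access], cell: int, inject_step: int) -> bool:
    """Return True if a bit flip injected into `cell` just before
    step `inject_step` never reaches the application."""
    for op, addr in trace[inject_step:]:
        if addr != cell:
            continue
        # The first access to the corrupted cell decides the outcome:
        # a write overwrites the fault, a read propagates it.
        return op == "w"
    # The cell is never accessed again, so the fault stays invisible.
    return True


def masking_ratio(trace: List[Access], n_cells: int) -> float:
    """Fraction of (injection step, cell) pairs for which the fault is masked."""
    total = 0
    masked = 0
    for step in range(len(trace) + 1):
        for cell in range(n_cells):
            total += 1
            masked += fault_is_masked(trace, cell, step)
    return masked / total


if __name__ == "__main__":
    trace = [("w", 0), ("r", 0), ("w", 1), ("r", 1)]
    print(fault_is_masked(trace, 0, 0))  # cell 0 is rewritten before being read
    print(fault_is_masked(trace, 0, 1))  # cell 0 is read while corrupted
    print(masking_ratio(trace, n_cells=2))
```

Under these assumptions, a component-level test would count every injected flip as a fault, whereas the application-level view counts only those flips that are read before being overwritten.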
Similarly, in combinational logic (for example at the heart of a microprocessor), a parasitic current can propagate through a series of logic gates, then subside and disappear without ever being stored in a register. However, although all applications have a certain level of fault tolerance, the designer faces the problem of quantifying this tolerance level so as to apply an appropriate level of mitigation.
Numerous mitigation methods can be implemented so as to limit, prevent, detect and/or correct the effects which can cause transient and permanent faults in the application.
Some known methods aim at detecting and/or correcting the faults which can appear in logic circuits, so as to prevent failures from occurring in the application using the component. Error correction codes, which enable one or several errors to be detected and corrected, can be cited as an example. The most complex error correction codes can detect and correct several errors simultaneously. Other mitigation techniques include the periodic rewriting of data, or the periodic verification of data liable to be corrupted, followed by the rewriting of this data if an error is detected.
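By way of illustration, the following sketch combines a simple error correction code with periodic verification and rewriting. It assumes a Hamming(7,4) code, chosen here only as a well-known single-error-correcting code and not taken from the present description; the function names and the list-of-bits memory model are hypothetical.

```python
from typing import List, Tuple

Bits = List[int]


def hamming74_encode(data: Bits) -> Bits:
    """Encode 4 data bits into a 7-bit codeword (parity at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(word: Bits) -> Tuple[Bits, int]:
    """Return the corrected word and the 1-based position flipped (0 if none)."""
    b = list(word)
    s1 = b[0] ^ b[2] ^ b[4] ^ b[6]
    s2 = b[1] ^ b[2] ^ b[5] ^ b[6]
    s4 = b[3] ^ b[4] ^ b[5] ^ b[6]
    pos = s1 + 2 * s2 + 4 * s4  # the syndrome gives the erroneous position
    if pos:
        b[pos - 1] ^= 1
    return b, pos


def hamming74_decode(word: Bits) -> Bits:
    """Extract the 4 data bits from a (corrected) codeword."""
    return [word[2], word[4], word[5], word[6]]


def scrub(memory: List[Bits]) -> int:
    """Periodic verification: check every word and rewrite those found in error.
    Returns the number of words rewritten."""
    rewritten = 0
    for i, word in enumerate(memory):
        fixed, pos = hamming74_correct(word)
        if pos:
            memory[i] = fixed
            rewritten += 1
    return rewritten


if __name__ == "__main__":
    memory = [hamming74_encode([1, 0, 1, 1]), hamming74_encode([0, 1, 1, 0])]
    memory[0][4] ^= 1  # single bit flip in the first word
    print(scrub(memory), hamming74_decode(memory[0]))
```

A code of this kind corrects a single flipped bit per word; if two bits of the same word are flipped, the syndrome points to a third position and the word is miscorrected, which is why multiple errors call for more elaborate codes.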
There are also methods at the level of the board, equipment or system which do not detect or correct a fault, but prevent it from causing system failures. Redundancy methods (most often triplication) combined with a voting system can be cited as an example. These methods are based on redundancy, either physical (several circuits are produced) or temporal, of all or part of the resources performing the application operations. A voting system, positioned upstream and on the supposition that an error appears in only one of the replicated resources, prevents the error from having a consequence on the operation performed by the component or the board.
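The voting principle can be sketched as follows, assuming triplication and a bitwise majority vote over integer values; the function names `majority_vote` and `run_triplicated` are hypothetical and serve only as an illustration of physical and temporal redundancy respectively.

```python
from typing import Callable


def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three majority over three replicated values."""
    return (a & b) | (a & c) | (b & c)


def run_triplicated(operation: Callable[[int], int], value: int) -> int:
    """Temporal redundancy: run the same operation three times and vote."""
    return majority_vote(operation(value), operation(value), operation(value))


if __name__ == "__main__":
    good = 0b1010
    corrupted = good ^ 0b1000  # one replica suffers a single bit flip
    print(bin(majority_vote(good, good, corrupted)))       # fault is outvoted
    print(bin(majority_vote(good, corrupted, corrupted)))  # same bit hit twice
```

The second call shows the limit of the supposition stated above: if the same bit is corrupted in two replicas, the voter propagates the erroneous value.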
Multiple errors are also possible and are becoming more and more frequent in new memory technologies. Correcting these errors requires much more elaborate error correction codes (Reed-Solomon type error correction codes), which are detrimental to application performance. When possible, resources are physically separated so as to prevent a single event from modifying two physically close resources at the same time. Nevertheless, this separation requires perfect knowledge of the logic architecture of the component, which is not always available to the designer.
Finally, beyond the component itself, mitigations can be implemented at the level of the operating system, the application software, the electronic equipment architecture and, at the upper level, the overall system architecture.
All of the methods described above can be combined so as to optimise the level of protection of the component and/or of the operation that it performs.
Nevertheless, implementing all of these methods is not an easy task, as they are specific to a given component and application. Owing to their complexity, they are prone to implementation errors. In addition, their effectiveness is not necessarily known in advance. Indeed, depending on certain technological parameters, and in particular on the component's logic architecture, some mitigation techniques prove ineffective in the event of multiple errors. Yet, owing to the increasing integration of electronic components, multiple errors caused by interaction with a single particle are becoming more and more frequent. The effectiveness of the mitigation techniques implemented for the component, equipment and system must therefore be assessed. Moreover, the specific use of a component by a given application can make an otherwise validated mitigation technique ineffective.
Document PCT/US2004/022531 discloses a system based on a pulsed laser focused on the surface of an electronic component, for injecting faults into this electronic component and observing the reaction in its voltage and/or power supply. However, in this document, the component tested is not in the actual situation of running an application. In addition, in order to avoid subjecting the component to sustained aggression, this document provides for synchronising the aggression. Finally, to ensure that the effects identified above are detected, pulse durations of more than one microsecond are provided for. The measurements are therefore not realistic.