All known computer systems are subject to hardware or software failures that affect their general operation at random instants. When such a system manages functions critical to the safety of life and property, the behaviour of the electronic system in the presence of a failure becomes a deciding factor in the overall reliability perceived by users. This behaviour defines the resilience class of the system. It depends entirely on the technical choices made while designing the system, because it relies on redundancy at the hardware level, which always implies some cost. The resilience class is therefore the result of a trade-off between minimizing cost and maximizing availability.
Several types of solutions have been developed to best meet resilience requirements at the hardware level. Four classes of hardware component failures and three classes of software component failures can be taken as significant examples of events affecting the operation of a computer.
The four classes of hardware components examined are memories, input-output circuits, power supplies and cooling components, and processors.