Many fault-tolerant systems have, up to now, been built upon so-called fault-tolerant frameworks, on which general properties are proved and which are then installed. Such frameworks may be the basis for nuclear plant, train or airplane control.
Such frameworks are neither scalable nor flexible, and they are very expensive because they rely on a high level of hardware redundancy and have hardware prerequisites, for instance a dedicated bus driver or other components (in particular verified micro-controllers with preexisting pieces of software). They are not suited to large-series production, where cost optimization is a major issue.
Attempts are being made to realize virtual prototyping, one example of which [SCHEID02] is embodied in the approach referred to as “Systems Engineering for Time Triggered Architectures” (SETTA). This can be found via the URL “http://www.setta.org”, one of whose publications is by Ch. Scheidler et al.: “Systems Engineering for Time triggered Architectures, Deliverable D7.3, Final Document, version 1.0”, XP-002264808, 18 Apr. 2002.
The time-triggered protocol (TTP) framework [Kop96] is a good example of a safety framework built for embedded electronics applications. It addresses, to a certain extent, the flexibility and scalability concerns mentioned above, but only at the level of communication between nodes.
All the examples above have a common point: a general safety critical framework is set, and the design of an application must be made within the framework and under its specific rules. The safety proofs are achieved for the whole framework and not for a particular instance of the framework. For instance, in the TTP framework, at least four nodes are required for “normal” behavior of the system, and mapping four instances of a process on the different TTP nodes will guarantee that the results of these processes will be available in time and correct for the consumers of these processes. The idea is that a general proof exists for the physical architecture and that this proof specializes for the many instances of safety dataflow and functions embedded in the system.
To give another example, [Rush95] contains a passage describing a project in which a safety critical framework, SIFT, was designed:
“In the SIFT project, several independent computing channels, each having their own processors operate in approximate synchrony; single source data such as sensors are distributed to each channel in a manner that is resistant to Byzantine (i.e. asynchronous) faults, so that a good channel gets exactly the same input data; all channels run the same application tasks on the same data at approximately the same time and the results are submitted to exact-match majority voting before being sent to the actuators”.
This is a good illustration of a safety critical framework. Note, however, that in the paragraph below in that publication the application is not even mentioned. It seems that the framework could be used for a nuclear plant, a space shuttle, or even a coffee machine. So even though the SIFT framework was built to support a flight control system, the designers wished to design a framework with “good” safety properties on which they could design their safety critical application following fixed replication, communication and voting rules.
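The exact-match majority voting step mentioned in the SIFT quotation can be illustrated with a minimal sketch. It is not taken from SIFT or from [Rush95]; the function name `exact_match_majority`, the strict-majority rule and the sample channel outputs are assumptions made purely for illustration.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def exact_match_majority(results: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the value on which a strict majority of channels agree exactly.

    Hypothetical helper: returns None when no value is produced by more
    than half of the channels (including when no results are supplied).
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    # A strict majority requires more than half of the channels to match.
    return value if count > len(results) / 2 else None


# Assumed outputs of four replicated channels, one of which is faulty.
channel_outputs = [42, 42, 17, 42]
voted = exact_match_majority(channel_outputs)
```

In this sketch the voter is independent of the application, which reflects the point made above: the same replication and voting rules apply whatever function the channels compute.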
In the document “Extending IEC-61508 Reliability Evaluation techniques to Include Common Circuit Designs Used in Industrial Safety Systems”, W. M. Goble et al., the analysis methods described in the IEC-61508 and ANSI/ISA84.01 standards are discussed. The actual effects of particular failures on the circuit functionality are considered from a safety perspective, and indicators of their severity are ascribed. Once assigned, the severity indicators are fixed.
It can therefore be seen that there is a continuing need for improved methods for designing and verifying a safety critical system, which method allows the optimization of a hardware architecture in that system.