For computer systems with security, availability and fault tolerance requirements, redundant channels and components are typically used to meet these requirements; in this regard, see also: “Real-Time Systems: Design Principles for Distributed Embedded Applications; H. Kopetz, Kluwer Academic Publishers, 1997.” In real-time applications, these redundant channels and components must typically be implemented along the entire path from a sensor via a computer node up to an actuator. Using them results in significantly higher costs compared to non-redundant systems.
In some applications the characteristics of the application make it unnecessary to implement each of the channels or components as completely redundant. Brake-by-wire in the automotive field is cited as an example. Because a typical passenger vehicle has four braked wheels, the failure of an individual wheel brake can be tolerated. However, it must be ensured in this context that a brake never fails in such a manner that a wheel locks up or is braked too strongly, since this would cause the passenger vehicle to become unstable.
In order to be able to safeguard against such an uncontrolled failure of a component, dedicated control links, so-called breaker lines, are used between the actuators and each network node.
For high reliability systems such as brake-by-wire applications, real-time-capable communication systems are gaining acceptance which, in contrast to event-driven communication protocols (e.g. CAN), offer fault tolerance and minimal fluctuation in the transmission time (latency). TTP/C, for example, is one such time-triggered communication system; it is ready for the market and offers, in addition to the reliable transmission of data, a series of higher-level services such as clock synchronization and membership.
The term membership is to be understood in this context as a service that sends consistent information about the operating state (properly working or defective) of all nodes of the system to all nodes at defined moments, the membership points. The length and the fluctuation of the interval between a membership point and the moment at which the consistent membership information is known to the other nodes are quality parameters of the membership service. A good membership service has a small maximum delay between the moment of a relevant status change of a node and the moment at which all other nodes have learned of this status change in a consistent way. The term consistent in this connection means that all network nodes receive the same information. Thus, for example, in the case of a fault occurring in a network node, all network nodes receive the same fault signal.
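The two quality criteria named above, consistency of the membership information and the delay until a status change is known to all nodes, can be illustrated with a short sketch. The function names, the representation of a node's view as a tuple of booleans, and the time values are hypothetical and chosen only for illustration; they are not taken from TTP/C.

```python
from typing import Dict, Tuple

# Hypothetical representation: each node's view of the membership is a tuple
# of booleans, one entry per node (True = properly working, False = defective).
MembershipView = Tuple[bool, ...]


def is_consistent(views: Dict[str, MembershipView]) -> bool:
    """Return True if every node holds exactly the same membership information."""
    return len(set(views.values())) <= 1


def membership_delay(change_time: float, learned_at: Dict[str, float]) -> float:
    """Delay between a status change and the moment the last node learned of it.

    learned_at maps each observing node to the time at which it learned of the
    change; it must contain at least one entry.
    """
    return max(learned_at.values()) - change_time


if __name__ == "__main__":
    # Node C has failed; all three observers agree on that.
    views = {
        "A": (True, True, False),
        "B": (True, True, False),
        "D": (True, True, False),
    }
    print(is_consistent(views))
    # Times in microseconds (illustrative values only).
    print(membership_delay(1000, {"A": 1200, "B": 1250, "D": 1250}))
```

Under these assumptions, a smaller value returned by `membership_delay` in the worst case corresponds to a better membership service.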
The membership service in each communication controller of the network maintains a membership vector in which a bit is assigned to each transmitting node. A transmitter that is labeled in this bit vector as “available” is considered to be in the membership. In TTP/C, precisely one bit in the membership vector is statically assigned to each consistent time slot (C-slot) of a TDMA (Time Division Multiple Access) round.
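The membership vector described above can be sketched as a plain bit vector with one bit per slot of the TDMA round. This is a minimal illustration, not the TTP/C data layout: the class name, method names and the zero-based slot numbering are assumptions made for the example.

```python
from typing import List


class MembershipVector:
    """Bit vector with one statically assigned bit per slot of a TDMA round."""

    def __init__(self, num_slots: int) -> None:
        if num_slots <= 0:
            raise ValueError("num_slots must be positive")
        self.num_slots = num_slots
        self.bits = 0  # all transmitters start outside the membership

    def _check(self, slot: int) -> None:
        if not 0 <= slot < self.num_slots:
            raise IndexError(f"slot {slot} is outside the TDMA round")

    def set_available(self, slot: int, available: bool) -> None:
        """Label the transmitter of the given slot as available or not."""
        self._check(slot)
        if available:
            self.bits |= 1 << slot
        else:
            self.bits &= ~(1 << slot)

    def is_member(self, slot: int) -> bool:
        """A transmitter is in the membership if its bit is set."""
        self._check(slot)
        return bool((self.bits >> slot) & 1)

    def members(self) -> List[int]:
        """Return the slots whose transmitters are currently in the membership."""
        return [s for s in range(self.num_slots) if self.is_member(s)]


if __name__ == "__main__":
    mv = MembershipVector(4)  # e.g. one slot per wheel computer
    mv.set_available(0, True)
    mv.set_available(2, True)
    print(mv.members())
```

Because the assignment of bits to slots is static, every controller can interpret a received vector without additional addressing information.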
An essential security advantage of the time-triggered architecture, as it is implemented in TTP/C, is the lack of trigger signals in the communication between communication controller and node controller (processing unit). In this way it can be ensured that a fault cannot be propagated. Because trigger signals are strictly excluded from this communication, the known systems offer no possibility for other processing units, or for software applications running on them, to exert influence on the behavior of a defective processing unit or software application. This is especially relevant if the defective behavior of a processing unit endangers the security of the overall system. For example, in a “brake-by-wire” system, that is, in a system in which the brakes are controlled purely electronically, a defective software application could block one of the four wheels of an automobile, whereupon control over the behavior of the entire vehicle can become impossible in spite of intervention by the remaining wheel computers. However, a general transmission of control pulses is itself a source of danger for the security of the overall system. Dedicated cutoff lines, in turn, lead to an increased resource demand and to the introduction of additional error sources.
One such method has become known from the document “Dependable Systems and Networks (DSN 200)”, New York, IEEE Press, p. 5. In this method the transmission of the output signals is carried out via the bus, and the receiver nodes are then indirectly controlled, specifically by arrangement of independent measures of the receiver nodes so that a fault-tolerant direct transmission of trigger signals between the nodes is not possible. The document of Kopetz, H. et al., “Tolerating Arbitrary Node Failures in the Time-Triggered Architecture”, white paper, March 2001, describes the fault tolerance and the possibilities of fault processing in an architecture that uses the TTP/C protocol, including, for example, a resynchronization mechanism in the case of multiple faults. A direct fault-tolerant control of network nodes is likewise not described in this document.
The object of the present invention is to overcome the cited disadvantages of the prior art.
This objective is achieved using a method of the kind mentioned at the outset in that the coordination result is made available as an output signal to one or more hardware outputs of the communication controller and the at least one network node is controlled as a function of this output signal.
Owing to the invention, other network nodes can force a faulty application or processing unit into a specific behavior corresponding to the security design, and thus specific fault scenarios of the overall system can be triggered in a controlled manner. Trigger signals for direct triggering of the actuators can be transmitted in a time-triggered architecture while taking into account the application-specific security requirements, whereby, for example, a node identified as faulty can be switched off by other nodes. Through the coordination mechanism in the fault tolerance layer, a faulty network node can thus be prevented from having an effect on other nodes in the system. In this way it can be ensured that no individual failure leads to a failure of the overall system.
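The principle can be pictured with a small simulation in which the coordination result appears on a hardware output of the communication controller and the actuator is controlled as a function of that output. This is only a sketch under stated assumptions: the text does not specify the coordination rule, so a strict majority vote of the other nodes is assumed here, and all class names, the active-high cutoff output and the node labels are hypothetical.

```python
from typing import Dict


def coordinate(votes: Dict[str, bool]) -> bool:
    """Assumed coordination rule: strict majority of the other nodes' votes.

    votes maps each observing node to True if it judges the monitored
    node to be faulty.
    """
    faulty_votes = sum(1 for v in votes.values() if v)
    return faulty_votes * 2 > len(votes)


class CommunicationController:
    """Stand-in for a communication controller with one hardware output."""

    def __init__(self) -> None:
        self.cutoff_output = False  # assumed active-high: True = switch node off

    def apply_coordination_result(self, votes: Dict[str, bool]) -> None:
        # The coordination result is placed on the output directly, without
        # passing through the (possibly faulty) processing unit.
        self.cutoff_output = coordinate(votes)


class Actuator:
    """Actuator controlled as a function of the controller's output signal."""

    def __init__(self, controller: CommunicationController) -> None:
        self.controller = controller

    def command(self, requested: float) -> float:
        """Return the value actually applied; zero while cutoff is asserted."""
        return 0.0 if self.controller.cutoff_output else requested


if __name__ == "__main__":
    cc = CommunicationController()
    brake = Actuator(cc)
    print(brake.command(0.8))  # no cutoff yet, request passes through
    # Two of the three other wheel computers report this node as faulty.
    cc.apply_coordination_result({"FL": True, "FR": True, "RL": False})
    print(brake.command(0.8))  # cutoff asserted, actuator is released
```

In this sketch a single dissenting or faulty observer cannot switch a node off on its own, which reflects the aim that no individual failure leads to a failure of the overall system.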