1. Field
The present disclosure relates generally to aircraft, and in particular, to fault coverage for multiple failures in redundant systems in aircraft.
2. Background
Redundancy is implemented in many systems in an aircraft to provide a desired level of performance, as well as a desired level of safety. For example, an aircraft flight control system includes flight control surfaces, actuators, valves, servos, controllers, and other components that are utilized to control the flight of the aircraft.
An aircraft flight control system may employ triple redundancy in the data processing architecture. This triple redundancy is employed to perform control and fault detection functions in the aircraft flight control system. In such a system, three individual computing units may perform identical or near identical computations. A computing unit is also referred to as a “lane”. Often, these lanes are expected to generate identical or near identical outputs under normal conditions, and a selection is made from their computational outputs. In parallel, their outputs are typically compared for fault detection and isolation.
With a triple redundant system, “1-Fail Operative” describes the system behavior after a single failure and “2-Fail Safe” describes the system behavior after a dual failure. In this context, “1-Fail Operative” means that if one of the three redundant lanes in the system fails, then the system continues to operate and provides the necessary control signals using the two remaining lanes. Continued operation often follows detection and shutdown of the failed lane. This continued operation supports both high integrity, by reducing the possibility of an erroneous output, and high availability. As a result, the system is able to continue to operate following a single lane failure.
With a triple redundant system, if another lane subsequently fails, then the computing system no longer provides the necessary output to perform a desired function. In this situation, the system may be placed into “2-Fail Safe”, which is a “fail-safe” state in which control outputs from the system are no longer applied or used.
For example, with an aircraft flight control system, “1-Fail Operative” means that an actuator controlled by the system can continue to be controlled following a single lane failure. When the actuator is no longer controllable by the system with a desired level of performance, the system may be placed into a “2-Fail Safe” state in which the system is unable to control the actuator. In this state, “bypass mode” may be employed in which the actuator may be back driven by an air load or by the other actuators on the flight control surface, with low resistance.
Typically, the electronic device implemented in a lane is considered complex. For example, the components for the lane may include a microprocessor, a digital signal processor (DSP), a field programmable gate array (FPGA), or some combination thereof. As a result, the potential modes of undesired operation of these components, and the ways in which the components are expected to fail, may be more difficult to predict than desired.
Further, self-declaration of failure by a lane is not considered to have full fault coverage. Therefore, fault detection relies primarily on the comparison between the independent lanes. A first lane failure is relatively simple to detect and isolate, that is, to determine which lane has undesired operation. This detection may be accomplished using majority voting.
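As a minimal illustrative sketch of majority voting among three lanes, the following Python example isolates a lane whose output disagrees with two mutually agreeing lanes. The function name `isolate_faulty_lane`, the scalar lane outputs, and the tolerance value are assumptions made for illustration only and are not taken from the present disclosure.

```python
from typing import Optional, Sequence

# Hypothetical agreement threshold; an actual system would derive this
# from its own performance and monitoring requirements.
TOLERANCE = 0.05


def isolate_faulty_lane(outputs: Sequence[float],
                        tol: float = TOLERANCE) -> Optional[int]:
    """Return the index of the single lane that disagrees with the other
    two lanes, or None when no single lane can be isolated."""
    if len(outputs) != 3:
        raise ValueError("expected outputs from exactly three lanes")

    def agree(i: int, j: int) -> bool:
        return abs(outputs[i] - outputs[j]) <= tol

    for lane in range(3):
        a, b = [i for i in range(3) if i != lane]
        # The other two lanes agree with each other and both disagree
        # with this lane, so the majority isolates this lane.
        if agree(a, b) and not agree(lane, a) and not agree(lane, b):
            return lane
    return None
```

In this sketch, `isolate_faulty_lane([1.0, 1.01, 3.0])` identifies the third lane (index 2), while three agreeing outputs yield `None`.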
When undesired operation of a lane is detected, that lane can be shut down by the other two lanes when those lanes agree to the shutdown. The system may continue to operate with the remaining two lanes, thus achieving a “1-Fail Operative” system. A second lane failure may also be addressed in a similar way through comparison between the two remaining lanes.
If at least one of the two remaining lanes decides that the other lane's output differs significantly from its own, the whole system can be shut down or put in an inactive state, such that a “2-Fail Safe” system is achieved. In some cases, a “2-Fail Operative” system in a triple redundant system can be achieved for limited failure cases that result in correct self-declaration.
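The two-lane comparison described above may be sketched as follows. With only two lanes remaining, a disagreement cannot be attributed to either lane, so this sketch simply places the system in a fail-safe state. The names `SystemState` and `two_lane_state` and the tolerance value are hypothetical and are used only for illustration.

```python
from enum import Enum


class SystemState(Enum):
    ACTIVE = "active"
    FAIL_SAFE = "fail_safe"


def two_lane_state(output_a: float, output_b: float,
                   tol: float = 0.05) -> SystemState:
    """Compare the outputs of the two remaining lanes.

    The tolerance is an assumed value. A difference larger than the
    tolerance does not reveal which lane is at fault, so the whole
    system is placed in the fail-safe state.
    """
    if abs(output_a - output_b) > tol:
        return SystemState.FAIL_SAFE
    return SystemState.ACTIVE
```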
Fault coverage for the first lane failure is relatively simple because, at the time of the first lane failure, the other two lanes are healthy. The two healthy lanes can be relied on both to agree to shut down the failed lane and to keep that lane shut down, thus providing full fault coverage.
The situation becomes more complex when a second lane failure occurs in one of the remaining two lanes. For example, the second lane fails and votes to bring the previously failed first lane out of the shutdown state, such that the previously failed first lane again actively participates in the vote. As a result, the two failed lanes may take over control of the system. For example, the two failed lanes may vote to shut down the last remaining healthy lane.
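The hazard described above can be illustrated with a deliberately simplified voting model. In this sketch, a lane is shut down only while at least two other lanes vote against it, and votes from failed lanes are counted like any others. The model, the lane labels "A", "B", and "C", and the function name `active_lanes` are assumptions for illustration and do not represent any particular flight control system.

```python
from typing import Dict, Set


def active_lanes(shutdown_votes: Dict[str, Set[str]]) -> Set[str]:
    """Return the lanes that remain active.

    shutdown_votes maps each voting lane to the set of lanes it wants
    shut down. In this naive model a lane is shut down only when at
    least two other lanes vote against it.
    """
    active = set()
    for lane in shutdown_votes:
        against = sum(
            1
            for voter, targets in shutdown_votes.items()
            if voter != lane and lane in targets
        )
        if against < 2:
            active.add(lane)
    return active


# First failure: lane A fails, and healthy lanes B and C vote it out.
after_first = active_lanes({"A": set(), "B": {"A"}, "C": {"A"}})

# Second failure: lane B fails, withdraws its vote against A, and joins
# A in voting against the healthy lane C.
after_second = active_lanes({"A": {"C"}, "B": {"C"}, "C": {"A", "B"}})
```

Under these assumptions, `after_first` contains lanes B and C, while `after_second` contains lanes A and B, that is, the two failed lanes have shut down the last healthy lane.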
Therefore, it would be desirable to have a method and apparatus that take into account at least some of the issues discussed above, as well as other possible issues. For example, it would be desirable to have a method and apparatus that overcome a technical problem with managing a control system such that a second lane failure is managed to achieve a “fail safe” system that avoids undesired operation of the system.