In system engineering usage, a fault may be defined as any physical condition which causes an object to fail to perform in a required manner; a failure, in turn, is an inability of an object to perform its desired function. Failures are detected by evaluation of test results, i.e. the results of comparing a measurement (itself defined as a sample of a signal of interest) to predetermined operational limits. The primary objective, and challenge, of a diagnostic system is then to obtain simultaneously high levels of both the coverage and the accuracy of fault detection in the system being diagnosed. In fact, a fault detection (FD) effectiveness parameter can be defined as the product of fault detection coverage and fault detection accuracy; it is a measure of the diagnostic system's ability to detect all potential faults. It is desirable to simultaneously increase the probability of detecting faults in equipment when a fault exists, while reducing the probability of declaring a fault when one does not exist. Increased fault detection, i.e. the ability to detect more faults than a previous capability, can be the result of either increased fault coverage, e.g. the presence of more test points in the same equipment, or greater detection accuracy, e.g. the implementation of better tests or processing. Conversely, decreased fault detection leads to missing more real faults, and is almost never desirable.
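As a minimal sketch of the FD effectiveness parameter defined above, the following assumes that coverage and accuracy are each expressed as a fraction between 0 and 1; that representation, the function name `fd_effectiveness`, and the numeric values are illustrative assumptions, not figures from any particular system.

```python
def fd_effectiveness(coverage, accuracy):
    """Return FD effectiveness as the product of coverage and accuracy.

    Both inputs are assumed (for illustration) to be fractions in [0, 1].
    """
    if not (0.0 <= coverage <= 1.0 and 0.0 <= accuracy <= 1.0):
        raise ValueError("coverage and accuracy must lie in [0, 1]")
    return coverage * accuracy


# Hypothetical values: 95% of potential faults are covered by some test,
# and 90% of the covered faults are correctly detected.
effectiveness = fd_effectiveness(0.95, 0.90)
print(f"FD effectiveness: {effectiveness:.3f}")
```

Because the parameter is a product, a shortfall in either factor limits the result: raising coverage alone cannot compensate for poor detection accuracy, and vice versa.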
A false alarm is defined as a fault indication (by Built-In Test or other monitoring circuitry) where no fault exists. However, the user community extends the definition of false alarm to include activity which does not correct the causative fault; this may include actions such as isolating a fault to the wrong module, or the inability to reproduce the fault during maintenance. Both kinds of false alarm result in maintenance actions which do not correct the actual fault; the user's perception is that the causative fault does not exist. Similarly, detection of a temporary or transient real fault is also considered an error. Consider the fault isolation process as applied to an aircraft: if a real fault is detected while the plane is in flight, but cannot be duplicated during ground maintenance, then the maintenance staff considers that fault to be a false alarm. Such a condition is most often caused by intermittent behavior of the system in use, due to factors including overheating, part fatigue and corrosion, poor calibration, noise and the like. Since the plane is not stressed in the same manner while on the ground, these temporary real faults either disappear or cannot be duplicated; however, continued use of the unrepaired plane is not always a desirable alternative.
Due to the possibility of serious consequences if faults are not properly diagnosed, there have been many past attempts to provide diagnostic systems with ever increasing levels of fault detection effectiveness. Some systems have tried to increase effectiveness by changing test limits and, in some cases, by checking for repeated results. Changing test measurement limits generates mixed results in a system having measurement variations. Noise in either or both of the system-under-test (SUT) and the associated diagnostic system can cause proper measurements, taken in a correctly operating system, to lie in the region of assessed failures, while similar measurements of a failed system may lie in the region of correct operation.
If it is desired to increase fault detection by tightening test limits (e.g. allowing a smaller measurement variation from the mean value before existence of a fault is declared), then the test threshold must move toward the mean value of the measurement. However, since more noisy measurements then lie outside the limit of correct operation, the resulting diagnostic system will declare more false alarms. Conversely, decreasing false alarms by changing only a measurement limit requires allowing more measurement variation (e.g. moving the test limit farther from the measurement mean value) before a fault condition is declared. However, use of this fault threshold location decreases the diagnostic system's ability to detect a fault, since a noiseless measurement would have to deviate farther from its intended location for correct operation before a fault is detected. Accordingly, a new technique is required to simultaneously increase fault detection while reducing false alarms. This new technique is desirably compatible with new, multi-level, integrated diagnostic systems, and is also desirably capable of an increased probability of detecting faults while reducing, if not eliminating, false alarms and intermittent real faults. Thus, we desire to provide a new method for fault diagnosis which will substantially reduce or eliminate the effects of intermittent faults, noisy measurements, potential false alarms, and out-of-tolerance conditions, while providing flexibility for system changes and for implementation in a standardized architecture (capable of replication and similar usage in different system portions and the like). Finally, the high level of fault detection without false alarms must be achieved in a timely manner, in order to facilitate operator and system response to the failure. It is therefore not acceptable to perform extensive post-processing of a test result if such processing would require time beyond the system time constraints.
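The trade-off between tight and loose test limits described above can be illustrated numerically. The sketch below is not part of any described system; it assumes zero-mean Gaussian measurement noise, symmetric test limits about the nominal value, and a fault modeled as a fixed shift of the measurement mean. The names `prob_outside_limits` and `tradeoff`, and all numeric values, are hypothetical.

```python
import math


def normal_cdf(x, mean=0.0, sigma=1.0):
    """Cumulative distribution function of a Gaussian random variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def prob_outside_limits(mean, sigma, nominal, limit):
    """Probability that a noisy measurement falls outside nominal +/- limit."""
    inside = (normal_cdf(nominal + limit, mean, sigma)
              - normal_cdf(nominal - limit, mean, sigma))
    return 1.0 - inside


def tradeoff(limit, nominal=0.0, sigma=1.0, fault_shift=3.0):
    """Return (P(false alarm), P(missed fault)) for a given test limit.

    A healthy system is assumed to measure `nominal` plus noise; a faulty
    system is assumed to measure `nominal + fault_shift` plus the same noise.
    """
    false_alarm = prob_outside_limits(nominal, sigma, nominal, limit)
    missed = 1.0 - prob_outside_limits(nominal + fault_shift, sigma,
                                       nominal, limit)
    return false_alarm, missed


# Sweep the test limit from tight to loose (in units of the noise sigma).
for limit in (1.5, 2.0, 2.5, 3.0):
    fa, miss = tradeoff(limit)
    print(f"limit={limit:.1f}  P(false alarm)={fa:.4f}  "
          f"P(missed fault)={miss:.4f}")
```

Under these assumptions, moving the limit toward the mean raises the false alarm probability while lowering the missed-fault probability, and moving it away does the reverse; adjusting the limit alone cannot improve both at once.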