In many digital computer systems the detection and correct isolation, or "coverage," of failures in the computer is a matter of great concern. This is particularly true in avionics computer systems such as flight, engine, navigation, or weapon control systems, where redundant control systems exist and the correct isolation of a fault must be guaranteed with high probability regardless of the source of the failure. Upon detection of a fault, one of the redundant systems is immediately selected to "carry" the system. A variety of Built-In-Test (BIT) techniques have been developed to meet such requirements. Notable among these are the Watchdog Timer (WDT) function and processor self-tests.
The WDT function, also known as a "ticket punch" or "sanity monitor," is used to monitor correct software operation by requiring periodic updating or resetting of the WDT hardware within a legal time interval known as a window. The WDT function is a "non-specific" monitor: it can detect any failure that causes the program to diverge from its correct execution sequence and thereby miss the WDT update window. The particular implementation of a WDT function can sometimes erode its coverage capability. For example, if the WDT window is too large, so that the WDT can be updated more than once within the window, the coverage probability for, say, a program looping failure is reduced.
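The effect of window size on coverage can be illustrated with a small software model. The following is only a sketch under stated assumptions, not a description of any particular WDT hardware: the class name `WindowedWatchdog`, the tick-based time units, and the window bounds are all hypothetical. It models a WDT that latches a fault when an update arrives outside the legal window, and compares a tight window with a loose one against a program stuck in a short loop that still updates the WDT.

```python
from dataclasses import dataclass


@dataclass
class WindowedWatchdog:
    """Software model of a windowed WDT; times are in arbitrary ticks (assumed)."""
    window_open: int    # earliest legal update time, measured from the last update
    window_close: int   # latest legal update time, measured from the last update
    last_update: int = 0
    tripped: bool = False

    def update(self, now: int) -> None:
        """Software updates ("punches") the WDT at time `now`."""
        elapsed = now - self.last_update
        if elapsed < self.window_open or elapsed > self.window_close:
            self.tripped = True   # update too early or too late: latch the fault
        self.last_update = now

    def poll(self, now: int) -> None:
        """Timer-side check: trip if the window has closed with no update."""
        if now - self.last_update > self.window_close:
            self.tripped = True


def run(wdt: WindowedWatchdog, update_times: list) -> bool:
    """Apply a sequence of update times and report whether the WDT tripped."""
    for t in update_times:
        wdt.update(t)
    return wdt.tripped


healthy = [10, 20, 30]   # program updates the WDT once per 10-tick frame
looping = [3, 6, 9]      # program stuck in a short loop that still updates the WDT

tight_ok = run(WindowedWatchdog(9, 11), healthy)       # healthy program, tight window
tight_loop = run(WindowedWatchdog(9, 11), looping)     # looping program, tight window
loose_loop = run(WindowedWatchdog(0, 11), looping)     # looping program, loose window
```

In this model the tight window leaves the healthy program alone and catches the looping program on its first early update, whereas the loose window, which permits several updates per interval, lets the same looping failure pass undetected.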
The processor self-test, unlike the WDT, is a very specific test involving a collection of specific "must work" instructions for a given processor. The tests are executed using specific data as inputs and are designed to "exercise" the maximum number of individual gates in the processor. Clearly this is a formidable task even for the simplest microprocessors, because the number of possible machine states is extremely large and a very large proportion of these states must be tested to assure a high degree of coverage.
The coverage provided by processor self-tests is generally very difficult to predict and has been the subject of many studies. See, for example, an article by Thatte, S. M. and J. A. Abraham, "Test Generation for General Microprocessor Architectures," in IEEE Proc. of 1979 International Symposium on Fault-Tolerant Computing, Madison, Wisc., IEEE Computer Society, pp. 203-210, June, 1979. There, a graph-theoretic model for microprocessor architecture is presented which permits the treatment of the organization and instruction set as parameters of test generation procedures. Functional level fault models for the register decoding function, and the instruction decoding and control function are developed independent of the details of implementation. Test generation procedures are presented to detect faults in these functions. Their approach is potentially attractive in a user environment because it suggests the avoidance, to some extent, of the normally enormous amount of computation required to generate test sets for the very large number of gates, flip-flops, and interconnections in LSI circuits such as microprocessors.
In the past, when faced with this task, semiconductor and sometimes system manufacturers have resorted to exhaustive testing of each and every machine state and stuck-at fault condition. However, this approach is unsuitable for providing real time, on line, built-in-test (BIT) coverage of avionic computer systems because of the size of the test.
One of the most important drawbacks of these self-tests is that they lack an independent, external monitor for their execution and correct completion. In the absence of a monitor function such as a WDT, there is no assurance that the self-test was ever started or successfully completed. The monitoring hardware must be independent of the processor, because using the processor under test as its own monitor would defeat the purpose of the test.
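The role of an independent monitor can be sketched in software. The model below is purely illustrative and is not taken from the source: the names `SelfTestMonitor`, `self_test`, and `run_monitored`, the test vectors, the signature scheme, and the deadline are all assumptions. It shows a monitor that accepts a self-test only if the test was reported as started, finished within a deadline, and returned the expected signature.

```python
class SelfTestMonitor:
    """Model of an external monitor, independent of the processor under test."""

    def __init__(self, expected_signature: int, deadline: int):
        self.expected_signature = expected_signature
        self.deadline = deadline      # ticks allowed between start and completion
        self.started_at = None        # None until a start has been reported
        self.passed = False

    def report_start(self, now: int) -> None:
        self.started_at = now
        self.passed = False

    def report_done(self, now: int, signature: int) -> None:
        # Pass only if a start was seen, completion was on time,
        # and the reported signature matches the expected one.
        on_time = (self.started_at is not None
                   and now - self.started_at <= self.deadline)
        self.passed = on_time and signature == self.expected_signature


def self_test(alu_add) -> int:
    """Hypothetical self-test: fold results of fixed test vectors into a signature."""
    signature = 0
    for a, b in [(0x0000, 0xFFFF), (0x5555, 0xAAAA), (0x00FF, 0x0001)]:
        signature = ((signature << 1) ^ alu_add(a, b)) & 0xFFFF
    return signature


def good_add(a: int, b: int) -> int:
    return (a + b) & 0xFFFF           # fault-free 16-bit adder


def faulty_add(a: int, b: int) -> int:
    return (a + b) & 0xFEFF           # result bit 8 stuck at 0


def run_monitored(alu_add, start: bool = True, finish_at: int = 5) -> bool:
    """Run the self-test under the monitor and return the monitor's verdict."""
    monitor = SelfTestMonitor(expected_signature=self_test(good_add), deadline=10)
    if start:
        monitor.report_start(0)
    monitor.report_done(finish_at, self_test(alu_add))
    return monitor.passed
```

Under these assumptions, `run_monitored(good_add)` is accepted, while a faulty adder, a test that never reported its start, or a test that finished after the deadline is rejected. In a real system the monitor would be separate hardware; a software model running on the processor under test would have exactly the weakness described above.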