The present invention relates to an improvement to a system for testing a data processing unit which is made up of several functional members.
The large number of components used in a computer poses a complex problem when it comes to detecting and locating faults which may occur in the computer. Two types of possible faults are generally distinguished: "permanent" faults, which may be detected at any time subsequent to their appearance, and transient, so-called "intermittent" faults, which appear and disappear at least once during the time when the member or component from which the fault originates is operating. Fault detection is generally accomplished by means of functional error detectors which are located at the outputs of the various functional members of a computer, and in particular of a data processing unit. Since the location of these detectors is known, the faulty functional member is located by means of information transmitted to an associated test system, which may include a maintenance panel fitted with a display screen. By examining the actual behaviour of the faulty member more closely, by using a test routine for example, it is then possible to locate the group or groups of components which are at the source of the error which has been detected.
Among the intermittent faults which may occur in a data processing unit, the main ones are those due to noise or any other transitory phenomenon such as mechanical vibration. Of the functional error detectors, the best known are parity checking devices, which supply a bit-combination having a predetermined value when no error exists. After the parity of the data introduced into a functional member and the parity of the data extracted from this same member have been checked, a comparison between the two parities makes it possible to establish whether an error has occurred in the member in question. In order to make use of the error detection carried out by means of the parities, various processes may then be used to establish whether the fault is a permanent one and to locate the fault in a functional member or in a component within the member. Certain known processes also allow some intermittent faults to be detected at the functional member level. It is, for example, possible to cause a processing operation to be repeated in a functional member which has been found to be defective by feeding back the same data into the member in question, so as to see whether the error will recur. If the error is detected for a second time, the fault is assumed to be permanent and a routine is put in hand. On the other hand, if the error is no longer apparent after the repetition, data continues to be processed in the unit in question. It will be realised that this procedure constitutes only a means of detecting certain intermittent faults, but not one of locating them at the level of the component responsible.
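By way of illustration only, the parity comparison and the repetition procedure described above may be sketched as follows. The sketch assumes a functional member that is expected to preserve the parity of its data (a transfer path, for example); the names `parity`, `classify_fault` and `make_glitchy` are hypothetical and do not come from any known system.

```python
from typing import Callable, Set


def parity(word: int) -> int:
    """Return 1 if the number of set bits in `word` is odd, else 0."""
    return bin(word).count("1") % 2


def classify_fault(member: Callable[[int], int], word: int) -> str:
    """Feed `word` to `member`, compare input and output parity,
    and repeat the operation once with the same data if they differ."""

    def error_detected() -> bool:
        # Assumption: a healthy member leaves the parity unchanged.
        return parity(member(word)) != parity(word)

    if not error_detected():
        return "no error"
    # Repeat the same operation with the same data.
    if error_detected():
        return "assumed permanent"
    # The error did not recur: processing simply continues,
    # and the fault has been detected but not located.
    return "intermittent (not located)"


def make_glitchy(glitch_calls: Set[int]) -> Callable[[int], int]:
    """Build a hypothetical member that flips one bit on the given call numbers."""
    calls = {"n": 0}

    def member(word: int) -> int:
        calls["n"] += 1
        return word ^ 1 if calls["n"] in glitch_calls else word

    return member
```

A member that disturbs the data on the first pass only is reported as intermittent, which shows the limitation noted above: the repetition reveals that a fault came and went, but gives no indication of the component responsible.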
Furthermore, the time spent in repeating certain processing operations may be considerable in the case of intermittent faults which recur with great frequency, and the time taken in processing may be considerably prolonged by numerous repetitions before an intermittent fault can be considered sufficiently serious for it to be necessary to stop the unit and call in a trouble-shooter. To overcome these drawbacks, certain processes consist in establishing whether a functional member is or is not being used during a given processing cycle. If an error is detected in a member in active use, an error routine is carried out in the conventional way. If a member is not being used, a test word is fed into it to see whether it is faulty. If it does show a fault, a waiting cycle is initiated so that test data can again be fed to the said member and the continued presence of the fault can be checked for in this way. If the fault is still present, an error routine is carried out. If repetition fails to show up an error, the waiting cycle is cancelled and processing continues. In this way, since the test is carried out when the member is not being used, any error is detected before data is processed by the said member, that is to say before the data is mutilated. This process offers the advantages of advance detection, a particular feature of which is that it allows circuits to be re-structured before any processing takes place in the various members. This prevents the processing unit in question from being used despite the fact that it is not working properly. The drawback of this process is that it does not allow certain troublesome intermittent faults to be detected in time. In effect, when a member is checked before it is put into operation, it often happens that an intermittent fault will only make its appearance after the test cycle for the said member has ended, that is to say while processing is under way.
In the latter event, a process of this type has the same shortcomings as those mentioned before.
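The decision flow of the idle-member process described above may be sketched as follows. This is a minimal illustration under stated assumptions: `check_member` and `run_test_word` are hypothetical names, and `run_test_word` is assumed to return True when the test word passes through the member without error.

```python
from typing import Callable


def check_member(in_use: bool,
                 error_in_use: bool,
                 run_test_word: Callable[[], bool]) -> str:
    """Decide what to do with one functional member during a processing cycle."""
    if in_use:
        # Member in active use: conventional handling of a detected error.
        return "error routine" if error_in_use else "continue processing"

    # Member idle: feed a test word into it.
    if run_test_word():
        return "continue processing"

    # The test word showed a fault: waiting cycle, then feed test data again.
    if run_test_word():
        # The fault did not persist: the waiting cycle is cancelled.
        return "continue processing"
    return "error routine"
```

The sketch also makes the stated drawback visible: once `check_member` has returned "continue processing" for an idle member, nothing in this flow observes a fault that first appears while the member is subsequently in use.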
To prevent intermittent faults from causing excessive disruption in a processing operation which is to be carried out, it is important that they should be capable of detection at any time, that is to say, not only before processing is carried out but also while it is going on, and they should also be capable of being located very quickly.
An integrated test system such as that described in U.S. Patent application Ser. No. 450,936 entitled "Testing System for testing a Data Processing Unit," which was filed by the present Applicants on Mar. 13, 1974, makes possible a very swift reaction when a fault occurs while data is being processed and allows the unit in question to be tested both when it is put into operation and also at any other time when an error is detected in the unit. With such a system, the instruction for the test to be carried out may come either from another part of the computer in which the unit to be tested is included (such as from another processing unit, for example) or from a maintenance station. With tests automatically controlled and carried out quickly, it is possible to detect many faults which persist for a time which, in comparison with the time needed to carry out a test, is sufficiently long to enable the fault to be exactly located. Where intermittent faults are concerned, it becomes more difficult to detect them and especially to locate them. In effect, an intermittent fault may occur with low frequency, in which case there is no reason why it should prevent the processing unit from remaining available to a user as long as the fault in question fails to make an appearance. On the other hand, an intermittent fault may occur with great frequency, in which case it is unthinkable to continue with a processing operation which gives false results. Finally, an intermittent fault may make a very brief appearance which, fleeting though it is, may disrupt the processing operation which is under way and falsify its result. In this latter case it is also necessary to be able to identify the fault. It might be thought that an intermittent fault is detectable and can be located when the time for which it is present is at least equal to the duration of the test sequence which needs to be carried out to test the member in question. However, this condition is not sufficient.
In fact the time at which the fault appears during the test sequence which is carried out to detect and locate it is important also. If, for example, an intermittent fault appears only at the end of a test sequence, even if it is detected, the time for which it manifests itself is too short within the said sequence to provide enough symptoms to locate the fault. Similarly, if an intermittent fault appears after a test sequence has begun, all the symptoms necessary to identify the fault may still not be gathered when the sequence is continued to its conclusion.
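The point may be illustrated numerically. The following sketch, which assumes arbitrary time units and a hypothetical function name `observed_fraction`, computes the fraction of a test sequence during which a fault is actually present.

```python
def observed_fraction(fault_start: float, fault_end: float,
                      test_start: float, test_end: float) -> float:
    """Fraction of the test sequence [test_start, test_end] during which
    the fault, present over [fault_start, fault_end], can be observed."""
    overlap = max(0.0, min(fault_end, test_end) - max(fault_start, test_start))
    return overlap / (test_end - test_start)
```

For a test sequence running from 0 to 10 and a fault present from 8 to 20, the fault lasts 12 units, longer than the sequence, yet `observed_fraction(8, 20, 0, 10)` gives 0.2: only the last fifth of the sequence is exercised while the fault is present, which may not supply enough symptoms to locate it.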
One of the objects of the present invention is to enable the testing of at least one functional member of a data processing unit to be initiated immediately after an error is detected when the said unit is processing.
Another object of the invention is to enable tests to be carried out in a data processing unit until a fault makes an appearance.
Another object of the invention is to enable tests performed on at least one functional member of a data processing unit to be repeated.
Another object of the invention is to enable the testing of the various functional members of a data processing unit through respective test sequences each having a time duration which is short in comparison with that of the foreseeable intermittent faults liable to disrupt data processing operations.
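One way to see why a short test sequence is desirable is to count how many complete sequences fall entirely within the time for which a fault is present. The sketch below assumes, purely for illustration, that identical test sequences are repeated back to back starting at time 0; the function name `complete_sequences` and the time units are hypothetical.

```python
import math


def complete_sequences(fault_start: float, fault_duration: float,
                       seq_duration: float) -> int:
    """Number of back-to-back test sequences of length `seq_duration`
    (starting at 0, seq_duration, 2 * seq_duration, ...) that lie wholly
    inside the interval during which the fault is present."""
    first = math.ceil(fault_start / seq_duration)
    last = math.floor((fault_start + fault_duration) / seq_duration)
    return max(0, last - first)
```

With a fault present from time 1 for 12 units, sequences of 10 units give `complete_sequences(1, 12, 10) == 0`, whereas sequences of 2 units give `complete_sequences(1, 12, 2) == 5`. Under these assumptions, a fault lasting several sequence durations always spans at least one complete sequence, whatever the instant at which it appears.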