The present invention relates to the field of data processing systems and more particularly to error detection and error location apparatus within data processing systems.
In large data processing systems, locating the circuits that cause errors is a difficult task. One difficulty is that the data changes each cycle of the machine. Once an error is made, the error tends to propagate to different locations throughout the machine. In each subsequent cycle after the error-causing cycle, the original error frequently causes many more errors. This propagation and proliferation of errors tends to mask the data location which originally caused the error.
One error checking and locating mechanism is described in U.S. Pat. No. 4,132,243, entitled "Data Processing System and Information Scanout Employing Checksums for Error Detection" and assigned to the same assignee as the present invention.
In that patent, the data processing system includes an instruction-controlled principal apparatus and secondary apparatus for independently addressing and accessing points within the principal apparatus. A checksum generator generates an actual checksum dependent upon the data values of selected points accessed within the principal apparatus. The particular set of points accessed is controlled by the secondary apparatus. The secondary apparatus stores an expected checksum for comparison with the actual checksum. If a comparison indicates that the actual checksum differs from the expected checksum, a fault is indicated within the set of points used in forming the checksum.
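The comparison described above can be illustrated with a minimal sketch. It is not the circuitry of the referenced patent: the XOR fold used as the checksum function, the dictionary model of the principal apparatus, and the names `actual_checksum`, `fault_indicated`, and `latch_a` through `latch_c` are all assumptions made for illustration only.

```python
from functools import reduce
from operator import xor


def actual_checksum(state, point_set):
    """Fold the data values at the selected points into one checksum.

    XOR is an assumed checksum function; the source text does not say
    which function the checksum generator uses.
    """
    return reduce(xor, (state[p] for p in point_set), 0)


def fault_indicated(state, point_set, expected_checksum):
    """Return True when the actual checksum differs from the expected one."""
    return actual_checksum(state, point_set) != expected_checksum


# Hypothetical scanned points and values (not taken from the source).
state = {"latch_a": 0b1010, "latch_b": 0b0110, "latch_c": 0b0001}
point_set = ["latch_a", "latch_b"]   # set chosen by the secondary apparatus
expected = 0b1010 ^ 0b0110           # stored expected checksum for that set

no_fault = fault_indicated(state, point_set, expected)

state["latch_b"] ^= 0b0001           # simulate a single-bit error
fault = fault_indicated(state, point_set, expected)
```

In this sketch the first comparison indicates no fault and the second, after the simulated bit error, indicates a fault somewhere within the accessed set of points.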
Once a fault has been detected through comparisons of actual and expected checksums, it is possible to further analyze the set of points which entered into the checksum to determine what subset of points is the source of the fault. The set of points or the subset of points accessed to form a checksum is controlled by the secondary apparatus.
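One way such a subset analysis might proceed is by repeatedly halving the set of points and rechecking each half. The sketch below is illustrative only and assumes more than the source states: an XOR checksum, a hypothetical `reference` dictionary standing in for the stored expected checksums of every subset, and the hypothetical names `checksum` and `locate_faults`.

```python
from functools import reduce
from operator import xor


def checksum(state, points):
    """XOR-fold the values at the given points (assumed checksum function)."""
    return reduce(xor, (state[p] for p in points), 0)


def locate_faults(state, reference, points):
    """Narrow a checksum mismatch down to individual points by bisection.

    `reference` is a known-good copy of the state, used here in place of
    stored expected checksums for each subset (an assumption of this sketch).
    """
    if checksum(state, points) == checksum(reference, points):
        return []                      # this subset shows no fault
    if len(points) == 1:
        return list(points)            # mismatch isolated to one point
    mid = len(points) // 2
    return (locate_faults(state, reference, points[:mid])
            + locate_faults(state, reference, points[mid:]))


# Hypothetical points p0..p7 with arbitrary error-free values.
reference = {f"p{i}": i for i in range(8)}
state = dict(reference)
state["p5"] ^= 1                       # simulate a single-bit error at p5

suspects = locate_faults(state, reference, list(reference))
```

Under these assumptions `suspects` contains only `p5`. The sketch also shows why many expected values are needed, since every subset examined requires its own expected checksum. With an XOR checksum, two errors that cancel within the same subset go undetected at that level.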
While the checksum mechanism of U.S. Pat. No. 4,132,243 has proved very useful, it still has the problem that it requires storage of a large number of expected checksums to reflect the many error-free states of the computer. Furthermore, improvements and changes to the circuitry and operation of the system mandate that the expected checksums change. Accordingly, keeping track of the expected checksums is an undesirable burden.
Recent data processing systems have included diagnostic scanout capabilities which help locate errors in data processing systems. One such scanout system is described in U.S. Pat. No. 4,244,019 entitled "Data Processing System Including A Program-Executing Primary System" assigned to the same assignee as the present invention.
U.S. Pat. No. 4,244,019 provides a mechanism for scanout of all designed locations within a data processing system, independently of the normal data paths of that system. This scanout ability is of significant value in locating errors, and each location which has an error can be examined independently. However, the ability to examine thousands of locations within a data processing system does not assist in a quick location of the errors without further information as to which locations may be the cause of the errors. Although the above error checking and locating techniques have proved useful, there is a need for still improved error checking and locating techniques within data processing systems.