1. Technical Field
The present invention relates generally to handling computer system errors, and in particular to handling the scan data which is created after a system error. Still more particularly, the present invention provides an algorithm, method and apparatus for handling elements of the scan data, such as scan rings or trace arrays, whose size exceeds a given maximum size.
2. Description of Related Art
The ability to recover from computer system errors and to detect failing components is crucial to continued operation of the system. Diagnostic codes produced by the operating system can indicate the general area of a problem, but are not always capable of clarifying the exact nature of the problem. While real-time monitoring of internal computer processes is not possible, a “snapshot” of system data can provide critical insights into the process. Therefore, when system errors happen, selected chip data is saved to a portion of memory that is persistent, i.e., retains the data when power to the chip fails. This data can include register contents and critical storage areas, such as scan rings and trace array data created by low level system programs, all of which is saved for analysis. The process of saving this data is called a scan dump, and the data is called scan data.
When a system error is recognized in a computer system, a scan dump routine is invoked. This scan dump routine will create a list of elements to be saved, then proceed through the list. For each element to be saved, a write dump routine is invoked to write the element to non-volatile storage. A header that provides information about the element is also written. Later, when the system has been rebooted, the operating system will retrieve the data so that it can be analyzed.
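The flow described above can be illustrated with a minimal sketch. The function names `scan_dump` and `write_dump`, the dictionary of elements, and the list standing in for non-volatile storage are all hypothetical and chosen only for illustration; the actual routines are not specified here.

```python
def write_dump(name, data, storage):
    """Write one element, preceded by a header describing it.

    `storage` is a plain list standing in for non-volatile storage
    (an assumption made for this sketch).
    """
    header = {"name": name, "size": len(data)}  # hypothetical header contents
    storage.append((header, data))


def scan_dump(elements, storage):
    """Build the list of elements to save, then save each one in turn."""
    dump_list = list(elements.items())  # the list of elements to be saved
    for name, data in dump_list:
        write_dump(name, data, storage)


# Example: two small elements are saved after an error is recognized.
storage = []
scan_dump({"ring_a": b"\x01\x02", "trace_b": b"\x03"}, storage)
```

After a reboot, the saved headers and data would be read back from storage for analysis.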
The header which is produced by the write dump routine is 16 bytes long, with a two byte field giving the size of the scan data element. This limits the maximum size of the element which can be handled to only 64 kB. Several of the elements in the dump, specifically some of the rings which are created by the system, have grown beyond the maximum allowable size, requiring some modification to the program(s) handling this data.
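The size limit can be seen in a short sketch. Only the 16-byte header length and the two-byte size field come from the description above; the rest of the layout (a 12-byte name, big-endian byte order, two padding bytes) and the name `pack_header` are assumptions made for illustration.

```python
import struct

# Hypothetical layout: 12-byte name, 2-byte unsigned size, 2 pad bytes = 16 bytes.
HEADER_FORMAT = ">12sH2x"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

# Largest value an unsigned two-byte field can hold.
MAX_ELEMENT_SIZE = 2**16 - 1


def pack_header(name: bytes, size: int) -> bytes:
    """Pack a header; raises struct.error if size does not fit in two bytes."""
    return struct.pack(HEADER_FORMAT, name, size)


def fits_in_header(size: int) -> bool:
    """Report whether an element of this size can be described by the header."""
    return 0 <= size <= MAX_ELEMENT_SIZE
```

Under these assumptions, an element whose size needs more than two bytes to express cannot be recorded in the header, which is the situation that the oversized rings create.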
While it is possible to allocate more than two bytes to give the size of a scan data element, enlarging this field would necessitate rewriting portions of numerous programs in different functional areas of the operating system. Moreover, unless the size field is enlarged more than currently necessary, the need for further modification to the programs could be triggered by future increases in size of the elements. Thus, it would be desirable to provide a method of handling these large elements such that future programming changes will not be needed.