The present invention relates to computer diagnostics. More particularly, the present invention relates to a computer system and method for diagnosing and isolating faults in multiple-card computer systems.
Typical computer systems use diagnostics to verify proper operation and addressing of internal memory devices. Conventional diagnostics use well-known techniques in which the processor writes data to and reads data from memory. In the most basic implementation, the processor simply writes a pattern of data to a particular memory location and then reads the data back from that location, verifying its value. This procedure is then repeated for different patterns and different memory locations. More complex techniques also check for interference between memory locations. For example, one technique writes high bits to an entire memory device, writes low bits to a single location within that memory device, and then reads all locations of the memory device to verify that the write to the single location has not corrupted other memory locations. Other techniques, some including statistical analyses, are well known in the art.
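By way of illustration only, the two techniques described above may be sketched as follows. The sketch models a memory device as a simple byte-addressable Python object; the function names `pattern_test` and `disturb_test`, the choice of patterns, and the `AliasedMemory` fault model (an address line stuck at zero) are assumptions introduced for this example and are not taken from any particular prior-art diagnostic.

```python
def pattern_test(mem, patterns=(0x00, 0xFF, 0xAA, 0x55)):
    """Basic test: write each pattern to each location and read it back.

    Returns the list of addresses at which a read-back mismatch occurred.
    """
    failures = []
    for addr in range(len(mem)):
        for pattern in patterns:
            mem[addr] = pattern
            if mem[addr] != pattern:
                failures.append(addr)
                break  # one mismatch is enough to flag this address
    return failures


def disturb_test(mem, target):
    """Write high bits everywhere, low bits to one location, then read all.

    Returns the addresses (other than a correctly written target) whose
    contents differ from what was expected.
    """
    for addr in range(len(mem)):
        mem[addr] = 0xFF          # high bits to the entire device
    mem[target] = 0x00            # low bits to a single location
    corrupted = []
    for addr in range(len(mem)):
        expected = 0x00 if addr == target else 0xFF
        if mem[addr] != expected:
            corrupted.append(addr)
    return corrupted


class AliasedMemory:
    """Hypothetical faulty device: address bit 2 is stuck at zero,
    so addresses 4..7 silently map onto cells 0..3."""

    def __init__(self, size):
        self.size = size
        self.cells = bytearray(size)

    def __len__(self):
        return self.size

    def _decode(self, addr):
        return addr & ~0b100      # the stuck address line

    def __getitem__(self, addr):
        return self.cells[self._decode(addr)]

    def __setitem__(self, addr, value):
        self.cells[self._decode(addr)] = value


if __name__ == "__main__":
    healthy = bytearray(8)
    faulty = AliasedMemory(8)
    print(pattern_test(healthy), disturb_test(healthy, target=1))
    print(pattern_test(faulty), disturb_test(faulty, target=1))
```

In this sketch the basic write-and-read-back test passes on the simulated faulty device, because each write and its read-back are decoded to the same cell, whereas the second technique reports a location that was altered by a write addressed elsewhere.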
These techniques also apply to shared-bus computer systems in which a processor is connected to external cards or devices (e.g., direct memory access (DMA) agents, storage devices, input/output (I/O) ports, communication devices, and other types of separate processors and external cards) via an external bus. In such a system, the processor writes data to a memory location on an external card via the external bus and then reads the data back from that location, verifying its value.
This technique has a shortcoming, however. Although it can detect a fault, it cannot isolate the failing component. For example, if a processor fails to access a memory location on an external card, the processor card, the card containing the memory location, or the external bus itself may have failed. To repair the system, a technician must analyze multiple components to find the faulty device. Accordingly, diagnostics capable of fault isolation are highly desirable to facilitate repairs and card replacements, especially in complex environments with numerous external cards. Examples of such environments include conventional and cellular telephony switching systems.