1. Field of the Invention
The present invention relates generally to computer system hardware, and more particularly to a method and system for tracing system operations of a computer system.
The present invention provides a non-obtrusive activity monitor for advantageously monitoring disjunct, concurrent computer operations in a heavily queued computer system. For each active computer operation, the activity monitor uses a hardware implementation of an event-triggered operation graph-monitoring device to trace the path of the computer operation through the computer system. For each operation, a unique signature is generated that records the actual path of the operation and significantly reduces the amount of trace data to be stored. In a preferred embodiment, the trace information is stored together with a time stamp for debugging and measuring queuing effects and timing behavior in the computer system.
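The per-operation signature idea described above can be illustrated with a minimal software sketch. This is only a model under stated assumptions, not the hardware implementation of the invention: the functional unit names, their numeric IDs, the 4-bit packing, and the class names `OperationTrace` and `ActivityMonitor` are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical functional units with small numeric IDs (not taken from the source).
UNIT_IDS = {"fetch": 1, "queue": 2, "cache": 3, "memory": 4}
BITS_PER_UNIT = 4  # assumption: at most 15 distinct units, ID 0 is unused


@dataclass
class OperationTrace:
    """Trace state kept for one active operation."""
    op_id: int
    start_cycle: int
    last_cycle: int
    signature: int = 0  # packed sequence of the unit IDs visited so far

    def record(self, unit: str, cycle: int) -> None:
        # Append the unit ID to the signature instead of storing a full trace entry.
        self.signature = (self.signature << BITS_PER_UNIT) | UNIT_IDS[unit]
        self.last_cycle = cycle


class ActivityMonitor:
    """Follows several concurrent operations, one signature per operation."""

    def __init__(self) -> None:
        self.active = {}  # op_id -> OperationTrace

    def event(self, op_id: int, unit: str, cycle: int) -> None:
        # An event reports that operation op_id reached a unit at a given cycle.
        trace = self.active.get(op_id)
        if trace is None:
            trace = OperationTrace(op_id, start_cycle=cycle, last_cycle=cycle)
            self.active[op_id] = trace
        trace.record(unit, cycle)

    def finish(self, op_id: int) -> OperationTrace:
        # Only the compact signature and the time stamps remain to be stored.
        return self.active.pop(op_id)


def decode_signature(signature: int) -> list:
    """Recover the path of unit names from a packed signature."""
    names = {unit_id: name for name, unit_id in UNIT_IDS.items()}
    mask = (1 << BITS_PER_UNIT) - 1
    path = []
    while signature:
        path.append(names[signature & mask])
        signature >>= BITS_PER_UNIT
    return list(reversed(path))
```

In this sketch, two interleaved operations each accumulate their own signature, so the stored record per operation is a single integer plus two time stamps rather than one trace entry per unit visited. The difference between `last_cycle` and `start_cycle` gives a rough view of the time the operation spent in the system, including queuing.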
2. Discussion of the Prior Art
During the course of operating a computer system, many types of computer operations can occur, for example, the pressing of an enter key on a keyboard, the loading of data from a disk into memory, or the occurrence of severe errors or other trace events. These may be system-internal events or events introduced into the system from outside, for example by a user of the system. Such events form part of the operations of the computer system, which encompass a broad variety of activities, such as a data transfer from disk to RAM or status requests referring to any devices situated in the computer system. During such an operation, either user data or control data initiated and evaluated by the system pass through one or more so-called “functional units”, which for the purposes of the present invention can be regarded, in abstract generality, as elements of the computer system.
With the high integration of computer systems, it has become necessary to integrate hardware debugging functions into the system. During the hardware development phase, debugging is typically performed by simulation. However, after the hardware is available, error analysis is only possible with trace data that is generated by the hardware itself.
The state of the art in tracing generally comprises units that trace a fixed selection of possible inputs, which is a ‘location centric approach’. For example, in U.S. Pat. No. 5,355,484 the tools and methods used are tracing of off-chip interfaces, tracing of interfaces between functional units, and tracing by copying commands and data to arrays.
This location centric approach is, however, limited in its practical value for three reasons. First, the sequence of operations can be analyzed only over a short history because of trace array limitations. Second, the view of a trace is limited to a specific functional unit, i.e., a so-called “isolated view” that restricts the overall analysis. Third, tracing is totally independent of error checking.
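The short-history and isolated-view limitations can be made concrete with a small model of a per-unit trace array. This is a hedged sketch only: the class name `UnitTraceArray`, the depth of four entries, and the command strings are assumptions made for illustration and do not describe any particular prior art design.

```python
from collections import deque

TRACE_DEPTH = 4  # assumption: illustrative trace array depth


class UnitTraceArray:
    """Fixed-depth trace array attached to a single functional unit."""

    def __init__(self, unit_name: str, depth: int = TRACE_DEPTH) -> None:
        self.unit_name = unit_name
        # A bounded deque drops the oldest entry once the array is full.
        self.entries = deque(maxlen=depth)

    def capture(self, cycle: int, command: str) -> None:
        # Copies whatever command passes this unit, regardless of the
        # operation it belongs to or where that operation goes next.
        self.entries.append((cycle, command))

    def dump(self) -> list:
        return list(self.entries)
```

Once more commands have passed the unit than the array can hold, the earliest entries are gone, and nothing in the surviving entries links a command to the path its operation took through other units.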
Today's systems, however, involve many queues and buffers in the data flow to allow multiple operations to be active at a time. The control logic of this data flow is therefore much more complex, which makes debugging the behavior of such a system very difficult. For example, one concrete chip design can have up to 24 operations outstanding at the same time.
To be able to monitor and analyze the behavior of such a computer system, location centric monitoring as described above is no longer sufficient, because the complexity introduced by running a plurality of operations concurrently prevents the user analyzing the trace from gaining an overall analytical view of the system.