Transient event recorders are a broad class of systems that record, and later allow analysis of, signals or events that precede an error or failure condition in logic, electronic, and electro-mechanical systems. Analog transient recorders have existed for years in the form of storage oscilloscopes and strip chart recorders. With the advent of low-cost, high-speed digital systems and the availability of high-speed memory, it became possible to record digitized analog signals or digital signals in a non-volatile digital memory. Two problems that have always existed in these transient event recording systems are the speed of data acquisition and the quality of the connection to the signals being recorded. Transient event recording systems had to have circuits and recording means that were faster than the signals to be recorded, and the signal interconnection could not cause distortion of, or significant interference with, the desired signals.
Digital transient event recording systems have been particularly useful in storing and displaying multiple signal channels where only timing or state information was important, and many such transient event recording systems exist commercially. With the advent of very large scale integrated (VLSI) circuits operating at high speeds, it has become very difficult to employ transient event recording techniques using external instrumentation. The signals to be recorded or stored could not be contacted with an external connection without a degradation in performance. To overcome this problem, trace arrays have been integrated on the VLSI chip, along with the functional circuits, to facilitate the recording of signals relevant to failures as they occur. Another problem that occurs when trying to use transient event recording techniques for VLSI circuits is that the trigger event, which actually began the process leading to a particular failure, sometimes occurs many cycles ahead of the observable failure event itself.
For hardware debugging of a logic unit in a VLSI microprocessor, a suitable set of control and/or data signals may be selected from the logic unit and put on a bus called the unit debug bus. The contents of this bus at successive cycles may be saved in a trace array. Since the size of the trace array is usually small, it can save only a few cycles' worth of data from the debug bus. Events are defined to indicate when to start and when to stop storing information in the trace array. For example, an event trigger signal may be defined when the content of the debug bus matches a predetermined bit string “A”. Bit string “A” may, for instance, indicate that a cache write to a given address took place, and this may be used to start tracing (storing data in the trace array). Another bit string, “B”, may be used to stop storing in the trace array when the content of the debug bus matches it.
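The start and stop events described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not a description of any particular hardware: the function name `capture_trace`, its parameters, and the choice to stop when the array is full (rather than wrap around) are hypothetical, and the model assumes that the start and stop patterns differ and that the matching cycles themselves are stored.

```python
from typing import Iterable, List


def capture_trace(debug_bus: Iterable[int], start_pattern: int,
                  stop_pattern: int, depth: int) -> List[int]:
    """Model a trace array gated by a start event and a stop event.

    debug_bus yields one bus value per cycle. Storing begins on the first
    cycle whose value equals start_pattern (bit string "A") and ends on the
    first later cycle whose value equals stop_pattern (bit string "B"), or
    when `depth` entries have been stored (assumption: no wrap-around).
    """
    trace: List[int] = []
    tracing = False
    for value in debug_bus:
        if not tracing:
            if value != start_pattern:
                continue  # still waiting for the start event
            tracing = True
        trace.append(value)  # save this cycle's bus content
        if value == stop_pattern or len(trace) >= depth:
            break  # stop event seen, or the small array is full
    return trace


# Hypothetical bus contents over six cycles.
bus = [0x1, 0xA, 0x3, 0x4, 0xB, 0x6]
print(capture_trace(bus, start_pattern=0xA, stop_pattern=0xB, depth=8))
# prints [10, 3, 4, 11]
```

With a depth of 2 the same call would return only the first two stored cycles, which reflects the limited capacity of a small trace array.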
In some cases, the fault in the VLSI chip manifests itself during the last few occurrences of an event (for example, the cache gets corrupted on one of the last times that a cache write takes place to a given address location). It may not be known exactly which of these last few occurrences of the event manifested the actual error, but it may be known (or suspected) that the error was due to one of them. Sometimes there is no convenient start and stop event for storing in the trace array. Because of this, it is difficult to capture the trace that shows the desired control and data signals for the cycles immediately before the last few occurrences of the event. This may be especially true if system or VLSI behavior changes from one program run to the next.
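The difficulty can be made concrete with a small model of a trace that starts on the first match of an event. This is a sketch under assumptions only: the helper `cycles_captured`, the bus contents, and the array depth of 4 are hypothetical, and the array is assumed to fill once without wrapping.

```python
from typing import List, Sequence


def cycles_captured(debug_bus: Sequence[int], start_pattern: int,
                    depth: int) -> List[int]:
    """Return the cycle indices stored when tracing starts at the first match."""
    for cycle, value in enumerate(debug_bus):
        if value == start_pattern:
            # Assumption: the array fills once and does not wrap.
            return list(range(cycle, min(cycle + depth, len(debug_bus))))
    return []


# Hypothetical run: the event (0xA) occurs at cycles 2, 9, and 15.
bus = [0] * 18
for cycle in (2, 9, 15):
    bus[cycle] = 0xA

captured = cycles_captured(bus, start_pattern=0xA, depth=4)

# Cycles of interest: the four cycles just before the last occurrence.
last_occurrence = max(c for c, v in enumerate(bus) if v == 0xA)
wanted = list(range(last_occurrence - 4, last_occurrence))
missed = [c for c in wanted if c not in captured]

print(captured)  # cycles stored by the first-match trigger
print(missed)    # cycles of interest that were not stored
```

In this model the array is filled around the first occurrence of the event, so none of the cycles preceding the last occurrence are stored.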
The performance of VLSI chips is difficult to analyze, and failures that are transient, with a low repetition rate, are particularly hard to analyze and correct. The difficulty of analyzing and correcting design problems that manifest themselves as transient failures is compounded by the fact that the event that triggers a particular failure may occur many cycles before the actual transient failure itself. There is, therefore, a need for a method and apparatus for recording those signals that were instrumental in causing the actual transient VLSI chip failure. While the preceding has indicated that a failure normally terminates the trace capturing process, it should be understood that other events of interest may be used in a debug or analysis process.