1. Field of the Invention
The present invention relates to the field of pipelined processor computer systems, and more particularly relates to hardware facilities to permit monitoring processor states and characterizing system performance in arbitrary customer workload environments.
2. Art Background
Modern high performance computer systems are typically configured with embedded and external cache memory structures to enhance overall system performance. In computer systems equipped with cache structures, instructions or data residing in single or separate caches reduce or eliminate the need for time consuming memory references to external memory devices operating on external buses. Provided that an instruction or a datum is resident in the cache when accessed by a processor, no external memory access cycle is required. Moreover, because cache systems are typically implemented using very fast static random access memory (SRAM), overall processor execution speed is greatly improved.
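By way of illustration only, the benefit of a cache may be estimated with the textbook average memory access time relation (hit time plus miss rate multiplied by miss penalty). The following is a minimal sketch; the function name, parameter names, and the numeric values in the usage note are hypothetical and are not taken from the present description.

```python
def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Textbook estimate of average memory access time, in nanoseconds.

    hit_time_ns     -- time to service an access that hits in the cache
    miss_rate       -- fraction of accesses not resident in the cache (0.0 to 1.0)
    miss_penalty_ns -- extra time for an external memory access cycle on a miss
    All three parameters are illustrative assumptions.
    """
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must lie between 0.0 and 1.0")
    # Every access pays the hit time; only misses pay the external penalty.
    return hit_time_ns + miss_rate * miss_penalty_ns
```

For example, with an assumed 10 ns hit time and 100 ns miss penalty, a miss rate of zero yields 10 ns per access, whereas a miss on every access yields 110 ns.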
However, as embedded cache structures become increasingly large, a significant design challenge is encountered, in that monitoring the behavior of a single chip processor (CPU) containing the embedded caches is greatly complicated. Assuming there were perfectly optimized code executing on a cache based single processor CPU, it would be possible for the processor to execute code continuously from within its internal cache stores with no external manifestations of its cycle by cycle progress for significant periods of time. In such a case, system debugging can be very difficult.
To further improve system performance, many computer systems are constructed with pipelined processors, wherein the execution of multiple instructions may be overlapped, thereby increasing processor throughput. Traditionally, all instructions and data being processed by the pipelined processor were required to proceed at the same rate, the CPU performance therefore being determined by the slowest pipe stage. Many pipelined CPUs today, however, permit the various function units to proceed independently and at their own rate. Because the likelihood of pipeline hazard occurrence is thereby increased, modern pipelined CPUs are typically optimized to reduce the likelihood of such occurrences, including synchronization of pipeline stages and tabulation of instruction status to permit scheduling code around such hazards.
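By way of illustration only, the dependence of a traditional lockstep pipeline on its slowest stage may be sketched as follows. The sketch assumes an idealized pipeline with no hazards or stalls; the function names and the stage latencies in the usage note are hypothetical and are not taken from the present description.

```python
def lockstep_cycle_time(stage_latencies_ns):
    """Cycle time of a pipeline in which all stages advance at the same rate.

    The clock period must accommodate the slowest stage.
    """
    if not stage_latencies_ns:
        raise ValueError("at least one pipe stage is required")
    return max(stage_latencies_ns)


def lockstep_total_time(stage_latencies_ns, n_instructions):
    """Time to complete n_instructions, assuming no hazards or stalls.

    The first instruction needs one cycle per stage to drain through the
    pipeline; each later instruction completes one cycle after its predecessor.
    """
    if n_instructions <= 0:
        return 0
    cycle = lockstep_cycle_time(stage_latencies_ns)
    n_stages = len(stage_latencies_ns)
    return (n_stages + n_instructions - 1) * cycle
```

For example, with assumed stage latencies of 2, 3, 5, and 2 ns, the cycle time is 5 ns, and ten instructions complete in (4 + 10 - 1) x 5 = 65 ns, even though three of the four stages could run faster.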
Separately, once a CPU with embedded caches has been debugged and is installed and operational in a customer workplace, a processor vendor may desire to thereafter monitor and characterize performance of the vendor's system in an unobtrusive manner, particularly in situations where the customer is running confidential or proprietary software which the customer does not wish to disclose to the vendor. A vendor may wish to so characterize a system as installed in the customer workplace for a number of reasons. One major reason would be to determine operational performance bottlenecks in the system executing the proprietary software under real world conditions, so that the system may be reconfigured or memory allocations altered to optimize system performance to the customer workload and requirements. In addition, it would be particularly desirable if the results of the system characterization could be tabulated, stored, and later used to overcome similar design constraints and bottlenecks in future product designs, especially as may relate to selected customers for whom systems optimization is of paramount concern. It would further be desirable to be able to monitor first order behavior of the processor, and to make available on a cycle by cycle basis information related to key internal states of the target processor.
As will be obvious from the following detailed description, these objects and desired results are among the objects and desired results of the present invention, which provides a novel approach to providing an unobtrusive hardware monitor for determining computer system performance within the customer workplace.
For further description of pipelining, and for descriptions of cache memory systems and cache implementations, see Hennessy & Patterson, "Computer Architecture: A Quantitative Approach" (1990).