The present invention relates to cycle accurate simulators for use in predicting the performance of processor designs. More particularly, the present invention relates to processor models that can run in a trace driven mode to determine a processor design's performance in a relatively short period of time.
During the development of microprocessors, various designs are proposed and modified. Each design is tested for bugs and for performance (i.e., speed), and modified accordingly to remove bugs and/or improve performance. Ultimately, a design is deemed sufficiently bug-free and fast to be frozen and converted to hardware.
Various software representations of the processor are employed during development. Most importantly, a logical representation of the processor is provided in a hardware description language ("HDL") such as Verilog. This representation is, in fact, an inchoate description of the processor hardware. Ultimately, when the processor design is frozen, the HDL representation is converted to an arrangement of gates capable of implementing the processor logic on a semiconductor chip.
Other software representations of the processor are used to evaluate the performance of HDL designs. One such model is an "architectural model" which contains a relatively high level description of the processor's architecture. Architectural models are commonly used to run standard "benchmark" programs designed to objectively measure the performance of processors. The measures of performance provided by running benchmark programs include, for example, the average number of cycles required to execute an instruction, the rate at which the data cache is accessed, and other performance statistics. Not surprisingly, architectural models are frequently employed during the design process to determine how a particular change to the processor (made to the HDL model) will affect performance. In addition, the performance statistics generated by architectural models may be supplied to potential customers long before the processor design is actually converted to hardware.
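By way of illustration only, the two statistics named above can be derived from raw event counts as in the following minimal sketch. The class name, field names, and the counter values are hypothetical and are not taken from any actual model or benchmark run; the data cache access rate is assumed here to be expressed as accesses per cycle.

```python
from dataclasses import dataclass


@dataclass
class RunCounters:
    """Raw event counts from one benchmark run (hypothetical field names)."""
    cycles: int
    instructions: int
    dcache_accesses: int


def cycles_per_instruction(c: RunCounters) -> float:
    """Average number of cycles required to execute an instruction."""
    return c.cycles / c.instructions


def dcache_accesses_per_cycle(c: RunCounters) -> float:
    """Rate at which the data cache is accessed, assumed to be per cycle."""
    return c.dcache_accesses / c.cycles


# Invented counts for a design before and after a proposed change.
baseline = RunCounters(cycles=1_500_000, instructions=1_000_000,
                       dcache_accesses=300_000)
modified = RunCounters(cycles=1_400_000, instructions=1_000_000,
                       dcache_accesses=300_000)

if __name__ == "__main__":
    for name, run in (("baseline", baseline), ("modified", modified)):
        print(name,
              f"CPI={cycles_per_instruction(run):.3f}",
              f"dcache/cycle={dcache_accesses_per_cycle(run):.3f}")
```

Comparing the two sets of statistics is one way a designer might judge whether a change made to the HDL model helped or hurt performance.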
While architectural models can run benchmark programs relatively fast, they do not necessarily give highly accurate performance predictions. Modern processors contain many complexities and nuances that cannot be completely and accurately modeled by very high level representations such as architectural models. For example, many processors--such as those developed according to the SPARC V9 microprocessor specification--contain branch prediction algorithms, instruction grouping logic for superscalar pipelining, LOAD/STORE cache access rules, etc. that may not be modeled with complete accuracy in an architectural model. See "The SPARC Architecture Manual" Version 9, D. Weaver and T. Germond, Editors, Prentice-Hall, Inc., Englewood Cliffs, N.J. (1994), which is incorporated herein by reference for all purposes. Other microprocessor designs may have these and/or other complexities that cannot be modeled with complete accuracy by architectural models. Thus, it has been difficult to predict processor performance with very good accuracy during development.
One of the basic shortcomings of architectural models is their inability to accurately model the cycle-by-cycle performance of the processor. Another type of processor model, a "cycle accurate model," contains a sufficiently detailed representation of the processor to maintain cycle-by-cycle correspondence with the actual processor. One such cycle accurate model is described in Poursepan, "The Power PC 603 Microprocessor: Performance, Analysis and Design Tradeoffs", Spring Compcon 94, pp. 316-323, IEEE Computer Society Press, 1994. Cycle accurate models find wide use in identifying bugs during processor design verification. For this function, a test sequence of assembly language code is executed on both the HDL representation and the cycle accurate representation of the processor. If any discrepancies are detected in how the two representations handle the test sequence, a bug has likely been found and the HDL representation is accordingly modified to remove the bug.
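The verification flow just described can be sketched as a comparison of two per-cycle traces, one from each representation. The following is a minimal sketch under stated assumptions: each representation is assumed to emit one state snapshot per cycle, and the snapshot contents, the function name, and the sample values are hypothetical rather than drawn from any particular verification tool.

```python
from typing import List, Optional, Tuple

# Hypothetical snapshot of architecturally visible state for one cycle,
# e.g. (program counter, register value).
State = Tuple[int, ...]


def first_divergence(hdl_trace: List[State],
                     ref_trace: List[State]) -> Optional[int]:
    """Return the first cycle at which the two traces differ, or None.

    hdl_trace is assumed to come from the HDL representation and
    ref_trace from the cycle accurate model, both running the same
    test sequence.
    """
    for cycle, (hdl_state, ref_state) in enumerate(zip(hdl_trace, ref_trace)):
        if hdl_state != ref_state:
            return cycle
    # A trace that ends early is also treated as a discrepancy.
    if len(hdl_trace) != len(ref_trace):
        return min(len(hdl_trace), len(ref_trace))
    return None


if __name__ == "__main__":
    # Invented traces that agree for two cycles and then differ.
    hdl = [(0x100, 0), (0x104, 1), (0x108, 2)]
    ref = [(0x100, 0), (0x104, 1), (0x108, 3)]
    cycle = first_divergence(hdl, ref)
    if cycle is None:
        print("no discrepancy detected")
    else:
        print(f"discrepancy at cycle {cycle}: likely bug")
```

Reporting the earliest differing cycle, rather than merely a pass or fail result, narrows the search for the likely bug to the logic active at that cycle.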
Cycle accurate models could, in theory, provide an accurate prediction of a processor design's performance by running benchmark programs. Unfortunately, they are much too slow to run an entire benchmark program (which may require executing several million instructions). Further, cycle accurate models cannot provide the resources of an operating system, which are needed to run a benchmark program.
Thus, there exists a need for a processor model that can run a benchmark program in a reasonably short period of time while still providing accurate performance statistics.