Computer simulation of digital hardware systems has become a common technique to reduce the cost and time required for the design of such hardware systems. Simulating digital hardware allows a designer to predict the functioning and performance of the hardware prior to fabricating the hardware. As more and more digital systems incorporate a processor, such as a microprocessor, a digital signal processor, or other special purpose computer processor, there has been increased effort to develop a simulation system that includes simulating the hardware and simulating the running of software on a processor that is included in the digital system. Having such a simulation system allows a designer to test the operation of software on the processor before a physical processor is available. Thus, for example, a designer may be able to start designing a system incorporating a new microprocessor before the manufacturer actually releases physical samples of the microprocessor. In addition, a system designer designing an integrated circuit or a system on a printed circuit board that includes a processor can, for example, use the simulation system to test the integrated circuit or printed circuit board implementation, including operation of software on the processor part and any interactions between the processor and the other digital circuit elements of the integrated circuit or board, before the integrated circuit or board is fabricated. This clearly can save time and money.
Such a simulation system is called a co-simulation design system, a co-simulation system, or simply a design system herein, and the environment for operating such a co-simulation system is called a design environment. The processor is called a target processor and the computer system on which the environment operates is called the host computer system. The hardware other than the processor is called digital circuitry. The computer software program that is designed by a user to operate on the target processor is called the user program.
The target processor may be a separate microprocessor with the digital circuitry being external to the microprocessor (e.g., on a printed circuit board or elsewhere in the system), or may be a processor embedded in an application specific integrated circuit (ASIC) or a custom integrated circuit (IC) such as a very large scale integrated (VLSI) device, with the digital circuitry including some components that are part of the ASIC or IC, and other components that are external to the ASIC or IC.
A design environment capable of co-simulation requires 1) the capability of accurately simulating the digital circuitry, including timing, and 2) the capability of accurately simulating on the host processor the running of the user program on the target processor, including the accurate timing of operation of the user program and of any software/hardware interaction. The first requirement is available today in a range of hardware description languages (HDLs) such as Verilog and VHDL, and simulation environments using them. It also is available as a set of constructed libraries and classes that allows the modeling of hardware in a higher-level language such as C or C++. The second requirement is for a processor simulator using an executable processor model that both accurately simulates the execution of a user program on the target processor, and can interact with the digital circuitry simulation environment. Such a processor simulator should provide timing information, particularly at times of software/hardware interaction, i.e., at the software/hardware interface. A processor model that includes such accurate timing information is called a "quantifiable" model herein.
One known way of providing such processor simulation is to simulate the actual hardware design of the processor. This can be done, for example, by specifying a processor model in a hardware description language (HDL). Such a model is called an architectural hardware model herein, and a processor simulator derived therefrom is called a hardware architecture simulator herein. An architectural hardware model clearly can include all the intricacies of the processor design, and thus is capable of accurate timing. Since it is written in a hardware description language, it may be treated as a hardware device in a hardware simulation environment. The main disadvantage of simulating the operation of the processor by simulating the hardware in some HDL, and it is a significant one, is the slow execution speed, typically in the range of 0.1-100 instructions per second.
Another known way of accurately simulating the execution of software on a processor for inclusion in a co-simulation environment is an instruction set simulator (ISS), wherein both the function and the sequencing of the microprocessor are mimicked in software. An instruction set simulator still executes relatively slowly compared, for example, to how fast a program would execute on the target processor. An ISS executes in the range of 1,000 to 50,000 instructions per second depending on the level of timing and operational detail provided by the model.
Both the ISS and the architectural hardware model approaches to simulating software are relatively slow, and users of such environments often express frustration at their inability to run simulations at practical speeds. HDL and ISS microprocessor models limit the number of software cycles that can be properly verified on a hardware-software modeling system; a few thousand per second is all they allow. On the other hand, real systems execute 50-1000 million instructions per second or more. From this arises a performance disparity of a factor of between about 10,000 and 200,000, so that about 3 to 60 hours of simulation may be needed to model 1 second of real-time target processor performance.
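The arithmetic behind the figures above can be checked with a short calculation. The following is a minimal sketch only; the simulator speed of 5,000 instructions per second is an assumed value chosen to be consistent with "a few thousand per second," and the function names are illustrative.

```python
SECONDS_PER_HOUR = 3600


def slowdown(target_ips, simulator_ips):
    """Ratio of target processor speed to simulator speed (both in instructions/second)."""
    return target_ips / simulator_ips


def hours_to_simulate(slowdown_factor, target_seconds=1.0):
    """Host wall-clock hours needed to simulate `target_seconds` of target execution."""
    return slowdown_factor * target_seconds / SECONDS_PER_HOUR


# Assumed simulator speed: 5,000 instructions/second (hypothetical, for illustration).
SIMULATOR_IPS = 5_000

for target_ips in (50e6, 1000e6):  # 50 and 1000 million instructions/second
    factor = slowdown(target_ips, SIMULATOR_IPS)
    hours = hours_to_simulate(factor)
    print(f"target {target_ips / 1e6:.0f} MIPS: {factor:,.0f}x slower, "
          f"{hours:.1f} h per simulated second")
```

Under these assumptions the factors come out at 10,000 and 200,000, and the corresponding simulation times at roughly 2.8 and 55.6 hours, which matches the approximate range of 3 to 60 hours.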
One solution to the slow speed of simulating a processor is to use a hardware processor model. This device includes a physical microprocessor and some circuitry for interfacing and interacting with the design environment simulating the digital circuitry. The memory for the target processor is simulated as part of the digital circuitry. Such an approach is fairly expensive. Another limitation is due to having two definitions of time operating on the same simulation system: simulation time of a hardware simulator, and processor time, which is real time for the hardware processor. Correlating these is difficult.
Another solution is to use an emulator as the target processor model. An emulator, like a hardware processor model, is a hardware device, typically the target processor, and usually includes some memory. The emulator is designed to emulate the operation of the microprocessor. Such a processor emulator, when it includes memory, can execute the user program directly, but again is expensive and may require the development of external circuitry to interact with the hardware simulator simulating the digital circuitry. U.S. Pat. No. 5,838,948 describes an environment that uses an emulator for speeding up the running of a user program in the design environment.
Behavioral processor simulators are known that can run a user program on the host computer system. With such an approach, the functional outcome of the software execution is combined with the outcome of executing the hardware models described, for example, in an HDL. While such processor models can run at more than 100 million instructions per second and have reasonable functionality, they include no timing or architectural precision, for example to accurately simulate the interaction between the digital circuitry and the processor.
One of the requirements for accurately simulating a processor is architectural precision. For example, modern processors include an instruction pipeline that enables the different stages of handling an instruction to be overlapped. A simple modern pipeline may have the following 5 stages: instruction fetch (IF), instruction decode (ID), execute (EX), memory access (MEM) and write back (WB). After the pipeline is filled, the processor can complete instructions up to five times faster than it could if each instruction had to pass through all five stages before the next began. However, pipeline hazards are known that cause a pipeline to stall. For example, hazards occur because instructions that are overlapped in execution may require processor resources simultaneously, with insufficient resources available to service all the requirements of the instructions simultaneously. Hazards also may occur when one instruction is dependent on a preceding instruction, and the dependency cannot be satisfied because the instructions overlap in the pipeline. It is desired to be able to accurately simulate the operation of the user program, including taking into account pipeline effects such as hazards. Hardware architecture simulators and instruction set simulators can be specified to include these intricacies, but, as described above, such processor simulators are inherently slow.
Thus, there is a need in the art for a processor simulator that can simulate a user program operating on a target processor with reasonable speed.
There also is a need in the art for a design system that simulates an electronic system that includes digital circuitry and a target processor having a pipeline, the design system including a processor simulator that can simulate a user program operating on a target processor with reasonable speed.
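The pipeline speed-up and the cost of stalls discussed above can be illustrated with a simple cycle count. This is a sketch under stated assumptions, not a model of any particular processor: each stage is assumed to take one clock cycle, and every stall is assumed to delay completion by exactly one cycle. The function names are hypothetical.

```python
def unpipelined_cycles(n_instructions, n_stages=5):
    """Cycles when each instruction passes through all stages before the next starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=5, stall_cycles=0):
    """Cycles for an ideal pipeline plus a given number of hazard-induced stall cycles.

    The first instruction needs n_stages cycles to fill the pipeline; each
    further instruction completes one cycle later, plus any stall cycles.
    """
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1) + stall_cycles


n = 1000
base = unpipelined_cycles(n)
ideal = pipelined_cycles(n)
stalled = pipelined_cycles(n, stall_cycles=200)  # assumed: 200 stall cycles in total

print(f"unpipelined: {base} cycles")
print(f"pipelined, no hazards: {ideal} cycles (speed-up {base / ideal:.2f}x)")
print(f"pipelined, 200 stall cycles: {stalled} cycles (speed-up {base / stalled:.2f}x)")
```

With 1,000 instructions the ideal speed-up approaches, but does not reach, a factor of five, and the assumed stalls reduce it further. A simulator that ignores hazards would therefore underestimate execution time.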
There also is a need in the art for a processor model of a target processor that has a pipeline for use in a design system that simulates an electronic system that includes digital circuitry and the target processor, the model providing for rapid simulation of a user program operating on a target processor and taking into account timing and pipeline effects such as pipeline hazards.
While sometimes it is desired to run a simulation with great precision at a high level of detail, at other times, less detail may suffice, enabling faster execution of the simulation. There therefore is a need in the art for an executable and quantifiable processor model that can be used in a co-simulation system and that models the operation of the target processor at an elected level of detail, including an elected level of detail at the hardware/software interface.
Computer networks are becoming ubiquitous, and it is desired to be able to operate a co-simulation design system on a computer network, with different elements of the design system running on different processors of the computer network to speed execution. Similarly, multiprocessor computers are also becoming commonplace, and it would be desirable to be able to operate a co-simulation design system on a multiprocessor computer, with different elements running on different processors of the multiprocessor computer.
Electronic systems nowadays may include more than one target processor. It is therefore desirable to have a co-simulation design system that provides for rapidly simulating such an electronic system, including simulating respective user programs executing on the target processors, such processor simulation providing timing detail that takes into account instruction timing and pipeline effects for target processors that include a pipeline.
Above-mentioned incorporated by reference U.S. patent application Ser. No. 09/430,855 (hereinafter "the Parent Application") describes a method and system for rapidly simulating on a host computer system a target processor executing a user program. The Parent Application describes a processor model for the target processor that operates up to the host processor speed and yet takes into account instruction timing and pipeline effects such as pipeline hazards. The model can be incorporated into a design system that simulates an electronic circuit that includes the target processor and digital circuitry. The Parent Application also describes using more than one such processor model in a design system that simulates an electronic circuit that includes more than one target processor and digital circuitry. A further feature described in the Parent Application is how a user can modify the processor model to include more or less detail.
Above-mentioned incorporated by reference U.S. patent application Ser. No. 09/430,855 describes a design system operating on a host computer system and simulating an electronic system that contains target digital circuitry and a target processor having a pipeline, the design system comprising a hardware simulator simulating the target digital circuitry, a processor simulator simulating the target processor executing a user program by executing the user program substantially on the host computer system, and an interface mechanism that couples the hardware simulator with the processor simulator including passing information between the hardware simulator and the processor simulator. The hardware simulator provides a simulation time frame for the design system. In one version, at significant events, including events that require the user program to interact with the target digital circuitry, the operation of the processor simulator is suspended and associated event information is passed from the processor simulator to the hardware simulator. The operation of the processor simulator then is resumed when the hardware simulator processes information and passes an event result back to the processor simulator.
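The suspend-and-resume interaction described above can be sketched with a Python generator, in which the processor simulator yields at each significant event and is resumed with the event result. This is an illustrative sketch only and is not the mechanism of the Parent Application; the operation format, the register map, and all names are assumptions made for the example.

```python
def processor_simulator(program):
    """Run `program`, accumulating delay, and suspend at each hardware access.

    Yields event information (accumulated delay and address) and is resumed
    with the value returned by the hardware simulator. Returns any delay
    accumulated after the last event.
    """
    delay = 0
    for op in program:
        delay += op["cycles"]
        if op["kind"] == "io_read":
            # Significant event: suspend and hand the event to the hardware side.
            op["result"] = yield {"delay": delay, "addr": op["addr"]}
            delay = 0  # the hardware simulator has absorbed this delay
    return delay


def co_simulate(program, registers):
    """Hypothetical hardware side: owns simulation time and services events."""
    sim_time = 0
    log = []
    cpu = processor_simulator(program)
    try:
        event = next(cpu)                  # run until the first significant event
        while True:
            sim_time += event["delay"]     # advance the simulation time frame
            value = registers[event["addr"]]
            log.append((sim_time, event["addr"], value))
            event = cpu.send(value)        # resume the processor simulator
    except StopIteration as done:
        sim_time += done.value             # delay accumulated after the last event
    return sim_time, log


program = [
    {"kind": "compute", "cycles": 10},
    {"kind": "io_read", "cycles": 2, "addr": 0x40},
    {"kind": "compute", "cycles": 5},
    {"kind": "io_read", "cycles": 2, "addr": 0x44},
    {"kind": "compute", "cycles": 3},
]
total, log = co_simulate(program, {0x40: 7, 0x44: 9})
print(total, log)
```

In this sketch only the hardware side advances simulation time, so there is a single time frame, and the processor side runs uninterrupted between events.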
The processor simulator described in the Parent Application accumulates a simulation time delay when operating, the simulation time delay determined using timing information that accounts for instruction timing including pipeline effects. The timing information is determined by an analysis process performed on the user program in accordance with characteristics of the target processor including instruction timing characteristics and pipeline characteristics. Such an analysis process is called a static analysis process herein because the timing information is obtained by analyzing the user program prior to running the analyzed version of the user program on the processor simulator. The static analysis process comprises decomposing the user program into linear blocks of one or more instructions; determining the time delay for each linear block of the user program using characteristics of the target processor; and combining the linear block timing information with the user program to determine the timing information for the processor simulator.
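The three steps of the static analysis process described above (decomposing into linear blocks, determining a delay per block, and using the per-block delays at run time) can be sketched as follows. This is a simplified illustration and not the analysis of the Parent Application: the instruction format and the cycle table are hypothetical, and pipeline effects, which a real analysis would fold into each block's delay, are omitted.

```python
# Hypothetical per-instruction cycle counts for an imaginary target processor.
CYCLES = {"add": 1, "load": 2, "store": 2, "branch": 3}


def linear_blocks(program):
    """Split a program into linear blocks, returned as (start, end) index pairs.

    `program` is a list of (opcode, branch_target_index_or_None) pairs.
    A new block starts at index 0, at every branch target, and after every branch.
    """
    leaders = {0}
    for i, (op, target) in enumerate(program):
        if op == "branch":
            leaders.add(target)
            if i + 1 < len(program):
                leaders.add(i + 1)
    starts = sorted(leaders)
    return list(zip(starts, starts[1:] + [len(program)]))


def block_delays(program):
    """Static step: compute the delay of each linear block, keyed by its start index."""
    return {
        start: sum(CYCLES[op] for op, _ in program[start:end])
        for start, end in linear_blocks(program)
    }


def accumulated_delay(delays, executed_blocks):
    """Run-time step: add up the precomputed delays of the blocks actually executed."""
    return sum(delays[start] for start in executed_blocks)


program = [
    ("load", None),    # 0: start of the loop body
    ("add", None),     # 1
    ("branch", 0),     # 2: loops back to index 0
    ("store", None),   # 3: first instruction after the loop
    ("add", None),     # 4
]
delays = block_delays(program)
print(linear_blocks(program), delays)
# Assumed execution trace: the loop body runs twice, then the tail block runs once.
print(accumulated_delay(delays, [0, 0, 3]))
```

Because the per-block delays are computed before execution, the run-time cost of timing is a single addition per executed block rather than a per-instruction calculation.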
Some timing information is not available by such static analysis. Many modern processors include a memory cache to speed up memory accesses. A separate cache, called a data cache or D-cache, might exist for data access, and another cache, called an instruction cache or I-cache, might exist for instruction access. Timing effects such as cache misses in a D-cache or an I-cache depend on the current state of the cache and cannot be known until runtime. Static analysis cannot easily account for such timing.
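The dependence of access time on cache state can be illustrated with a small direct-mapped cache model. This is a sketch only; the cache geometry, the hit and miss latencies, and the class name are assumed values for illustration and do not describe any particular target processor.

```python
class DirectMappedCache:
    """Minimal direct-mapped cache model that returns a latency per access."""

    def __init__(self, n_lines=4, line_bytes=16, hit_cycles=1, miss_cycles=10):
        self.n_lines = n_lines
        self.line_bytes = line_bytes
        self.hit_cycles = hit_cycles
        self.miss_cycles = miss_cycles
        self.tags = [None] * n_lines  # one stored tag per cache line; None means empty

    def access(self, addr):
        """Return the access latency in cycles and update the cache state."""
        block = addr // self.line_bytes
        index = block % self.n_lines
        tag = block // self.n_lines
        if self.tags[index] == tag:
            return self.hit_cycles
        self.tags[index] = tag        # fill the line on a miss (evicting any old tag)
        return self.miss_cycles


cache = DirectMappedCache()
trace = [0x100, 0x100, 0x104, 0x140, 0x100]
latencies = [cache.access(addr) for addr in trace]
print(latencies)
```

In this trace the same address 0x100 costs a miss, then a hit, then a miss again after 0x140 evicts its line. The latency of a given instruction therefore depends on the accesses that preceded it, which is why it cannot be fixed by analyzing the program text alone.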
Thus there still is a need for a design environment that operates on a host computer system which includes a mechanism for rapidly and accurately simulating the operation of a target processor that includes a cache system.