Today's embedded system applications, such as multimedia, telecommunication, and automotive applications, are made of a mix of software (SW) and hardware (HW) components. These mixed HW/SW applications need to be simulated to verify their functionality and predict their performance before implementing them.
A simulated system is composed of a functional description and an architectural, or platform, description that implements the functionality. The functional description is generally composed of a network of concurrent tasks connected by communication arcs. Each task or process is written in a high-level language such as C, C++, or SystemC. Along with this functional description, system designers also have to satisfy a set of performance constraints. To satisfy these constraints, which may be found in high-performance data processing systems, critical processes are implemented as HW blocks, such as application specific integrated circuits (ASICs) or coprocessors. Moreover, to reach the desired performance level, the ASICs are pipelined. One of the major trends in system design simulation is to provide models that simulate the performance of a complete application at the highest level of abstraction. The problem here is to find a good trade-off between the accuracy of the simulation results and the time it takes to complete the simulation. Unfortunately, simulating a complex application with good accuracy (cycle-true, bit-true) by using low-level simulation models, such as register transfer level (RTL) models for ASICs or instruction set simulators (ISS) for CPUs, usually takes too long to help designers explore different solutions or debug their application.
To reduce simulation time, one solution is to replace the low-level simulation models with equivalent high-level system models, which have the same functional behavior and equivalent performance behavior but execute 10 to 100 times faster or more. Abstract models have already been developed for some architectural components of a hardware platform, mostly for CPUs and DSPs (i.e., compiling code on a virtual CPU architecture) or for non-pipelined ASICs (i.e., modeling latency from inputs to outputs). These models accelerate the simulation. However, none so far is precise enough to model the performance behavior of pipelined ASICs, because they are based on estimates or on statistical measures.
For example, two conventional approaches may be used to model the performance behavior of an ASIC. First, if the system designer does not have access to an RTL model of the ASIC, the designer can only compile and run the code on a computer processing system and guess what the delays could be for the ASIC. Second, if the designer does have access to the RTL code, which is usually VHDL, Verilog with static timing constraints, or an HDL simulation test-bench, the designer can run HDL simulations using an RTL simulation tool and extract delays by computing statistics on the simulated measurements. However, designers who use these ASIC delay modeling techniques may face several drawbacks: the delay models may not exist, they may be static, they may be too specific, or they may not provide accurate measurements.
If the ASIC model does not exist, then the only solution for the system designer is to guess the delay numbers for this ASIC. If the ASIC model exists, then the RTL simulation results can be statically back-annotated into a higher-level model. However, the results rely on a set of low-level benchmark simulation tests, which cannot be exhaustive. Moreover, they depend on one application and are therefore not generic. The delay models may also be too specific. For example, some dynamic statistical models can be deduced from the RTL simulation results (e.g., linear regressions on look-up tables), but such models are usually inaccurate because they are based on specific simulation runs. Furthermore, the delay models may not be accurate enough. The delay models for ASICs usually model delays between inputs and outputs (latency), but do not model the output rate for each output, or throughput. The throughput is a key component of ASIC delays when the ASIC is pipelined. The throughput is highly dependent on the ASIC environment, which includes factors such as the input rate and output blocking phenomena, and therefore can hardly be defined by simulating the ASIC in isolation. For example, using an average throughput for a pipelined ASIC can result in a small estimation error at the output of the ASIC; but when this error is propagated through all the components of the system, it can lead to a significant global estimation error. A quantification of this error is application dependent, and thus no general numbers can be given by conventional approaches.
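The difference between a latency-only delay model and one that also accounts for throughput can be illustrated with a minimal sketch. It is not taken from any existing tool. The function names, the (latency, interval) stage parameters, and the cycle counts are hypothetical, and the sketch assumes unbounded buffering between stages, so output blocking is not modeled. Under these assumptions, the same bursty input yields an estimation error that grows as it passes through successive pipelined blocks.

```python
def pipelined_output_times(arrivals, latency, interval):
    """Output times of a pipelined block (hypothetical model).

    The block accepts at most one input every `interval` cycles, and each
    result appears `latency` cycles after its input was accepted.
    """
    outputs = []
    next_free = 0  # earliest cycle at which the block can accept an input
    for t in arrivals:
        start = max(t, next_free)  # wait if the first stage is still busy
        outputs.append(start + latency)
        next_free = start + interval
    return outputs


def latency_only_output_times(arrivals, latency):
    """Conventional model: a fixed input-to-output delay, no throughput limit."""
    return [t + latency for t in arrivals]


def simulate_chain(arrivals, stages, pipelined=True):
    """Feed each block's outputs into the next block.

    `stages` is a list of (latency, interval) pairs. Buffers between blocks
    are assumed unbounded, so a slow block never stalls the one before it.
    """
    times = list(arrivals)
    for latency, interval in stages:
        if pipelined:
            times = pipelined_output_times(times, latency, interval)
        else:
            times = latency_only_output_times(times, latency)
    return times


burst = [0, 1, 2, 3]       # inputs arriving back to back
stages = [(5, 2), (4, 3)]  # illustrative (latency, interval) values in cycles

exact = simulate_chain(burst, stages)
approx = simulate_chain(burst, stages, pipelined=False)
print(exact)   # [9, 12, 15, 18]
print(approx)  # [9, 10, 11, 12]
```

With these illustrative values, the latency-only model is 3 cycles early on the last output after the first block and 6 cycles early after the second. With sparse inputs such as `[0, 10, 20, 30]`, the two models agree, which shows why the error depends on the environment in which the block is placed.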