Programmable logic devices (PLDs) are a well-known type of integrated circuit that can be programmed to perform specified logic functions. One type of PLD, the field programmable gate array (FPGA), typically includes an array of programmable tiles. These programmable tiles can include, for example, input/output blocks (IOBs), configurable logic blocks (CLBs), dedicated random access memory blocks (BRAM), multipliers, digital signal processing blocks (DSPs), processors, clock managers, delay lock loops (DLLs), and so forth.
Each programmable tile typically includes both programmable interconnect and programmable logic. The programmable interconnect typically includes a large number of interconnect lines of varying lengths interconnected by programmable interconnect points (PIPs). The programmable logic implements the logic of a user design using programmable elements that can include, for example, function generators, registers, arithmetic logic, and so forth.
The programmable interconnect and programmable logic are typically programmed by loading a stream of configuration data into internal configuration memory cells that define how the programmable elements are configured. The configuration data can be read from memory (e.g., from an external PROM) or written into the FPGA by an external device. The collective states of the individual memory cells then determine the function of the FPGA.
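As a minimal illustration of how the states of configuration memory cells can determine function, the following sketch models a single function generator as a lookup table. The `Lut` class, the two-input size, and the bit ordering are assumptions made for illustration only and do not represent the configuration format of any actual device.

```python
class Lut:
    """Hypothetical k-input function generator backed by 2**k configuration cells."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.cells = [0] * (2 ** num_inputs)  # configuration memory cells

    def load(self, bits):
        """Load configuration data: one bit per input combination."""
        if len(bits) != len(self.cells):
            raise ValueError("expected %d configuration bits" % len(self.cells))
        self.cells = [int(b) for b in bits]

    def evaluate(self, *inputs):
        """Use the input values as an address into the configuration cells."""
        if len(inputs) != self.num_inputs:
            raise ValueError("expected %d inputs" % self.num_inputs)
        address = 0
        for value in inputs:
            address = (address << 1) | (1 if value else 0)
        return self.cells[address]


lut = Lut(2)
lut.load([0, 1, 1, 0])  # XOR truth table, addressed as 2*a + b
xor_out = [lut.evaluate(a, b) for a in (0, 1) for b in (0, 1)]

lut.load([0, 0, 0, 1])  # same element, reloaded to implement AND
and_out = [lut.evaluate(a, b) for a in (0, 1) for b in (0, 1)]
```

In this sketch the same element implements a different logic function after new configuration data is loaded, without any change to the element itself.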
Another type of PLD is the Complex Programmable Logic Device, or CPLD. A CPLD includes two or more “function blocks” connected together and to input/output (I/O) resources by an interconnect switch matrix. Each function block of the CPLD includes a two-level AND/OR structure similar to those used in Programmable Logic Arrays (PLAs) and Programmable Array Logic (PAL) devices. In some CPLDs, configuration data is stored on-chip in non-volatile memory. In other CPLDs, configuration data is stored off-chip in non-volatile memory, then downloaded to volatile memory as part of an initial configuration sequence.
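The two-level AND/OR structure of a function block can be sketched as a sum of products. In the sketch below, the function name `and_or`, the representation of a product term as a dictionary, and the example function are hypothetical choices made for illustration.

```python
def and_or(inputs, product_terms):
    """Evaluate a two-level AND/OR (sum-of-products) structure.

    inputs: dict mapping signal name to 0 or 1.
    product_terms: list of dicts; each dict maps a signal name to the value
        (1 for true, 0 for complemented) it must have for that AND term.
    """
    # First level: each product term is the AND of its literals.
    and_outputs = [
        all(inputs[name] == required for name, required in term.items())
        for term in product_terms
    ]
    # Second level: the OR of all product terms.
    return int(any(and_outputs))


# Example function: f = (a AND b) OR (NOT a AND c)
terms = [{"a": 1, "b": 1}, {"a": 0, "c": 1}]
result = and_or({"a": 1, "b": 1, "c": 0}, terms)
```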
For all of these programmable logic devices (PLDs), the functionality of the device is controlled by data bits provided to the device for that purpose. The data bits can be stored in volatile memory (e.g., static memory cells, as in FPGAs and some CPLDs), in non-volatile memory (e.g., FLASH memory, as in some CPLDs), or in any other type of memory cell.
The programming used to configure an FPGA or other PLD is often very complex. It is common to use a modeling system to simulate the operation of the programming in order to evaluate how a physical FPGA will operate when used in a system, such as a system on a chip (“SoC”). In some systems, a PLD interfaces with or includes functional blocks. For example, an FPGA may include an embedded processor operating at a first clock speed, and an I/O interfacing peripheral and a customized computation peripheral (such as a digital processing or image processing filter) operating at a different clock speed. Multiple simulators are integrated into the modeling system to simulate the different functional blocks. In yet other instances, the PLDs themselves are used in the simulation as emulators. In this case, a portion of a design physically runs on a PLD while the rest of the design is simulated by the simulators running on a host PC. A modeling system interface controls the simulation progress of the software simulators or emulation hardware, and exchanges simulation data between them when needed.
On the one hand, the number of clock cycles that different functional blocks use to process one input data sample may differ vastly. For example, a microprocessor runs an operating system, and requires thousands of clock cycles to manage the operating system services (e.g., interrupt service routines, data transfer to off-chip devices). In comparison, an adder functional block usually needs just one or a few clock cycles to finish the addition/subtraction of one set of data samples.
On the other hand, the different simulators and the emulation hardware may have significantly different simulation speeds. For example, the emulation hardware component can simulate tens of millions of clock cycles per second, while a low-level HDL simulator can simulate only a few thousand clock cycles per second.
Given these two factors, the time that each simulator integrated in the modeling system spends simulating its portion of the IC system often differs significantly from that of the other simulators.
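The combined effect of these two factors can be illustrated with a short calculation. The specific figures below (5,000 cycles per sample, 2,000 and 20,000,000 simulated cycles per second) and the pairing of each functional block with a particular simulator are assumptions chosen to match the orders of magnitude mentioned above; they are not measured values.

```python
def wall_time_per_sample(cycles_per_sample, simulated_cycles_per_second):
    """Wall-clock seconds a simulator needs to process one input data sample."""
    return cycles_per_sample / simulated_cycles_per_second


# Assumed: a processor block needing thousands of cycles per sample,
# simulated by a low-level HDL simulator (a few thousand cycles per second).
processor_time = wall_time_per_sample(5_000, 2_000)

# Assumed: an adder block needing one cycle per sample,
# running on emulation hardware (tens of millions of cycles per second).
adder_time = wall_time_per_sample(1, 20_000_000)

# How many times longer the slow pairing takes per sample.
ratio = processor_time / adder_time
```

Under these assumed figures, the two simulators differ by roughly seven orders of magnitude in the wall-clock time spent per data sample.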
The functional blocks have precedence relations when processing the input data and need to exchange output data with each other. The simulators that simulate these functional blocks must therefore be properly synchronized during co-simulation. One technique is to use single-step clocking co-simulation.
During single-step clocking co-simulation, a global clock pulse is applied to each simulator or emulation hardware after each simulation step. The functional blocks operate off the global clock pulse, which is usually at least as slow as the slowest simulator clock rate in the co-simulation modeling environment. However, because the functional blocks have different simulation requirements and the integrated simulators have different simulation speeds, this technique unnecessarily slows both the simulators or emulation hardware with fast simulation speeds and the simulation of functional blocks that require significantly more clock cycles than other functional blocks. Single-step clocking is therefore often too slow, and may even prove impractical, for many embedded system development efforts.
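The cost of single-step clocking can be sketched with a simple timing model. The `Simulator` class, the `single_step_cosim` function, and the simulation speeds below are hypothetical; the sketch assumes that all simulators execute a step concurrently, so each global step takes as long as the slowest simulator, and it models elapsed time arithmetically rather than running any real simulator.

```python
class Simulator:
    """Hypothetical stand-in for a software simulator or emulation hardware."""

    def __init__(self, name, cycles_per_second):
        self.name = name
        self.seconds_per_cycle = 1.0 / cycles_per_second
        self.cycle = 0

    def step(self):
        """Advance one clock cycle and return its modeled wall-clock cost."""
        self.cycle += 1
        return self.seconds_per_cycle


def single_step_cosim(simulators, num_cycles):
    """Apply one global clock pulse per step to every simulator.

    Returns the modeled wall-clock time, assuming each step ends only
    when the slowest simulator has finished its cycle.
    """
    elapsed = 0.0
    for _ in range(num_cycles):
        costs = [sim.step() for sim in simulators]
        # Simulation data would be exchanged here before the next pulse.
        elapsed += max(costs)
    return elapsed


hdl = Simulator("hdl", 2_000)            # assumed speed of an HDL simulator
emu = Simulator("emulator", 20_000_000)  # assumed speed of emulation hardware

lockstep_time = single_step_cosim([hdl, emu], 10_000)
free_running_time = 10_000 * emu.seconds_per_cycle
```

In this model the emulation hardware, which could complete its 10,000 cycles in a fraction of a millisecond on its own, is held to the pace of the HDL simulator for the entire run.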