In the nanometer era, the complexity of Application Specific Integrated Circuit (ASIC)/System on Chip (SoC) designs, in terms of gate count and operating frequency, is increasing tremendously. To prototype or emulate a complex ASIC/SoC, multiple programmable devices (usually Field Programmable Gate Arrays (FPGAs)) are used, which requires partitioning the ASIC/SoC design. Partitioning an ASIC/SoC design over multiple programmable logic devices (Multi-FPGA) decreases the operating frequency compared to a single FPGA, primarily because of the resulting combinatorial paths between FPGAs, which most often become the critical paths. This post-partitioning reduction in operating frequency arises from the additional delays introduced at the input/output (I/O or IO) pins, FPGA routing delays, interconnect board trace delays, etc. Multi-FPGA partitioning also usually results in a larger I/O pin requirement than the physical pins available in an FPGA, and therefore demands time division multiplexing (TDM) of the pins.
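The two effects described above can be illustrated with a minimal sketch. The function names and every number below are hypothetical and chosen only for illustration. The sketch assumes that an inter-FPGA combinatorial path must settle within one clock cycle and that pins are shared evenly under TDM.

```python
import math


def tdm_ratio(signals_needed: int, pins_available: int) -> int:
    """Number of logical signals that must share each physical pin."""
    if pins_available <= 0:
        raise ValueError("pins_available must be positive")
    return max(1, math.ceil(signals_needed / pins_available))


def max_frequency_mhz(path_delays_ns) -> float:
    """Upper bound on clock frequency if the whole path must settle in one cycle."""
    total_ns = sum(path_delays_ns)
    return 1000.0 / total_ns


# Hypothetical delay components of one inter-FPGA combinatorial path (ns).
inter_fpga_path_ns = [
    3.0,  # logic and routing inside the sending FPGA
    2.5,  # output pin delay
    1.5,  # interconnect board trace delay
    2.0,  # input pin delay
    3.0,  # logic and routing inside the receiving FPGA
]

print(tdm_ratio(1200, 400))                              # 3 signals per pin
print(round(max_frequency_mhz(inter_fpga_path_ns), 1))   # 83.3 MHz
```

In this model, the pin, trace and routing delays add directly to the critical path, so each of them lowers the achievable frequency bound.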
High-speed ASIC/SoC designs demand a higher prototype/emulation system operating frequency, and any inefficiency at the I/O pins is undesirable.
In ASIC/SoC designs, skewed internal clock techniques are commonly used to increase the operating frequency. These techniques are often referred to as useful-skew or cycle-stealing techniques. However, these techniques are not suitable for emulating logic in programmable logic devices (e.g. FPGAs) because of limitations associated with those devices. For example, cycle-stealing techniques require a plurality of clock lines, whereas a programmable logic device (hereinafter the term FPGA or FPGAs is used interchangeably with the term programmable device) contains only a limited number of low-skew global clock lines that distribute clock signals to every register in the chip. This limits the usability of the clock skew technique in FPGAs, since routing the various phase-shifted clock signals demands more global clock lines. Programmable logic devices are pre-layout devices, and it may be difficult to introduce a plurality of clock lines. In all, programmable logic devices do not provide sufficient clock line resources.
Further, implementing cycle-stealing techniques requires best case timing information in addition to worst case timing information, which is difficult to obtain for FPGAs. Most FPGA vendors do not provide best case timing information, and even when it is provided, it is usually only a conservative estimate. The benefit achievable with the cycle-stealing technique also depends on the gap between the worst and best case delays: the benefit diminishes as this gap grows. Best case delays in an FPGA are usually very low, as low as 25% of the worst case delays, which makes the emulated logic prone to malfunction because of false signals/glitches. That is, if the clock skew technique is applied to FPGAs, hold violations could occur under best case delays and the system would be non-functional, in addition to the benefit being diminished or lost altogether because of the large gap between worst and best case delays. Therefore, it is not preferable to use the cycle-stealing technique or skewed clock signaling to improve the operating frequency in programmable devices.
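The hold-violation argument can be made concrete with a simplified single-path timing model. This is a sketch under stated assumptions, not a vendor timing analysis. The capture clock is assumed to arrive `skew_ns` later than the launch clock, and clock-to-output delay is folded into the path delay. All names and numeric values are hypothetical, except the 25% best-to-worst ratio taken from the text.

```python
def setup_ok(period_ns, skew_ns, d_max_ns, t_setup_ns):
    """Setup check: worst case data must arrive before the delayed capture edge."""
    return d_max_ns + t_setup_ns <= period_ns + skew_ns


def hold_ok(skew_ns, d_min_ns, t_hold_ns):
    """Hold check: best case data must not overrun the delayed capture edge."""
    return d_min_ns >= skew_ns + t_hold_ns


def max_safe_skew(d_min_ns, t_hold_ns):
    """Largest capture-clock skew that still satisfies the hold check."""
    return max(0.0, d_min_ns - t_hold_ns)


# Hypothetical path: best case delay is 25% of worst case delay.
d_max = 8.0           # worst case path delay (ns)
d_min = 0.25 * d_max  # best case path delay (ns) = 2.0
t_setup = 0.5         # assumed register setup time (ns)
t_hold = 0.5          # assumed register hold time (ns)

# Try to run at a 6 ns period by stealing 3 ns through clock skew.
print(setup_ok(6.0, 3.0, d_max, t_setup))  # True: setup is met with the skew
print(hold_ok(3.0, d_min, t_hold))         # False: hold is violated
print(max_safe_skew(d_min, t_hold))        # 1.5 ns of usable skew at most
```

Under these assumptions the usable skew is bounded by the best case delay, so a wide gap between worst and best case delays leaves little time to steal before hold violations appear.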