Programmable logic devices (“PLDs”) are a well-known type of integrated circuit that can be programmed to perform specified logic functions. One type of PLD, the field programmable gate array (“FPGA”), conventionally includes an array of programmable tiles. These programmable tiles can include, for example, input/output blocks (“IOBs”), configurable logic blocks (“CLBs”), dedicated random access memory blocks (“BRAMs”), multipliers, digital signal processing blocks (“DSPs”), processors, clock managers, delay lock loops (“DLLs”), and so forth. As used herein, “include” and “including” mean including without limitation.
Each programmable tile conventionally includes both programmable interconnect and programmable logic. The programmable interconnect conventionally includes a large number of interconnect lines of varying lengths interconnected by programmable interconnect points (“PIPs”). The programmable logic implements the logic of a user design using programmable elements that can include, for example, function generators, registers, arithmetic logic, and so forth.
The programmable interconnect and programmable logic conventionally may be programmed by loading a stream of configuration data into internal configuration memory cells that define how the programmable elements are configured. The configuration data can be read from memory (e.g., from an external non-volatile memory, such as flash memory or read-only memory) or written into the FPGA by an external device. The collective states of the individual memory cells then determine the function of the FPGA.
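As a rough illustration of how the states of configuration memory cells can determine a logic function, the following minimal sketch models a function generator as a lookup table. The lookup-table model and the names `make_lut`, `config_bits`, and `lut` are assumptions made for illustration only; the description above does not specify how a function generator is built.

```python
from typing import Callable, Sequence


def make_lut(config_bits: Sequence[int]) -> Callable[..., int]:
    """Return a hypothetical function generator whose truth table is config_bits.

    config_bits[k] is the output for the input combination whose binary
    address is k (input 0 is the least significant address bit).
    """
    n_inputs = (len(config_bits) - 1).bit_length()
    if len(config_bits) != 1 << n_inputs:
        raise ValueError("number of configuration bits must be a power of two")
    # Model the configuration memory cells as an immutable tuple of 0/1 values.
    cells = tuple(int(bool(b)) for b in config_bits)

    def lut(*inputs: int) -> int:
        if len(inputs) != n_inputs:
            raise ValueError(f"expected {n_inputs} inputs, got {len(inputs)}")
        address = 0
        for position, bit in enumerate(inputs):
            address |= (bit & 1) << position
        return cells[address]

    return lut


# The same structure implements different logic depending only on the loaded bits.
and_gate = make_lut([0, 0, 0, 1])
xor_gate = make_lut([0, 1, 1, 0])
```

Loading a different set of bits into the same structure changes the function without changing the circuit, which is the sense in which the collective states of the memory cells determine the function.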
Another type of PLD is the Complex Programmable Logic Device, or CPLD. A CPLD includes two or more “function blocks” connected together and to input/output (“I/O”) resources by an interconnect switch matrix. Each function block of the CPLD includes a two-level AND/OR structure similar to those used in Programmable Logic Arrays (“PLAs”) and Programmable Array Logic (“PAL”) devices. In CPLDs, configuration data is conventionally stored on-chip in non-volatile memory. In some CPLDs, configuration data is stored on-chip in non-volatile memory, then downloaded to volatile memory as part of an initial configuration (“programming”) sequence.
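The two-level AND/OR structure mentioned above can be sketched as a sum of product terms. This is a simplified behavioral model only; the representation of a product term as a mapping from input name to required value, and the name `and_or`, are hypothetical and not taken from any particular CPLD.

```python
from typing import Dict, List


def and_or(product_terms: List[Dict[str, int]], inputs: Dict[str, int]) -> int:
    """Evaluate a two-level AND/OR structure.

    Each product term lists the inputs it uses and the value (0 or 1) each
    must have; a term is true when all of its literals match (AND level).
    The output is true when any product term is true (OR level).
    """
    return int(
        any(
            all(inputs[name] == wanted for name, wanted in term.items())
            for term in product_terms
        )
    )


# Hypothetical example: f = (a AND b) OR (NOT c)
terms = [{"a": 1, "b": 1}, {"c": 0}]
```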
For all of these PLDs, the functionality of the device is controlled by data bits provided to the device for that purpose. The data bits can be stored in volatile memory (e.g., static memory cells, as in FPGAs and some CPLDs), in non-volatile memory (e.g., flash memory, as in some CPLDs), or in any other type of memory cell.
Other PLDs are programmed by applying a processing layer, such as a metal layer, that programmably interconnects the various elements on the device. These PLDs are known as mask programmable devices. PLDs can also be implemented in other ways, e.g., using fuse or antifuse technology. The terms “PLD” and “programmable logic device” include but are not limited to these exemplary devices, and also encompass devices that are only partially programmable. For example, one type of PLD includes a combination of hard-coded transistor logic and a programmable switch fabric that programmably interconnects the hard-coded transistor logic.
PLDs may include an embedded processor. Even though the example of an FPGA is used, it should be appreciated that other integrated circuits with programmable logic or integrated circuits that are at least partially programmable may be used.
Conventionally, embedded processors are designed apart from the FPGAs. Such embedded processors are thus generally not specifically designed for implementation in FPGAs, and thus such embedded processors may have operating frequencies that significantly exceed a maximum operating frequency of programmable logic. Moreover, parameters such as latency, transistor gate delay, data throughput, and the like designed into the embedded processors may be assumed to be present in the environment to which the embedded processor is coupled.
Performance of a design instantiated in programmable logic of an FPGA (“FPGA fabric”) coupled to an embedded processor may be significantly limited by the disparity between operating parameters of the FPGA fabric and those of the embedded processor. Thus, if, as was done previously, embedded processor interfaces, such as processor local bus (“PLB”) interfaces, are brought directly out to FPGA fabric, the disparity in operating parameters between the embedded processor and the FPGA fabric is a significant limitation with respect to overall performance. An embedded processor coupled to a design instantiated in FPGA fabric may therefore have to wait on that design, meaning the limiting factor with respect to performance is substantially the design instantiated in FPGA fabric. For example, accessing a memory controller instantiated in FPGA fabric and coupled to the embedded processor has been a significant bottleneck with respect to performance.
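To make the waiting effect concrete, the following minimal sketch estimates how many processor clock cycles elapse while a slower fabric design completes an access. The function name, the frequencies, and the cycle counts are hypothetical values chosen only for illustration, and the model ignores synchronization and bus protocol overhead.

```python
def processor_cycles_waiting(fabric_cycles: int,
                             processor_mhz: int,
                             fabric_mhz: int) -> int:
    """Processor clock cycles spanned by an access taking fabric_cycles.

    Uses integer ceiling division so that a partial processor cycle
    counts as a full cycle of waiting.
    """
    if fabric_cycles < 0 or processor_mhz <= 0 or fabric_mhz <= 0:
        raise ValueError("cycle count must be non-negative and frequencies positive")
    return -(-(fabric_cycles * processor_mhz) // fabric_mhz)


# Hypothetical figures: a 400 MHz processor waiting on a 3-cycle access
# to a memory controller running at 100 MHz in the fabric.
stall = processor_cycles_waiting(fabric_cycles=3, processor_mhz=400, fabric_mhz=100)
```

Under these assumed figures, each fabric access spans twelve processor cycles, so the wider the frequency disparity, the more the fabric design dominates overall performance.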
Alternatively, a memory controller, previously instantiated in FPGA fabric, may be hardened or provided as an ASIC core coupled to the embedded processor. By hardening a circuit previously instantiated in FPGA fabric, it is generally meant replacing or bypassing configuration memory cells with hardwired or dedicated connections. Additionally, peripherals coupled to the embedded processor may be hardened or provided as ASIC cores.
However, ASIC cores, and more generally ASICs, are manufactured for high performance. More particularly, semiconductor processes and semiconductor process integration rules (“semiconductor process design rules”) associated with ASICs, including ASIC cores, are generally more challenging, and thus yield for such ASIC cores may be relatively low as compared to yield of FPGAs of the same size. FPGAs, which may have a larger and longer run rate than ASICs and which may not be as performance driven, may employ semiconductor processing that is more conducive to higher die per wafer yield than ASICs.
It should be appreciated that an FPGA manufactured with an ASIC core uses FPGA semiconductor process design rules. Thus, ASIC cores manufactured in FPGAs perform worse than such ASIC cores manufactured as part of ASICs or as standalone ASICs. Thus, manufacturing FPGAs with hardwired ASIC cores would not achieve competitive performance with standalone ASICs.
Moreover, manufacturing FPGAs with hardened or ASIC core memory controllers or peripherals, or a combination thereof, would reduce flexibility of design of such FPGAs. One significant reason that users purchase FPGAs is the blank slate offered by FPGA fabric for implementing a user created circuit design. If FPGAs come with ASIC cores that take the place of some FPGA fabric resources, users may be both locked into the particular offering of hardened or ASIC core memory controllers or peripherals, and have even less flexibility of design due to fewer FPGA fabric resources for implementing their circuit design. This loss of flexibility, combined with the fact that such hardened or ASIC core memory controllers or peripherals implemented in FPGAs may be significantly slower than their standalone ASIC counterparts, would make FPGAs less attractive to users.
Accordingly, it would be desirable and useful to enhance performance of FPGAs without a significant loss of design flexibility.
Heretofore, performance of a design instantiated in programmable logic of an FPGA (“FPGA fabric”) and coupled to an ASIC core embedded in the host FPGA may have been limited by the ASIC core having a substantially longer clock insertion delay. It should be understood that an FPGA may include a clock tree, such as an H-clock tree for example, which guarantees timing within specific parameters. However, an ASIC core is not included as part of such a clock tree, and thus conventionally such an ASIC core may have a long clock insertion delay. This clock insertion delay may therefore have to be added to a clock-to-out delay timing parameter for a design employing such an ASIC core. Having such a long clock-to-out delay parameter may inhibit performance. Moreover, in order to avoid violating hold time specifications, the short set-up time which would have been inversely associated with the long clock-to-out delay had to be artificially increased. In other words, set-up times could not be commensurately short if hold time violations were to be avoided.
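The effect of adding clock insertion delay to the clock-to-out parameter can be sketched with simple arithmetic. The function names and all delay values below are hypothetical, and the minimum-period model (a capturing register in FPGA fabric whose own clock insertion delay is treated as negligible) is an assumption made for illustration, not a statement of how any particular device is timed.

```python
def effective_clock_to_out_ps(insertion_delay_ps: int, clock_to_out_ps: int) -> int:
    """Clock-to-out as seen at the core boundary: the clock insertion delay
    is added to the core's intrinsic clock-to-out delay."""
    return insertion_delay_ps + clock_to_out_ps


def min_clock_period_ps(insertion_delay_ps: int,
                        clock_to_out_ps: int,
                        routing_delay_ps: int,
                        setup_time_ps: int) -> int:
    """Smallest clock period for a path from the core to a fabric register,
    under the simplifying assumption stated in the lead-in."""
    return (effective_clock_to_out_ps(insertion_delay_ps, clock_to_out_ps)
            + routing_delay_ps
            + setup_time_ps)


# Hypothetical delays in picoseconds.
with_insertion = min_clock_period_ps(2000, 500, 1000, 300)
without_insertion = min_clock_period_ps(0, 500, 1000, 300)
```

With these assumed values, the insertion delay alone more than doubles the minimum clock period for the path, which illustrates why a long clock-to-out delay parameter may inhibit performance.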
As is known, FPGAs may include phase-locked loops (“PLLs”) or delay-locked loops in digital clock managers (“DCMs”). However, such PLLs may not exist as part of an ASIC core, and thus using such a PLL to reduce clock insertion delay may not be an option. Furthermore, adding a PLL to an ASIC core which did not otherwise have a PLL would add significant cost.