1. Field of the Invention
The invention is generally directed to integrated circuits, more specifically to Programmable Logic Devices (PLDs), and even more specifically to a subclass of PLDs known as Field Programmable Gate Arrays (FPGAs).
(A) Ser. No. 08/828,520, now U.S. Pat. No. 5,905,385, filed Apr. 1, 1997 by Bradley A. Sharpe-Geisler and originally entitled, "MEMORY BITS USED TO COUPLE LOOK UP TABLE INPUTS TO FACILITATE INCREASED AVAILABILITY TO ROUTING RESOURCES PARTICULARLY FOR VARIABLE SIZED LOOK UP TABLES FOR A FIELD PROGRAMMABLE GATE ARRAY (FPGA)"; PA1 (B) Ser. No. 08/931,798, filed Sep. 16, 1997 by Bradley A. Sharpe-Geisler and originally entitled, "CIRCUITRY TO PROVIDE FAST CARRY"; PA1 (C) Ser. No. 08/700,616, now U.S. Pat. No. 5,740,069 filed Aug. 16, 1996 by Om Agrawal et al. and entitled, "PROGRAMMABLE LOGIC DEVICE (PLD) HAVING DIRECT CONNECTIONS BETWEEN CONFIGURABLE LOGIC BLOCKS (CLBs) AND CONFIGURABLE INPUT/OUTPUT BLOCKS (IOBs) (AS AMENDED)" (as a continuing divisional with chained cross referencing back to Ser. No. 07/394,221 filed Aug. 15, 1989); PA1 (D) Ser. No. 08/912,763 filed Aug. 18, 1997, by Bradley A. Sharpe-Geisler and originally entitled, "OUTPUT BUFFER FOR MAKING A 2.5 VOLT CIRCUIT COMPATIBLE WITH A 5.0 VOLT CIRCUIT"; PA1 (E) Ser. No. 08/948,306 filed Oct. 9, 1997, by Om Agrawal et al. and originally entitled, "VARIABLE GRAIN ARCHITECTURE FOR FPGA INTEGRATED CIRCUITS"; PA1 (F) Ser. No. 08/966,049 filed Dec. 22, 1997, by Om Agrawal et al. and originally entitled, "DUAL PORT SRAM MEMORY FOR RUN-TIME USE IN FPGA INTEGRATED CIRCUITS"; PA1 (G) Ser. No. 08/995,615 filed Dec. 22, 1997, by Om Agrawal et al. and originally entitled, "A PROGRAMMABLE INPUT/OUTPUT BLOCK (IOB) IN FPGA INTEGRATED CIRCUITS"; PA1 (H) Ser. No. 08/995,614, now U.S. Pat. No. 5,982,193, filed Dec. 22, 1997, by Om Agrawal et al. and originally entitled, "INPUT/OUTPUT BLOCK (IOB) CONNECTIONS TO MAXL LINES, NOR LINES AND DENDRITES IN FPGA INTEGRATED CIRCUITS"; PA1 (I) Ser. No. 08/995,612, now U.S. Pat. No. 5,990,702, filed Dec. 22, 1997, by Om Agrawal et al. and originally entitled, "FLEXIBLE DIRECT CONNECTIONS BETWEEN INPUT/OUTPUT BLOCKs (IOBs) AND VARIABLE GRAIN BLOCKs (VGBs) IN FPGA INTEGRATED CIRCUITS"; PA1 (J) Ser. No. 08/997,221 filed Dec. 22, 1997, by Om Agrawal et al. and originally entitled, "PROGRAMMABLE CONTROL MULTIPLEXING FOR INPUT/OUTPUT BLOCKs (IOBs) IN FPGA INTEGRATED CIRCUITS"; PA1 (K) Ser. No. Not Yet Known filed Dec. 22, 1997, by Bradley Sharpe-Geisler and originally entitled, "MULTIPLE INPUT ZERO POWER AND/NOR GATE FOR USE WITH A FIELD PROGRAMMABLE GATE ARRAY (FPGA)"; and, PA1 (L) Ser. No. 08/996,492 filed Dec. 22, 1997, by Bradley Sharpe-Geisler and originally entitled, "INPUT BUFFER PROVIDING VIRTUAL HYSTERESIS".
2. Description of Related Art
Field-Programmable Logic Devices (FPLDs) have continuously evolved to better serve the unique needs of different end-users. From the time of introduction of simple PLDs such as the Advanced Micro Devices 22V10 Programmable Array Logic device (PAL), the art has branched out in several different directions.
One evolutionary branch of FPLDs has grown along a paradigm known as Complex PLDs or CPLDs. This paradigm is characterized by devices such as the Advanced Micro Devices MACH family. Examples of CPLD circuitry are seen in U.S. Pat. No. 5,015,884 (issued May 14, 1991 to Om P. Agrawal et al.) and U.S. Pat. No. 5,151,623 (issued Sep. 29, 1992 to Om P. Agrawal et al.).
Another evolutionary chain in the art of field programmable logic has branched out along a paradigm known as Field Programmable Gate Arrays or FPGAs. Examples of such devices include the XC2000 and XC3000 families of FPGA devices introduced by Xilinx, Inc. of San Jose, Calif. The architectures of these devices are exemplified in U.S. Pat. Nos. 4,642,487; 4,706,216; 4,713,557; and 4,758,985; each of which is originally assigned to Xilinx, Inc.
An FPGA device can be characterized as an integrated circuit that has four major features as follows.
(1) A user-accessible, configuration-defining memory means, such as SRAM, EPROM, EEPROM, anti-fused, fused, or other, is provided in the FPGA device so as to be at least once-programmable by device users for defining user-provided configuration instructions. Static Random Access Memory or SRAM is of course, a form of reprogrammable memory that can be differently programmed many times. Electrically Erasable and reprogrammable ROM or EEPROM is an example of nonvolatile reprogrammable memory. The configuration-defining memory of an FPGA device can be formed of mixture of different kinds of memory elements if desired (e.g., SRAM and EEPROM). PA0 (2) Input/Output Blocks (IOBs) are provided for interconnecting other internal circuit components of the FPGA device with external circuitry. The IOBs' may have fixed configurations or they may be configurable in accordance with user-provided configuration instructions stored in the configuration-defining memory means. PA0 (3) Configurable Logic Blocks (CLBs) are provided for carrying out user-programmed logic functions as defined by user-provided configuration instructions stored in the configuration-defining memory means. Typically, each of the many CLBs of an FPGA has at least one lookup table (LUT) that is user-configurable to define any desired truth table, --to the extent allowed by the address space of the LUT. Each CLB may have other resources such as LUT input signal pre-processing resources and LUT output signal post-processing resources. Although the term `CLB` was adopted by early pioneers of FPGA technology, it is not uncommon to see other names being given to the repeated portion of the FPGA that carries out user-programmed logic functions. The term, `LAB` is used for example in U.S. Pat. No. 5,260,611 to refer to a repeated unit having a 4-input LUT. PA0 (4) An interconnect network is provided for carrying signal traffic within the FPGA device between various CLBs and/or between various IOBs and/or between various IOBs and CLBS. At least part of the interconnect network is typically configurable so as to allow for programmably-defined routing of signals between various CLBs and/or IOBs in accordance with user-defined routing instructions stored in the configuration-defining memory means. Another part of the interconnect network may be hard wired or nonconfigurable such that it does not allow for programmed definition of the path to be taken by respective signals traveling along such hard wired interconnect. A version of hard wired interconnect wherein a given conductor is dedicatedly connected to be always driven by a particular output driver, is sometimes referred to as `direct connect`.
Modern FPGAs tend to be fairly complex. They typically offer a large spectrum of user-configurable options with respect to how each of many CLBs should be configured, how each of many interconnect resources should be configured, and how each of many IOBs should be configured. Rather than determining with pencil and paper how each of the configurable resources of an FPGA device should be programmed, it is common practice to employ a computer and appropriate FPGA-configuring software to automatically generate the configuration instruction signals that will be supplied to, and that will cause an unprogrammed FPGA to implement a specific design.
FPGA-configuring software typically cycles through a series of phases, referred to commonly as `partitioning`, `placement`, and `routing`. This software is sometimes referred to as a `place and route` program. Alternate names may include, `synthesis, mapping and optimization tools`.
In the partitioning phase, an original circuit design (which is usually relatively large and complex) is divided into smaller chunks, where each chunk is made sufficiently small to be implemented by a single CLB, the single CLB being a yet-unspecified one of the many CLBs that are available in the yet-unprogrammed FPGA device. Differently designed FPGAs can have differently designed CLBs with respective logic-implementing resources. As such, the maximum size of a partitioned chunk can vary in accordance with the specific FPGA device that is designated to implement the original circuit design. The original circuit design can be specified in terms of a gate level description, or in Hardware Descriptor Language (HDL) form or in other suitable form.
After the partitioning phase is carried out, each resulting chunk is virtually positioned into a specific, chunk-implementing CLB of the designated FPGA during a subsequent placement phase.
In the ensuing routing phase, an attempt is made to algorithmically establish connections between the various chunk-implementing CLBs of the FPGA device, using the interconnect resources of the designated FPGA device. The goal is to reconstruct the original circuit design by reconnecting all the partitioned and placed chunks.
If all goes well in the partitioning, placement, and routing phases, the FPGA configuring software will find a workable `solution` comprised of a specific partitioning of the original circuit, a specific set of CLB placements and a specific set of interconnect usage decisions (routings). It can then deem its mission to be complete and it can use the placement and routing results to generate the configuring code that will be used to correspondingly configure the designated FPGA.
In various instances, however, the FPGA configuring software may find that it cannot complete its mission successfully on a first try. It may find, for example that the initially-chosen placement strategy prevents the routing phase from completing successfully. This might occur because signal routing resources have been exhausted in one or more congested parts of the designated FPGA device. Some necessary interconnections may have not been completed through those congested parts. Alternatively, all necessary interconnections may have been completed, but the FPGA configuring software may find that simulation-predicted performance of the resulting circuit (the so-configured FPGA) is below an acceptable threshold. For example, signal propagation time may be too large in a speed-critical part of the FPGA-implemented circuit.
In either case, if the initial partitioning, placement and routing phases do not provide an acceptable solution, the FPGA configuring software will try to modify its initial place and route choices so as to remedy the problem. Typically, the software will make iterative modifications to its initial choices until at least a functional place-and-route strategy is found (one where all necessary connections are completed), and more preferably until a place-and-route strategy is found that brings performance of the FPGA-implemented circuit to a near-optimum point. The latter step is at times referred to as `optimization`. Modifications attempted by the software may include re-partitionings of the original circuit design as well as repeated iterations of the place and route phases.
There are usually a very large number of possible choices in each of the partitioning, placement, and routing phases. FPGA configuring programs typically try to explore a multitude of promising avenues within a finite amount of time to see what effects each partitioning, placement, and routing move may have on the ultimate outcome. This in a way is analogous to how chess-playing machines explore ramifications of each move of each chess piece on the end-game. Even when relatively powerful, high-speed computers are used, it may take the FPGA configuring software a significant amount of time to find a workable solution. Turn around time can take more than 8 hours.
In some instances, even after having spent a large amount of time trying to find a solution for a given FPGA-implementation problem, the FPGA configuring software may fail to come up with a workable solution and the time spent becomes lost turn-around time. It may be that, because of packing inefficiencies, the user has chosen too small an FPGA device for implementing too large of an original circuit.
Another possibility is that the internal architecture of the designated FPGA device does not mesh well with the organization and/or timing requirements of the original circuit design.
Organizations of original circuit designs can include portions that may be described as `random logic` (because they have no generally repeating pattern). The organizations can additionally or alternatively include portions that may be described as `bus oriented` (because they carry out nibble-wide, byte-wide, or word-wide, parallel operations). The organizations can yet further include portions that may be described as `matrix oriented` (because they carry out matrix-like operations such as multiplying two, multidimensional vectors). These are just examples of taxonomical descriptions that may be applied to various design organizations. There may be more. The point is that some FPGA structures may be better suited for implementing random logic while others may be better suited for implementing bus oriented designs or other kinds of designs.
If the FPGA configuring software fails in a first run, the user may choose to try again with a differently-structured FPGA device. The user may alternatively choose to spread the problem out over a larger number of FPGA devices, or even to switch to another circuit implementing strategy such as CPLD or ASIC (where the latter is an Application Specific hardwired design of an IC). Each of these options invariably consumes extra time and can incur more costs than originally planned for.
FPGA device users usually do not want to suffer through such problems. Instead, they typically want to see a fast turnaround time of no more than, say 4 hours, between the time they complete their original circuit design and the time a first-run FPGA is available to implement and physically test that design. FPGA users also usually want the implementing FPGA circuit to provide an optimal emulation of the original design in terms of function packing density, cost, speed, power usage, and so forth irrespective of whether the original design is taxonomically describable generally as `random logic`, or as `bus oriented`, or as a combination of these, or otherwise.
When multiple FPGAs are required to implement a very large original design, high function packing density and efficient use of FPGA internal resources are desired so that implementation costs can be minimized in terms of both the number of FPGAs that will have to be purchased and the amount of printed circuit board space that will be consumed.
Even when only one FPGA is needed to implement a given design, a relatively high function packing density is still desirable because it usually means that performance speed is being optimized due to reduced wire length. It also usually means that a lower cost member of a family of differently sized FPGAs can be selected or that unused resources of the one FPGA can be reserved for future expansion needs.
In summary, end users want the FPGA configuring software to complete its task quickly and to provide an efficiently-packed, high-speed compilation of the functionalities provided by an original circuit design irrespective of the taxonomic organization of the original design.
In the past, it was thought that attainment of these goals was primarily the responsibility of the computer programmers who designed the FPGA configuring software. It has been shown however, that the architecture or topology of the unprogrammed FPGA can play a significant role in determining how well and how quickly the FPGA configuring software completes the partitioning, placement, and routing tasks.
An improved FPGA architecture that helps FPGA configuring software to better reach its goals was disclosed in U.S. Pat. No. 5,212,652, issued May 18, 1993 to Agrawal et al. The improvement provided a symmetrically balanced distribution of logic function resources and routing resources in both horizontal and vertical directions so that placement and routing was not directionally constrained to, for example, a left-to right signal flow orientation. Balanced availability of logic function-implementing resources and signal-routing resources was provided to give the FPGA configuring software more degrees of freedom in each of the partitioning, placement, and routing phases. This increased the likelihood that congestion would be avoided during placement and routing because circuit implementation could be more uniformly distributed instead of being concentrated along a particular direction. It also increased the probability that more efficient solutions would be found in the iterative optimization phases because optimization attempts would not be constrained by pre-existing congestions.
U.S. patent application Ser. No. 08/700,616 now U.S. Pat. No. 5,740,069 (hereinafter "'616 application"), entitled "Programmable Logic Device (PLD) Having Direct Connections Between Configurable Logic Blocks (CLBs) and Configurable Input/Output Blocks (IOBs), filed Aug. 15, 1989 by Agrawal et al., disclosed signal-routing resources, and in particular, direct connections between CLBs. Direct connect outputs and inputs were positioned on all four sides of a CLB. A single direct connect output was positioned on each side of a CLB. Similarly, four direct connect inputs were positioned on each side of a CLB. The positioning of direct connect inputs and outputs on a CLB, as well as positioning of the direct connect lines, enables a symmetrically balanced distribution of direct connect signal routing resources.
Further advances in integrated circuit manufacturing technologies have now enabled higher densities of logic function-implementing circuits and higher densities of signal routing resources. This presents opportunities for further-improvements.