The present invention relates in general to architectures for multi-programmable logic device systems such as systems used for design verification. More specifically, the present invention relates to architectures for hardware logic emulation systems of electronic systems. The present invention also relates to architectures for systems used to rapidly prototype large digital electronic designs and programmable hardware implementation of algorithms (e.g., computation).
Field Programmable Gate Arrays and other programmable logic chips (collectively referred to herein as FPGAs) are integrated circuits that can be programmed to implement various logical functions and are either one-time programmable or reprogrammable. FPGAs are widely used for implementing digital circuits because they offer moderately high levels of integration and faster implementation than other types of logic devices such as gate arrays and application specific integrated circuits ("ASICs"). Multi-FPGA systems (referred to herein as "MFSs") are collections of FPGAs joined together by various interconnection schemes or topologies. MFSs are used when the logic capacity of a single FPGA is insufficient to implement a specific logic design. Reprogrammable FPGAs are used when quickly reprogrammable systems are desired.
The routing architecture of an MFS is the manner in which the FPGAs, fixed wires on the printed circuit boards ("PCBs") and programmable interconnect chips are connected. The choice of the routing architecture used to interconnect the FPGAs has a significant effect on the speed and cost of the system.
MFSs are used in a variety of applications, including logic emulation, rapid prototyping, and reconfigurable custom computing machines. Examples of MFSs used for logic emulation, rapid prototyping and reconfigurable computing machines can be seen in U.S. Pat. Nos. 5,036,473, 5,448,496, 5,452,231, 5,109,353 and 5,475,830. The disclosures of U.S. Pat. Nos. 5,036,473, 5,448,496, 5,452,231, 5,109,353 and 5,475,830 are incorporated herein by reference in their entirety.
Hardware logic emulation is an important application for MFSs. Hardware logic emulation systems map a structural representation (commonly referred to as a netlist) of a logic design such as an ASIC or a microprocessor into an MFS. In a hardware emulation system, the logic design is operated at speeds that approach real time, i.e., the speed at which the target system (the system where the actual fabricated integrated circuit will be installed) will operate. Thus, hardware logic emulation systems can emulate logic designs at speeds ranging from hundreds of kilohertz to a few megahertz. These speeds are several orders of magnitude faster than software design simulation speeds, which are generally restricted to at most a few tens of hertz.
Thus, hardware emulation allows functional verification of a design in its target operating environment, which includes other hardware and software modules. Many functional errors in the logic design that might not have been detected using traditional design verification methods such as software simulation, because of software simulation's long execution times, can be discovered and fixed prior to fabrication of the actual integrated circuit. Thus very costly iterations in integrated circuit fabrication can be avoided. Reducing the number of iterations results in reduced design costs and faster time-to-market, which are crucial in today's competitive technology market.
Many MFSs and associated CAD tools have been proposed and built for logic emulation, rapid prototyping and a wide variety of applications in custom computing. Prior art MFSs range from small systems that fit on a single printed circuit board (PCB) to huge systems that use hundreds of FPGAs laid out on multiple PCBs, which in turn are mounted in many card cages and chassis. Examples of small, single-PCB prototyping systems are the MP3™ and MP4™ prototyping systems available from Aptix Corporation of San Jose, Calif. Examples of large emulation systems are the Mercury™ and System Realizer™ emulation systems available from Quickturn Design Systems, Inc. of San Jose, Calif.
The overwhelming majority of MFSs have been implemented on PCBs. However, a few MFSs based on Multi-Chip Modules (MCMs) have been proposed and built. In these Field-Programmable Multi-Chip Modules (FPMCMs), several FPGA dies are mounted on a surface within the MCM package. Interconnection resources are provided and all the logic and routing resources are packaged as a single unit. The advantages of MCM-based MFSs compared to PCB-based MFSs are reduced size, reduced power consumption and superior speed performance. While the present application specifically refers to PCB-based implementations, the teachings of the present invention are equally applicable to MCM-based MFSs.
The routing architecture of an MFS is defined by the topology used to interconnect the FPGAs. Another distinguishing feature is whether programmable interconnect devices, also called field programmable interconnect devices ("FPIDs"), programmable interconnect chips ("PICs") or crossbars by those skilled in the art, are used for connecting the FPGAs. FPIDs can be implemented using interconnect chips, logic devices having sufficient interconnect resources and FPGAs available from such vendors as Xilinx Corporation of San Jose, Calif. and Altera Corporation of San Jose, Calif.
Prior art routing architectures can be categorized roughly in the following three ways: FPGA-only architectures, architectures that use only FPIDs for interconnecting FPGAs, and architectures that use both FPGAs and FPIDs for interconnecting FPGAs.
In FPGA-only architectures, only direct hardwired connections between FPGAs are used. Thus, there are no programmable connections through FPIDs because no FPIDs are present. This class of architecture can be further sub-divided into the following three categories: linear arrays, mesh architectures, and graph connected architectures (which as will be seen below, are subdivided into three categories). These different types of FPGA-only architectures will now be described.
An example of an MFS having a linear array architecture can be seen in FIG. 1. In MFSs utilizing a linear array architecture, the FPGAs 10a-10f are arranged in the form of a linear array, which is suitable for one-dimensional systolic processing applications. Thus, each FPGA 10a-10f is directly connected to the FPGA 10a-10f linearly adjacent thereto. Using the example of FIG. 1, FPGA 10b is directly connected to FPGA 10a and FPGA 10c. Linear array architectures have extremely limited routing flexibility, so many designs run out of routing resources and cannot be implemented. While the architecture may perform well in certain niche applications, its utility as a general purpose MFS is very limited.
An example of a mesh architecture is shown in FIG. 2. In the simplest mesh architecture, the FPGAs 15a-15i are laid out in the form of a two-dimensional array with each FPGA 15a-15i connected to its horizontal and vertical adjacent neighbors. Variations of this basic topology, such as the torus and the 8-way mesh shown in FIGS. 3 and 4, may be used to improve the routability of the architecture. The advantages of mesh architectures are simplicity of local interconnections and easy scalability. However, by using FPGAs for interconnecting to other FPGAs, and thereby using resources of the FPGA for both interconnect and logic, the amount of logic that can be implemented within each FPGA is reduced. This leads to poor logic utilization. In addition, the connection delays between widely separated FPGAs (especially in arrays comprising large numbers of FPGAs) are large whereas those between adjacent FPGAs are small. Such irregular timing characteristics result in poor speed performance and timing problems such as setup and hold time violations due to widely variable interconnection delays. Notable examples of MFSs utilizing mesh architectures are the RPM™ hardware logic emulation systems from Quickturn Design Systems, Inc., the PeRLe-1 from Digital Equipment Corporation, and the MIT Virtual Wires project.
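The delay irregularity described above can be illustrated with a minimal Python sketch. It assumes the nine FPGAs 15a-15i form a 3 x 3 array addressed by (row, column); the function names `mesh_neighbors` and `mesh_hops` are hypothetical and are not taken from any cited system.

```python
def mesh_neighbors(rows, cols, r, c):
    """Horizontally and vertically adjacent FPGAs of position (r, c) in a plain 4-way mesh."""
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols]


def mesh_hops(src, dst):
    """Minimum number of FPGA-to-FPGA links between two positions (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


# Assumed 3 x 3 layout for the nine FPGAs of FIG. 2.
print(mesh_neighbors(3, 3, 1, 1))  # the center FPGA has four neighbors
print(mesh_hops((0, 0), (0, 1)))   # adjacent FPGAs: 1 link
print(mesh_hops((0, 0), (2, 2)))   # opposite corners: 4 links
```

In this sketch a corner-to-corner net crosses four links and passes through three intermediate FPGAs, each of which gives up pins and internal routing to carry it, whereas a net between adjacent FPGAs crosses a single link.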
Other FPGA-only architectures include bipartite graph, tripartite graph and completely connected graph. An example of an MFS utilizing a bipartite graph architecture can be seen in FIG. 5. In an MFS having a bipartite graph topology, the FPGAs are divided into two groups. As seen in FIG. 5, each of the FPGAs in one group, FPGAs 20a, 20b, 20c, connects to every FPGA in the other group, FPGAs 25a, 25b, 25c, and to no FPGA in its own group. The tripartite graph topology (not shown) is similar, except that the FPGAs are divided into three groups and each FPGA in one group connects to every FPGA in the other two groups. In a completely connected graph topology (not shown), each FPGA is connected to every other FPGA.
A major problem with FPGA-only architectures is that they use up an excessive number of FPGA pins for routing high-fanout inter-FPGA nets, which can cause severe routability problems. This will be discussed below in more detail.
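As a rough illustration of the graph-connected topologies, the following Python sketch enumerates the direct FPGA-to-FPGA connections of a bipartite and of a completely connected arrangement. The helper names `bipartite_links` and `complete_links` are hypothetical, and each link stands for a bundle of wires whose width is not specified here.

```python
from itertools import combinations, product


def bipartite_links(group_a, group_b):
    """Every FPGA in one group connects to every FPGA in the other group, and to none in its own."""
    return list(product(group_a, group_b))


def complete_links(fpgas):
    """Every FPGA connects directly to every other FPGA."""
    return list(combinations(fpgas, 2))


# Reference numerals from the bipartite example; the four-FPGA complete graph is illustrative only.
links = bipartite_links(["20a", "20b", "20c"], ["25a", "25b", "25c"])
print(len(links))                                  # 3 x 3 = 9 links
print(len(complete_links(["A", "B", "C", "D"])))   # 4 * 3 / 2 = 6 links
```

In the bipartite case, two FPGAs of the same group can only communicate through an FPGA of the other group, which is one way the pins of an intermediate FPGA end up consumed by routing.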
Architectures that Employ only FPIDs for Inter-FPGA Routing
In the following architectures, all the inter-FPGA connections are realized using FPIDs. In theory, an ideal architecture would be a full crossbar that uses a single FPID for connecting all FPGAs. Unfortunately, full crossbar architectures are not practical for anything other than extremely simple systems. The reason for this is that the complexity (i.e., the hardware resources required within a crossbar chip) of a full crossbar grows as the square of its pin count and hence it is restricted to systems that contain at most a few FPGAs. The partial crossbar architecture disclosed in U.S. Pat. No. 5,036,473 overcomes the limitations of the full crossbar by using a plurality of smaller crossbars. A representative example of an MFS using a partial crossbar architecture is seen in FIG. 6. In the example shown in FIG. 6, the MFS uses four FPGAs 30a-30d and three FPIDs 35a-35c. The pins in each FPGA are divided into N subsets, where N is the number of FPIDs in the architecture. All the pins belonging to the same subset in different FPGAs are connected to a single FPID. The delay for any inter-FPGA connection is uniform and is equal to the delay through one FPID. The size of the FPIDs (determined by pin count) increases only linearly as a function of the number of FPGAs.
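The pin-subset rule of the partial crossbar can be sketched in Python as follows. This is an illustration only: the function name `partial_crossbar_wiring`, the zero-based pin numbering and the choice of six pins per FPGA are assumptions made for the example, not details taken from FIG. 6.

```python
def partial_crossbar_wiring(n_fpgas, pins_per_fpga, pins_per_subset):
    """Map every (fpga, pin) pair to the index of the FPID it is hardwired to.

    Pins are grouped into consecutive subsets of `pins_per_subset` pins, and
    subset k of every FPGA is wired to FPID k.
    """
    if pins_per_fpga % pins_per_subset != 0:
        raise ValueError("pins_per_fpga must be a multiple of pins_per_subset")
    wiring = {}
    for fpga in range(n_fpgas):
        for pin in range(pins_per_fpga):
            wiring[(fpga, pin)] = pin // pins_per_subset
    return wiring


# Four FPGAs and three FPIDs as in the example above; six pins per FPGA is assumed.
wiring = partial_crossbar_wiring(n_fpgas=4, pins_per_fpga=6, pins_per_subset=2)
n_fpids = len(set(wiring.values()))
pins_on_fpid_0 = sum(1 for fpid in wiring.values() if fpid == 0)
print(n_fpids, pins_on_fpid_0)  # 3 FPIDs, each with 4 * 2 = 8 pins
```

Because every FPID sees the same subset of pins on every FPGA, any two FPGAs can be joined through any FPID on which both still have a free pin, which is why the delay is that of a single FPID regardless of which FPGAs are involved.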
The number of pins per subset (Pt) is a key architectural parameter that determines the number of FPIDs (Ns) needed and the pin count of each FPID (Ps). Given the values of the number of pins per subset (Pt), the number of FPGAs (Nf) in the partial crossbar and the number of I/O pins in each FPGA (Pf), Ns and Ps are given by the following formulas:

$$N_s = \frac{P_f}{P_t}, \qquad P_s = N_f \times P_t$$
The extremes of the partial crossbar architecture are illustrated in FIGS. 7-8 by considering a system with four XC4013 FPGAs (192 usable I/O pins) available from Xilinx Corporation. When the value of Pt is set at 192, the architecture ceases to be a partial crossbar architecture. Instead, the architecture is that of a full crossbar having one very large FPID 40. In this particular example, the FPID would have to have 768 pins (see FIG. 7). It has long been known that full crossbar architectures are not efficient for large interconnection systems. At the opposite extreme, a Pt value of one will require 192 four-pin FPIDs 50-1 through 50-192 (see FIG. 8).
The examples shown in FIGS. 7 and 8 are not practical in actual systems. A preferable choice of Pt will result in low cost, low pin count FPIDs. For the above example, a Pt value of twelve will require sixteen 48-pin FPIDs. In preferred embodiments presently contemplated, sixty-four or ninety-six pin FPIDs that are commercially available from vendors such as I-Cube can be used for switching circuit I/O signals between FPGAs.
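The three design points discussed above are consistent with the relations Ns = Pf / Pt and Ps = Nf x Pt. The following Python sketch simply evaluates those relations; the function name `partial_crossbar_params` is hypothetical.

```python
def partial_crossbar_params(n_fpgas, pins_per_fpga, pins_per_subset):
    """Return (Ns, Ps): the number of FPIDs and the pin count of each FPID."""
    if pins_per_fpga % pins_per_subset != 0:
        raise ValueError("pins_per_fpga must be a multiple of pins_per_subset")
    n_fpids = pins_per_fpga // pins_per_subset   # Ns = Pf / Pt
    fpid_pins = n_fpgas * pins_per_subset        # Ps = Nf * Pt
    return n_fpids, fpid_pins


# Four FPGAs with 192 usable I/O pins each, for the three values of Pt discussed.
for pt in (192, 12, 1):
    print(pt, partial_crossbar_params(4, 192, pt))
# 192 -> (1, 768): a single full crossbar
# 12  -> (16, 48): sixteen 48-pin FPIDs
# 1   -> (192, 4): 192 four-pin FPIDs
```

Note that the total number of FPID pins, Ns x Ps = Nf x Pf, is the same for every choice of Pt (768 here); the choice of Pt only changes how those pins are packaged.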
Experimentation has shown that for implementing actual circuit designs in hardware emulation systems, the routability and speed of the partial crossbar is not affected by the value of Pt selected. Importantly, Pt value flexibility is contingent upon using an intelligent inter-chip router that understands the architecture and routes each inter-FPGA net using only two inter-chip connections to minimize the routing delay. However, one practical constraint that must be considered when designing a hardware emulation system is to avoid using Pt values requiring expensive or even unavailable high pin count FPIDs.
The partial crossbar is a highly efficient architecture that provides excellent routability and reasonably good speed. In fact, the partial crossbar architecture is so efficient and flexible that for certain applications, the partial crossbar architecture provides more routing flexibility than necessary. This high level of routing flexibility has the cost of requiring more FPID pins than might be necessary for a particular implementation. As one example, consider two-terminal net routing in a system utilizing a partial crossbar architecture. All such nets have to use an FPID to connect two FPGAs. If the architecture also had direct hardwired connections between FPGAs (in addition to programmable connections through FPIDs), no FPIDs would be needed for routing certain two-terminal nets, thus saving FPID pins. The direct connections would also be faster than the connections through FPIDs and could be exploited by using them to route delay critical inter-FPGA nets. The prior art, however, does not suggest combining a partial crossbar architecture with direct connections.
In architectures employing both FPGAs and FPIDs for Inter-FPGA routing, both FPGAs and FPIDs are used to provide inter-FPGA routing paths. Examples of systems using this type of architecture are the Altera RIPP10, the Splash 2 and the VCC virtual computer. No systematic study has been performed to evaluate these prior art systems' effectiveness in implementing actual circuits. Moreover, these prior art topologies are very difficult to scale, i.e., they are not suitable for implementing very large MFSs. Simple scaling of these topologies to implement large MFSs would most likely lead to severe routability and speed problems.
There has been a long felt need for an interconnect architecture that has the flexibility, efficiency, and scalability of the partial crossbar architecture, but with comparable routability, even lower cost, and capable of higher speed.
A new type of multi-FPGA routing architecture is disclosed and claimed that overcomes routability, cost and speed issues associated with existing routing architectures by using a mixture of hardwired and programmable connections and a unique topology for interconnecting FPGAs.
In one aspect of the present invention, the present invention comprises an architecture for a multi-FPGA system comprised of both fixed and programmable interconnections. The fixed interconnections directly interconnect each FPGA of a system to every other FPGA of the system. The programmable interconnections interconnect each FPGA to every other FPGA through a programmable interconnect device.
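To give a feel for how fixed and programmable interconnections might share the pins of each FPGA, the following Python sketch computes a pin budget. It is an illustration under stated assumptions only: the parameter `direct_pins_per_pair` (the number of pins each FPGA dedicates to its hardwired link with each other FPGA) and the assumption that the remaining pins are wired to FPIDs in partial crossbar fashion are hypothetical choices, and the numbers used are not taken from any disclosed embodiment.

```python
def hybrid_pin_budget(n_fpgas, pins_per_fpga, direct_pins_per_pair, pins_per_subset):
    """Split each FPGA's pins between direct FPGA-to-FPGA wires and FPID connections."""
    # Pins hardwired to every other FPGA (assumed equal for every pair).
    direct_pins = direct_pins_per_pair * (n_fpgas - 1)
    programmable_pins = pins_per_fpga - direct_pins
    if programmable_pins < 0:
        raise ValueError("not enough pins for the requested direct connections")
    if programmable_pins % pins_per_subset != 0:
        raise ValueError("programmable pins must be a multiple of pins_per_subset")
    return {
        "direct_pins": direct_pins,
        "programmable_pins": programmable_pins,
        "n_fpids": programmable_pins // pins_per_subset,  # assumed partial crossbar on the rest
        "fpid_pins": n_fpgas * pins_per_subset,
    }


# Hypothetical split: four 192-pin FPGAs, 16 direct pins per FPGA pair, 12 pins per subset.
print(hybrid_pin_budget(4, 192, 16, 12))
```

With these assumed numbers, 48 pins of each FPGA go to direct connections and the remaining 144 pins need twelve 48-pin FPIDs, compared with sixteen such FPIDs when all 192 pins are routed through FPIDs.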
A structure utilizing the teachings of the present invention comprises a plurality of reprogrammable logic devices. Each reprogrammable logic device has configurable logic elements for implementing logical functions. Each reprogrammable logic device also comprises programmable input/output terminals, which can be reprogrammably connected to selected ones of the configurable logic elements of the reprogrammable logic devices. The structure utilizing the teachings of the present invention also comprises a plurality of reprogrammable interconnect devices. Each of the reprogrammable interconnect devices has input/output terminals and internal circuitry which can be reprogrammably configured to provide interconnections between selected ones of the input/output terminals. The structure utilizing the teachings of the present invention also comprises a first set of fixed electrical conductors which connect a first group of the programmable input/output terminals on the plurality of reprogrammable logic devices to the input/output terminals on the reprogrammable interconnect devices such that each of the reprogrammable interconnect devices is connected to at least one but not all of the programmable input/output terminals of the first group of the programmable input/output terminals on each of the plurality of reprogrammable logic devices. Finally, a structure utilizing the teachings of the present invention comprises a second set of fixed electrical conductors connecting a second group of the programmable input/output terminals on the plurality of reprogrammable logic devices to the second group of input/output terminals on every other of the plurality of reprogrammable logic devices.
In another aspect of the present invention, an MFS using the teachings of the present invention comprises a plurality of reprogrammable logic devices. Each of the plurality of reprogrammable logic devices comprises a plurality of configurable logic elements and each of the plurality of reprogrammable logic devices also comprises programmable input/output terminals which can be programmed to connect to selected ones of the configurable logic elements. Such an MFS utilizing the teachings of the present invention also comprises a plurality of reprogrammable interconnect devices. Each of the reprogrammable interconnect devices has input/output terminals and internal circuitry which can be reprogrammably configured to provide interconnections between selected ones of the input/output terminals on the plurality of reprogrammable interconnect devices. Such an MFS also comprises a first set of fixed electrical conductors connecting a first group of the programmable input/output terminals on the plurality of reprogrammable logic devices to the input/output terminals on the reprogrammable interconnect devices such that each of the reprogrammable interconnect devices is connected to at least one but not all of the programmable input/output terminals of the first group of the programmable input/output terminals on each of the plurality of reprogrammable logic devices. This MFS also comprises a second set of fixed electrical conductors connecting a second group of the programmable input/output terminals on the plurality of reprogrammable logic devices to the second group of input/output terminals on every other of the plurality of reprogrammable logic devices. Finally, this MFS using the teachings of the present invention comprises an interface structure arranged to provide signal paths for signals carrying information to or from designated ones of the configurable logic elements in the reprogrammable logic devices.
In another aspect of the present invention, an MFS utilizing the teachings of the present invention comprises an electrically reconfigurable logic assembly for use in an electrically reconfigurable hardware emulation system. The reconfigurable system can be configured with a circuit design in response to the input of circuit information. The MFS using the teachings of the present invention comprises a plurality of FPGAs. Each of the plurality of FPGAs has internal circuitry which can be reprogrammably configured to provide logic functions. Each of the plurality of FPGAs also has programmable input/output terminals that can be reprogrammably connected to the internal circuitry of the plurality of FPGAs. In this aspect of the invention, the MFS comprises a plurality of FPIDs. Each of the FPIDs has input/output terminals and internal circuitry which can be reprogrammably configured to provide interconnections between selected ones of the input/output terminals. This aspect of the present invention also comprises a first set of fixed electrical conductors connecting a first group of the programmable input/output terminals on the plurality of FPGAs to the input/output terminals on the FPIDs. This interconnection is such that each of the FPIDs is connected to at least one but not all of the programmable input/output terminals of the first group of programmable input/output terminals on each of the plurality of FPGAs. This aspect of the present invention also comprises a second set of fixed electrical conductors connecting a second group of the programmable input/output terminals on the plurality of FPGAs to the second group of input/output terminals on every other of the plurality of FPGAs.
In another aspect of the present invention, an electrically reconfigurable logic board for use in an electrically reconfigurable hardware emulation system is disclosed which comprises a logic board structure comprising a printed circuit board. A plurality of FPGAs are mounted on the logic board structure. Each of the plurality of FPGAs has internal circuitry that can be reprogrammably configured to provide logic functions. Each of the plurality of FPGAs also has programmable input/output terminals that can be reprogrammably connected to the internal circuitry within each FPGA. This aspect of the present invention also comprises a plurality of FPIDs mounted on the logic board structure. Each of the FPIDs has input/output terminals and internal circuitry that can be reprogrammably configured to provide interconnections between selected ones of the input/output terminals. This aspect of the present invention also comprises a first set of fixed electrical conductors connecting a first group of the programmable input/output terminals on the plurality of FPGAs to the input/output terminals on the FPIDs such that each of the FPIDs is connected to at least one but not all of the programmable input/output terminals of the first group of the programmable input/output terminals on each of the plurality of FPGAs. This aspect of the present invention also comprises a second set of fixed electrical conductors connecting a second group of the programmable input/output terminals on the plurality of FPGAs to the second group of input/output terminals on every other of the plurality of FPGAs.
The above and other preferred features of the invention, including various novel details of implementation and combination of elements will now be more particularly described with reference to the accompanying drawings and pointed out in the claims. It will be understood that the particular methods and circuits embodying the invention are shown by way of illustration only and not as limitations of the invention. As will be understood by those skilled in the art, the principles and features of this invention may be employed in various and numerous embodiments without departing from the scope of the invention.