A conventional field programmable gate array ("FPGA") is a programmable logic device that comprises a matrix of logic blocks (LBs) embedded in a configurable interconnect routing network. The configuration of the LBs and of the routing network defines the function of the device. The device is referred to as a "field programmable" device because the array of LBs contained in the device can be configured and interconnected by the user in the user's facility by means of special hardware and software.
FPGAs are well known in the art. For example, U.S. Reissue Pat. No. 34,363 to R. Freeman, entitled "Configurable Electrical Circuit Having Configurable Logic Elements and Configurable Interconnects", assigned to Xilinx, Inc., the assignee of the present invention, describes a configurable logic array that includes a plurality of LBs variably interconnected in response to control signals to perform a selected logic function, and in which a memory is used to store the particular data used to configure the LBs.
An LB may be electrically programmed by control bits to provide any one of a plurality of logic functions. An LB may include the circuit elements necessary to provide an AND gate, flip-flop, latch, inverter, NOR gate, exclusive OR gate, and certain combinations of these functions, or an LB may include a lookup table that offers a user all functions of several input signals. The particular function performed by the LB is determined by control signals that are applied to the LB from a control logic circuit.
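As an illustration of the lookup-table option described above, the following is a minimal sketch, not a description of any particular device. The names `make_lut` and `configure_from` are hypothetical, and the convention that the first input is the most significant address bit is an assumption made only for this example.

```python
from itertools import product


def make_lut(config_bits, num_inputs):
    """Return a function whose output is read from stored configuration bits.

    Assumption: the first input is the most significant address bit.
    """
    if len(config_bits) != 2 ** num_inputs:
        raise ValueError("a k-input LUT needs exactly 2**k configuration bits")

    def lut(*inputs):
        if len(inputs) != num_inputs:
            raise ValueError("wrong number of input signals")
        index = 0
        for bit in inputs:
            # Shift in each input to form the table address.
            index = (index << 1) | (1 if bit else 0)
        return config_bits[index]

    return lut


def configure_from(func, num_inputs):
    """Derive configuration bits for an arbitrary Boolean function."""
    return [int(bool(func(*combo)))
            for combo in product((0, 1), repeat=num_inputs)]


# The same structure provides a NOR gate or an exclusive OR gate,
# depending only on the stored bits.
nor2 = make_lut(configure_from(lambda a, b: not (a or b), 2), 2)
xor3 = make_lut(configure_from(lambda a, b, c: a ^ b ^ c, 3), 3)
```

Because the stored bits enumerate every input combination, a table with k inputs can represent any Boolean function of those k inputs, which is the sense in which a lookup table offers all functions of several input signals.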
A conventional FPGA comprises a plurality of LBs, each LB having input leads and one or more output leads, a general interconnect structure, and a set of programmable interconnection points (PIPs) for connecting the general interconnect structure to each input lead and each output lead. Also, each lead in the general interconnect structure can typically be connected to one or more other interconnect leads by programming an associated PIP.
The various PIPs are typically programmed by loading memory cells that control the gates of pass transistors, or by connecting selected antifuses in an antifuse-based PLD. Currently, a specific FPGA configuration having a desired function is created by configuring each LB and forming paths through the interconnect structure within the FPGA to connect the LBs.
Each PIP in an FPGA is programmed by opening or closing one or more switches associated with the PIP, such that a specified signal path is defined. Such switches may be implemented by applying a control signal to the gate of a pass transistor, or, alternatively, if the switch is part of a multiplexer in which only one of several switches will be turned on at one time, several control signals may be decoded to determine which switch is turned on.
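The two switch-control styles described above can be sketched as follows. This is a behavioral illustration only; the function names and the binary encoding of the control bits (first bit most significant) are assumptions, not details taken from any specific FPGA.

```python
def pass_transistor(signal, gate_bit):
    """Single switch: one control bit drives the gate directly.

    Returns the signal when the switch is closed, or None when it is open.
    """
    return signal if gate_bit else None


def decode_select(control_bits, num_switches):
    """Decode binary control bits into one-hot switch enables.

    Assumption: control_bits[0] is the most significant bit.
    """
    index = 0
    for bit in control_bits:
        index = (index << 1) | (bit & 1)
    if index >= num_switches:
        raise ValueError("control bits select a switch that does not exist")
    return [1 if i == index else 0 for i in range(num_switches)]


def mux_pip(inputs, control_bits):
    """Multiplexer-style PIP: exactly one of several switches is turned on."""
    enables = decode_select(control_bits, len(inputs))
    for signal, enabled in zip(inputs, enables):
        if enabled:
            return signal
```

With decoding, a multiplexer of four switches needs two control bits rather than four, at the cost of the decode logic.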
One problem with the known approaches to routing signals through an FPGA interconnect network comes from using many pass transistors to form a path. Since each transistor has an associated impedance, several pass transistors connected in series can introduce a significant impedance into a path. Additionally, each interconnect lead and pass transistor introduces a capacitive element that combines with the impedance to produce a propagation delay over the associated path. Delay is especially pronounced if a long path is required, because the path may be implemented through several shorter segments and several pass transistors. There is therefore a need for an FPGA interconnect architecture that avoids the delay incurred when a longer path is composed of a plurality of interconnected shorter paths.
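The way series impedance and capacitance compound can be approximated with a first-order Elmore delay estimate. The Elmore model is introduced here only as an illustrative assumption, and the resistance and capacitance values below are hypothetical rather than measured device parameters.

```python
def elmore_delay(resistances, capacitances):
    """First-order Elmore delay estimate for a chain of RC stages.

    resistances[i] is the series resistance of stage i (for example, a
    pass transistor's on-resistance) and capacitances[i] is the
    capacitance at the node after that stage. Each capacitance is
    charged through all of the resistance upstream of it.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("one capacitance is required per resistance")
    total = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r
        total += upstream_r * c
    return total


# Hypothetical per-stage values, chosen only for illustration.
R_OHMS = 1000.0
C_FARADS = 100e-15

one_segment = elmore_delay([R_OHMS], [C_FARADS])
four_segments = elmore_delay([R_OHMS] * 4, [C_FARADS] * 4)
```

Under this model, a path of four identical stages has ten times the delay of a single stage, not four times, which is one way to see why long paths built from several shorter segments are disproportionately slow.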
In addition to avoiding long delays and more efficiently utilizing limited device resources, it is desirable to offer predictable delay. The signal path chosen to interconnect one logic element to another logic element is typically governed by algorithms implemented in software routines. The user may exercise some control over the signal paths chosen by the software, but it is typically not practical for the user to control all signal paths in an implemented design. Thus, the software must be entrusted with significant responsibility for circuit routing and layout, and may choose any of a large number of different interconnect segment and switch combinations to realize a particular signal path. Since the number of interconnect segments and pass transistors will vary from combination to combination, the delay through the signal path may also vary significantly, depending on the choice made by the software. This variation in delay is undesirable. It would therefore be further advantageous to provide an FPGA interconnect structure that does not have significant delay differences depending upon the signal path chosen by the circuit placing and routing software.
One approach to avoiding these complications is the inclusion of direct connect structures between logic elements. Presently available direct connects couple an LB output to an input of an adjacent LB, yet have very few PIPs.
For example, in the Xilinx XC3000 FPGA, each LB connects to the four LBs to its north, south, east and west, as illustrated in FIG. 1. The X output may be connected directly to the B input of the LB immediately to its right and the C input of the LB immediately to its left. Similarly, the Y output may be connected directly to the D input of the LB immediately above and the A input of the LB immediately below. The Xilinx XC4000EX FPGA likewise includes four direct connects per LB: two vertical and two horizontal. A simplified view of this structure is illustrated in FIG. 2. Horizontal direct connects 4 connect subject LB 2 to adjacent LB 6 on the right, and vertical direct connects 8 connect subject LB 2 to adjacent LB 10 below.
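The XC3000 direct connect pattern described above can be written down as a small table of destinations. This is only a sketch of the connectivity as stated in the text; the function name, the (row, column) coordinate system with rows increasing downward and columns increasing to the right, and the handling of array edges are all assumptions made for illustration.

```python
def xc3000_direct_connects(row, col, num_rows, num_cols):
    """List the direct connect destinations of the LB at (row, col).

    Assumptions: row 0 is the top row, column 0 is the leftmost column,
    and destinations that would fall outside the array are omitted.
    Each destination is ((row, col), input_name).
    """
    candidates = {
        # X output: B input of the LB to the right, C input of the LB to the left.
        "X": [((row, col + 1), "B"), ((row, col - 1), "C")],
        # Y output: D input of the LB above, A input of the LB below.
        "Y": [((row - 1, col), "D"), ((row + 1, col), "A")],
    }

    def in_array(position):
        r, c = position
        return 0 <= r < num_rows and 0 <= c < num_cols

    return {
        output: [dest for dest in dests if in_array(dest[0])]
        for output, dests in candidates.items()
    }
```

For an interior LB, each output reaches exactly two neighbors, so the four adjacent LBs are each reachable through one fixed input.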
Traditional, non-direct, PIP-based connections (not shown) are also utilized in the XC4000EX FPGA, but are far slower than the available direct connect resources. For example, the delay for a single level of combinational logic using general purpose interconnect is about 2.8 ns. This delay drops to about 1.9 ns (32% faster than general purpose interconnect) when the direct connects of FIG. 2 are used.
An alternative Xilinx architecture is illustrated in FIG. 3 and described by Young, et al. in U.S. patent application Ser. No. 08/806,997 entitled "FPGA Repeatable Interconnect Structure with Hierarchical Interconnect Lines", referenced above and incorporated herein by reference. In this architecture, an LB comprises a configurable logic element (CLE), an input multiplexer (IMUX), and an output multiplexer (OMUX). There are two kinds of direct connects. Conventional direct connects 12 provide a fast path from one LB to an adjacent LB, and fast feedback paths 14 provide a fast path from the output of logic in a CLE through the associated IMUX to other logic within the same CLE. With this structure, the output of a first lookup table (LUT) within a CLE 20 can become the input to another LUT in the same CLE.
Referring still to FIG. 3, there are four horizontal direct-connects 12 driven by each LB, two in each horizontal direction. These direct connects are actually implemented as dedicated connections from an output multiplexer in one LB to input multiplexers in the adjacent LBs. In this architecture, any output of the source CLE 20 can drive any of the LUT inputs of adjacent CLE 22 through direct connects 12. This direct connect structure is more flexible than the direct connect structure in the XC4000EX of FIG. 2, but each direct connect incurs the additional delay of going through an output multiplexer.
In the architecture illustrated in FIG. 3, the advantage of using direct connects is readily apparent. The routing delay for a single level of combinational logic is approximately 2.5 ns when normal, single-length lines (not shown) are used for routing. This drops to about 2 ns (20% faster than the single-length line) when direct connects 12 are used and drops further to about 1.5 ns (40% faster than the single-length line) when fast feedback paths 14 are used.
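The percentages quoted for these delays follow from a simple relative reduction, taking the slower general-purpose route as the baseline. The short sketch below reproduces that arithmetic; the function name `percent_faster` is hypothetical, and the delay values are the approximate figures given in the text.

```python
def percent_faster(baseline_ns, improved_ns):
    """Relative delay reduction, as a percentage of the baseline delay."""
    if baseline_ns <= 0:
        raise ValueError("baseline delay must be positive")
    return 100.0 * (baseline_ns - improved_ns) / baseline_ns


# Architecture of FIG. 3: single-length lines versus the faster paths.
direct_connect_gain = percent_faster(2.5, 2.0)
fast_feedback_gain = percent_faster(2.5, 1.5)

# XC4000EX of FIG. 2: general purpose interconnect versus direct connects.
xc4000ex_gain = percent_faster(2.8, 1.9)
```

Rounded to whole percentages, these give 20%, 40%, and 32% respectively, matching the figures stated for the two architectures.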
In a minor variation of this architecture, direct connects are implemented as programmable connections from CLE outputs in one LB to input multiplexers in the adjacent LBs, thereby bypassing the output multiplexer. This implementation has reduced flexibility, but greater speed, compared to the architecture described by Young et al.
While the direct connect architectures illustrated in FIGS. 1-3 provide certain advantages, a variety of factors severely limit actual utilization of these valuable resources and call for the advancement in the art provided by the present invention. For example, previously available device fabrication processes severely limited the amount of metal available for PIP and direct connect implementation. However, new fabrication processes are increasing the amount of metal "real estate" available for more sophisticated routing structures. There is therefore a benefit to be gained from a direct connect interconnect structure that increases device performance while taking advantage of the new fabrication techniques. Moreover, as designs become increasingly hierarchical and contain more highly structured components, including very tightly coupled data paths having faster local routing needs, more extensive use of direct connects would provide a significant performance enhancement. Also, a direct connect architecture having improved symmetry would be easier to model in placement and routing software than the structures described above.