1. Field of the Invention
The invention relates to programmable integrated circuit devices, and more particularly to the interconnect structure in a field programmable logic device.
2. Description of the Background Art
Field programmable gate arrays (FPGAs) include logic blocks connectable through a programmable interconnect structure. The interconnect structure typically provides for connecting each logic block to each other logic block. Early FPGAs accomplished this by providing short interconnect segments that could be joined to each other and to input and output terminals of the logic blocks at programmable interconnection points (PIPs). As FPGAs have become larger and more complex, the interconnect structure has also had to become larger and more complex. In order to improve speed (performance), direct connections to adjacent logic blocks have been provided, and for transmitting a signal the distance of many logic blocks, longer lines have been provided. In order to save silicon area, fewer PIPs have been provided. With fewer PIPs present, the routing is less flexible (for the same number of routing lines), but typically faster due to reduced loading. By removing only those PIPs that are least often used, routing flexibility can be minimally affected. Thus, there is a trade-off among performance, silicon area, number of routing lines, and routing flexibility.
Several U.S. Patents show such structures for interconnecting logic blocks in FPGAs. Freeman in U.S. Reissue Pat. No. Re 34,363 describes the first FPGA interconnect structure, which includes short routing segments and flexible connections as well as global lines for signals such as clock signals. Carter in U.S. Pat. No. 4,642,487 shows the addition of direct connections between adjacent logic blocks to the interconnect structure of Freeman. These direct connections provide fast paths between adjacent logic blocks. Greene et al in U.S. Pat. No. 5,073,729 show a segmented interconnect structure with routing lines of varied lengths. Kean in U.S. Pat. No. 5,469,003 shows a hierarchical interconnect structure having lines of a short length connectable at boundaries to lines of a longer length extending between the boundaries, and larger boundaries with lines of even longer length extending between those boundaries. Kean shows in particular lines the length of one logic block connecting each logic block to the next, lines the length of four logic blocks connectable to each logic block they pass, and lines the length of sixteen logic blocks connectable at the length-four boundaries to the length-four lines but not connectable directly to the logic blocks. In Kean's architecture, adjacent logic blocks in two different hierarchical blocks (i.e., on either side of the boundaries) connect to each other differently than adjacent logic blocks in the same hierarchical block.
Pierce et al in U.S. Pat. No. 5,581,199 show a tile-based interconnect structure with lines of varying lengths in which each tile in a rectangular array may be identical to each other tile. In the Pierce et al architecture, an interconnect line is part of the output structure of a logic block. Output lines of more than one length extend past other logic block input lines to which the logic block output lines can be connected. All of the above-referenced patents are incorporated herein by reference, and can be reviewed for a fuller understanding of prior art routing structures in FPGAs.
In the interconnect structures described by Freeman and Greene et al, each path is formed by traversing a series of programmably concatenated interconnect lines, i.e., a series of relatively short interconnect lines are programmably connected end to end to form a longer path. The relatively large number of programmable connections on a given signal path introduces delay into the signal path and therefore reduces the performance of the FPGA. Such interconnect structures are called "general interconnect".
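As a rough illustration of why a large number of series connections slows a path, the following sketch applies the first-order Elmore delay model to a chain of identical stages. The choice of model, the function name ladder_delay, and the unit resistance and capacitance values are assumptions made for illustration only and are not taken from the cited patents. Each stage is assumed to be one unbuffered PIP (a series resistance) followed by one short interconnect segment (a capacitance to ground).

```python
def ladder_delay(n_stages: int, r_pip: float, c_seg: float) -> float:
    """Elmore delay at the far end of n identical stages.

    Assumption: each stage is a series resistance r_pip (an unbuffered PIP)
    followed by a shunt capacitance c_seg (one short segment).
    Units are arbitrary; only the growth trend is of interest.
    """
    delay = 0.0
    for i in range(1, n_stages + 1):
        # The i-th resistor charges every segment at or beyond stage i.
        downstream_c = (n_stages - i + 1) * c_seg
        delay += r_pip * downstream_c
    return delay


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, ladder_delay(n, r_pip=1.0, c_seg=1.0))
```

Under these assumptions the delay equals r_pip * c_seg * n * (n + 1) / 2, so doubling the number of concatenated segments more than doubles the delay of the path.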
The direct connections first described by Carter and included in the architecture of Kean provide fast paths between adjacent logic blocks, but in Carter's structure general interconnect must still be used to traverse the distance between any two blocks that are not adjacent. Therefore, circuits large enough or complex enough to require interconnecting signals between non-adjacent blocks (which frequently occur) must use the general interconnect to make these connections. For short paths, general interconnect is slower than direct interconnect, because general interconnect must be connected through several PIPs, or, if long lines are used, must be buffered to accommodate long or heavily loaded signals, introducing delay. Additionally, it is inefficient in terms of silicon area to use long lines for short paths that may be traversing only a few logic blocks, since the long lines can otherwise be used for longer paths. Further, since software that implements a logic design in an FPGA typically places interconnected logic in close proximity, structures that take advantage of this placement strategy will work well with the software, resulting in shorter compilation times for routing software and more efficient circuit implementations.
Interconnect lines called "quad lines" are included in the XC4000EX FPGAs from Xilinx, Inc., and described on pages 4-32 through 4-37 of the Xilinx 1996 Data Book entitled "The Programmable Logic Data Book", available from Xilinx, Inc., 2100 Logic Drive, San Jose, Calif. 95124, which are incorporated herein by reference. (Xilinx, Inc., owner of the copyright, has no objection to copying these and other pages referenced herein but otherwise reserves all copyright rights whatsoever.) However, since each quad line contacts every tile that it traverses, these lines have a large number of PIPs, each of which adds RC delay.
Pierce et al provide fast paths between both adjacent logic blocks and logic blocks several tiles apart. The output lines of the Pierce et al architecture can each drive the inputs of a limited set of other logic blocks. However, the possible destinations are limited to selected logic blocks, and the interconnect lines can only access certain specific inputs of the destination logic blocks.
In each of the prior art structures recited above, each interconnect line has programmable connections to the inputs of other logic blocks. However, in the structures of Freeman, Carter, and Pierce et al, a given logic block input can be driven from either horizontal interconnect lines or vertical interconnect lines, but not both. An alternative approach is to separate the interconnect lines from the logic block inputs by way of a routing matrix, which gives each interconnect line more flexible access to the logic block inputs. Such an architecture is described in commonly assigned, co-pending U.S. application Ser. No. 08/618,445 entitled "FPGA Architecture With Repeatable Tiles Including Routing Matrices and Logic Matrices" by Tavana et al, which is referenced above and incorporated herein by reference. In the structure of Tavana et al, most interconnect lines entering the tile connect to a routing matrix within the tile, rather than directly to logic block inputs or outputs. Connections between pairs of interconnect lines and between interconnect lines and logic block inputs are made through lines called "tile interconnect lines" that do not leave the tile. The advantage of having an extra interconnect line in a path from the edge of a tile to the logic block in the tile is that the routing matrix is flexible but consumes a relatively small amount of silicon area. A combination of PIPs can allow access from any line entering the tile to any desired input of a destination logic block. Yet the total number of PIPs is smaller than in many other interconnect structures. The disadvantage is that getting on and off the tile interconnect lines inserts a certain amount of delay into the path for each tile traversed. This delay inhibits the fast propagation of signals through the FPGA. Tavana et al have therefore provided long lines connectable to every tile they pass and double-length lines that bypass the tile interconnect lines in one tile. These lines can be used for signals that are traversing one or more tiles without accessing the logic blocks in the traversed tiles.
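The PIP-count advantage of routing through intermediate tile interconnect lines can be illustrated with a simple count. The sketch below compares a fully populated direct connection pattern with a fully populated two-stage pattern. The function names, the full-population assumption, and the line and input counts are hypothetical and are not taken from Tavana et al.

```python
def direct_pips(n_lines: int, n_inputs: int) -> int:
    """PIPs needed if every entering line can reach every input directly."""
    return n_lines * n_inputs


def two_stage_pips(n_lines: int, n_tile_lines: int, n_inputs: int) -> int:
    """PIPs needed if every entering line can reach every intermediate
    tile interconnect line, and every such line can reach every input.

    Assumption: both stages are fully populated.
    """
    return n_lines * n_tile_lines + n_tile_lines * n_inputs


if __name__ == "__main__":
    # Hypothetical tile: 48 entering lines, 16 logic block inputs,
    # 8 intermediate tile interconnect lines.
    print("direct:   ", direct_pips(48, 16))
    print("two-stage:", two_stage_pips(48, 8, 16))
```

In this hypothetical tile the two-stage pattern needs fewer PIPs, but every connection then passes through two PIPs in series instead of one, which corresponds to the added delay noted above.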
Kean separates the interconnect lines from the logic block inputs using input multiplexer switches, which provide routing flexibility to the inputs.
Since the slowest signal path between logic blocks typically determines the performance of a circuit, it is advantageous to make the slowest path as fast as possible. One way to accomplish this is to design the interconnect structure such that there is a relatively uniform delay on all signal paths throughout an FPGA. In the above routing structures, a typical distribution of delays on signal paths shows a few signal paths with significantly greater delay than the average. These signal paths are typically those with large "RC trees", i.e., signal paths which traverse a resistor (such as an unbuffered PIP), then have a large capacitance on the destination side of the resistor. An interconnect structure with relatively uniform delay could be better realized if large capacitances on a signal path (e.g., longer interconnect lines) were predictably placed on the source side of the resistor, or as close as possible to the source end of the signal path.
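The effect of placing a large capacitance on the source side rather than the destination side of a resistor can be sketched with the first-order Elmore delay model. The model choice, the function name elmore_delay, and all component values below are assumptions for illustration, in arbitrary units; they are not measurements of any device discussed here.

```python
from typing import Sequence, Tuple


def elmore_delay(stages: Sequence[Tuple[float, float]]) -> float:
    """Elmore delay at the last node of an RC chain.

    stages: (series resistance, capacitance at the node after it),
    listed from source to sink.
    """
    delay = 0.0
    upstream_r = 0.0
    for r, c in stages:
        upstream_r += r          # total resistance from source to this node
        delay += upstream_r * c  # this node's capacitance charges through it
    return delay


# Hypothetical values (arbitrary units).
R_DRIVER = 1.0   # buffered driver output resistance
R_PIP = 2.0      # unbuffered PIP (pass transistor)
C_LONG = 8.0     # long interconnect line
C_SHORT = 1.0    # short interconnect line

# Long line on the source side of the PIP, short line after it.
source_side = elmore_delay([(R_DRIVER, C_LONG), (R_PIP, C_SHORT)])
# Short line first, long line on the destination side of the PIP.
destination_side = elmore_delay([(R_DRIVER, C_SHORT), (R_PIP, C_LONG)])

if __name__ == "__main__":
    print("large C on source side:     ", source_side)
    print("large C on destination side:", destination_side)
```

With these values the same two lines and the same PIP yield a smaller delay when the long line precedes the PIP, because the large capacitance is then charged through the driver resistance alone.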
High-fanout signals have large capacitance and are often slower than low-fanout signals. Prior art routing structures had high-fanout signal routing with relatively large RC delay. An interconnect structure should ideally provide high-fanout signal routing with a delay comparable to that of other signals.
It is therefore desirable to find an interconnect structure that allows: 1) uniformly fast propagation of signals, including high-fanout signals, throughout the FPGA; 2) implementation of localized circuits in non-adjacent as well as adjacent blocks using fast paths; 3) ease of use by software; 4) efficient implementation of commonly used logic functions; and 5) a high degree of routing flexibility per silicon area consumed.
One method of improving the performance of localized circuits is to provide feedback paths from the outputs of a given logic block to the inputs of the same logic block. Such fast feedback paths are useful to speed up combinational logic spanning successive function generators in the same CLE. One such feedback path is implemented in the ORCA™ OR2C FPGAs from Lucent Technologies Inc. ("ORCA" is a trademark owned by Lucent Technologies, Inc.) The ORCA logic block is described in pages 2-9 through 2-28 of the Lucent Technologies October 1996 Data Book entitled "Field-Programmable Gate Arrays", available from Microelectronics Group, Lucent Technologies Inc., 555 Union Boulevard, Room 30L-15P-BA, Allentown, Pa. 18103, which are incorporated herein by reference. FIG. 1A shows a simplified diagram of ORCA OR2C logic block 100 with output multiplexer 101. FIG. 1B shows the programmable feedback paths provided for logic block 100 of FIG. 1A. The feedback paths extend from logic block 100 outputs O4, O3, O2, O1, O0 to inputs A4, A3, A2, A1, A0, B4, B3, B2, B1, B0, C0, WD3, WD2, WD1, WD0 of the same logic block. For example, one such feedback path extends from output O0 through output line 102, PIP 103, line 104, buffer 105, line 106, PIP 107, and line 108 to logic block input A0. In the ORCA OR2C device, the outputs of the output multiplexer in the logic block feed back to the logic block inputs.
A feedback path from a Configurable Logic Element (CLE), through an output multiplexer, and back into the CLE through an input multiplexer is incorporated in the XC5200 family of FPGAs from Xilinx, Inc. The XC5200 family feedback path is described in pages 4-192 and 4-193 of the Xilinx 1996 Data Book entitled "The Programmable Logic Data Book", available from Xilinx, Inc., 2100 Logic Drive, San Jose, Calif. 95124, which are incorporated herein by reference.
The ORCA OR2C and the XC5200 family have the advantage of added flexibility gained by routing feedback paths through the output multiplexer. However, this approach also has an associated speed penalty caused by the additional delay of passing through the output multiplexer.
Another feedback technique is described in pages 4-32 through 4-37 of the Xilinx 1996 Data Book entitled "The Programmable Logic Data Book", available from Xilinx, Inc., 2100 Logic Drive, San Jose, Calif. 95124, which are incorporated herein by reference. This technique is used in the XC4000EX family of FPGAs, as shown in FIG. 27 on page 4-34 of the Xilinx 1996 Data Book. The feedback paths exit the XC4000EX tile on lines labeled "DIRECT" and reenter the tile on lines labeled "FEEDBACK" to complete the fast feedback paths. However, the XC4000EX CLE (labeled "CLB" in FIG. 27) does not include an output multiplexer. (The term "output multiplexer" as used herein means more than two multiplexers each generating a single logic block output, where each multiplexer has as inputs more than two function generator outputs.)
Yet another feedback technique is used in the FLEX 10K™ FPGA from Altera Corporation, as disclosed in pages 31-53 of the "FLEX 10K Embedded Programmable Logic Family Data Sheet" from the Altera Digital Library 1996, available from Altera Corporation, 2610 Orchard Parkway, San Jose, Calif. 95134-2020, which are incorporated herein by reference. ("FLEX 10K" is a trademark owned by Altera Corporation.) In the FLEX 10K logic block, eight feedback paths are provided in a logic block with eight 4-input function generators. Therefore, it is impossible for each of the function generator outputs to simultaneously drive one input of each of the function generators. Software mapping of logic into function generators is thus complicated by the need to place logic in a particular function generator having a feedback path to related logic in another particular function generator within the same logic block. When two or more function generators are feeding a single function generator in the same logic block, placement of logic into specific function generators of the FLEX 10K logic block may be required. If a sufficiently large number of PIPs is provided in the FLEX 10K input multiplexer, this limitation can be overcome. However, the Altera solution carries an implicit trade-off between a large number of PIPs (and a resulting larger silicon area) and placement software complexity.