This invention relates to programmable integrated circuit devices. More specifically, the present invention relates to field programmable gate arrays (FPGAs).
An FPGA is a type of programmable logic device (PLD) that can be configured to perform various logic functions. An FPGA includes an array of configurable logic blocks (CLBs) connectable via programmable interconnect structures. For example, a first FPGA, invented by Freeman, is described in U.S. Pat. No. RE34,363. CLBs and interconnect structures in FPGAs are shown in U.S. Pat. No. 5,889,411 issued to Chaudhary et al. and pages 4-32 through 4-37 of the Xilinx 1996 Data Book entitled “The Programmable Logic Data Book” available from Xilinx, Inc., 2100 Logic Drive, San Jose, Calif. 95124. The Freeman reference, the Chaudhary reference, and the Data Book are incorporated herein by reference.
In addition to the structures discussed above, FPGAs also include structures for performing special functions. In particular, FPGAs include carry circuits and lines for connecting the carry output of one bit generated in one CLB to the carry input of another CLB, and cascade lines for allowing wide functions to be generated by combining several adjacent CLBs. Carry structures are discussed by Hsieh et al. in U.S. Pat. No. 5,267,187 and by New in U.S. Pat. No. 5,349,250.
Cascade structures are discussed by Goetting et al. in U.S. Pat. No. 5,365,125 and by Chiang et al. in U.S. Pat. No. 5,357,153. These patents are also incorporated herein by reference. Structures for multiplexing lookup table outputs to form very wide functions are discussed by Bauer and Young in U.S. Pat. No. 6,323,682 (application Ser. No. 09/574,534), also incorporated herein by reference.
As discussed by the above-incorporated references, each CLB may include one or more slices (“slice” or “CLB slice”). Each slice, in turn, includes at least one configurable function generator. The configurable function generator is typically implemented as a four-input lookup table (LUT). The incorporated references also point out that the carry circuits and cascade structures increase the speed at which the FPGA can perform certain functions, such as arithmetic functions.
FIG. 1A is a simplified block diagram of a conventional CLB 100. The illustrated CLB 100 includes a first slice 110 and a second slice 120. First slice 110 includes a first function generator G 112, a second function generator F 114, a third function generator C1 116, and an output control block 118. Output control block 118 may include multiplexers, flip-flops, or both. Four independent input terminals are provided to each of the G and F function generators 112 and 114. A single input terminal C1-in is provided to third function generator C1 116. Each of function generators 112 and 114 is typically implemented as a four-input LUT, and is capable of implementing any arbitrarily defined Boolean function of its input signals. Each of the input terminals may be assigned a number or a letter and referred to as a “literal.” For example, in CLB 100, function generator 112 receives four input signals, or literals, G1, G2, G3, and G4. Function generator 116, typically implemented as a set of configurable multiplexers, is often used to handle carry bits, but can implement some Boolean functions of its three input signals C1-in, G′, and F′. These Boolean functions include bypass, inverter, 2-input AND (product), and 2-input OR (sum). Signals G′, F′, and C1-out are multiplexed through output control block 118. Output control block 118 provides output signal lines Y, QY, X, and QX. Slice 110 may also provide the carry out signal, C1-out. Second slice 120 is similar to first slice 110. The carry out signal from second slice 120, C2-out, is the carry-in signal C1-in of first slice 110.
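The behavior of a four-input LUT such as function generators 112 and 114 can be illustrated with a minimal software sketch. This is only a model under stated assumptions: the names make_lut4 and lut4_read are hypothetical, and the choice of G1 as the least significant address bit is an arbitrary convention, not something taken from the incorporated references.

```python
from itertools import product

def make_lut4(func):
    """Build the 16-entry truth table (the LUT contents) for a 4-input function.

    Assumed convention: entry index = G4*8 + G3*4 + G2*2 + G1.
    """
    return [int(bool(func(g1, g2, g3, g4)))
            for g4, g3, g2, g1 in product((0, 1), repeat=4)]

def lut4_read(table, g1, g2, g3, g4):
    """Evaluate the LUT: the four literals form a 4-bit address into the table."""
    address = (g4 << 3) | (g3 << 2) | (g2 << 1) | g1
    return table[address]

# Hypothetical example function: (G1 AND G2) XOR (G3 OR G4).
example = lambda g1, g2, g3, g4: (g1 & g2) ^ (g3 | g4)
table = make_lut4(example)
```

Because the table holds one bit per input combination, any of the 65,536 possible functions of four literals can be stored simply by changing the 16 table entries.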
Operation of CLB 100 is also described by the incorporated references, and, in particular, in chapters seven and eight of the above-incorporated Data Book. For simplicity, CLB 100 of FIG. 1A is illustrated with two slices; however, the number of slices constituting a CLB is not limited to two.
FIG. 1B is a simplified block diagram of another conventional CLB 100a. CLB 100a is similar to CLB 100 of FIG. 1A but has an additional LUT 113. LUT 113 takes the outputs of LUTs 112 and 114 as well as another input K1 to slice 110a. Thus, LUT 113 allows slice 110a to implement certain Boolean functions of up to nine literals G1, G2, G3, G4, F1, F2, F3, F4, and K1, namely those that decompose into a function of the two four-input LUT outputs and K1. CLB 100a may include additional slices represented by ellipses 120a.
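The arrangement of slice 110a can be sketched as three chained table lookups. The following is an illustrative model only: the helper names lut, parity_table, and slice_110a are hypothetical, and nine-input parity is chosen here simply as one function that happens to decompose across the two four-input LUTs and the three-input LUT 113.

```python
def lut(table, *inputs):
    """Generic LUT read: the first input is the least significant address bit."""
    address = 0
    for bit in reversed(inputs):
        address = (address << 1) | bit
    return table[address]

def parity_table(n):
    """Truth table of n-input XOR (odd parity), indexed by address."""
    return [bin(address).count("1") & 1 for address in range(2 ** n)]

PARITY4 = parity_table(4)  # assumed contents of LUTs 112 and 114
PARITY3 = parity_table(3)  # assumed contents of LUT 113

def slice_110a(g, f, k1):
    """Model of slice 110a: LUT 113 combines G', F', and K1."""
    g_out = lut(PARITY4, *g)   # G' from literals G1..G4
    f_out = lut(PARITY4, *f)   # F' from literals F1..F4
    return lut(PARITY3, g_out, f_out, k1)
```

For instance, slice_110a((1, 0, 0, 0), (1, 1, 0, 0), 1) evaluates the parity of nine literals in a single slice.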
Technology mapping for LUT-based FPGAs involves decomposition of a circuit into combinational logic whose nodes have functions of at most four inputs (a “fan-in” of four) that can be realized in the LUTs of CLB slices. This is because, as shown in slice 110, the slices commonly include 4-input LUTs as their function generators. By conventionally specifying the functions of function generators F, G, and C1, and output control block 118, slice 110 can be programmed to implement various functions including, without limitation, two independent functions of up to four variables each.
Circuit designs are mapped to FPGAs as combinational and sequential logic. The combinational logic may be expressed in Boolean expressions including a number of logic levels and routing between the logic levels. The Boolean expressions include product (logical AND) and sum (logical OR) operations. Two levels of combinational logic may be expressed using sum-of-products (SOP) format. In fact, given a set of inputs and their inverses, any logic equation can be expressed using the SOP format.
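The statement that any logic equation can be expressed in SOP format can be illustrated by deriving the canonical (minterm) expansion from a truth table. The sketch below is illustrative only; the function names minterms, eval_sop, and format_sop are hypothetical, and the canonical expansion is generally not the smallest SOP form.

```python
from itertools import product

def minterms(func, n):
    """Return every input combination for which func is 1.

    Each combination corresponds to one product (AND) term of the SOP form.
    """
    return [bits for bits in product((0, 1), repeat=n) if func(*bits)]

def eval_sop(terms, inputs):
    """Evaluate the sum (OR) of product terms.

    A term matches when every input equals the bit recorded in the term,
    i.e. the input is used directly for a 1 and inverted for a 0.
    """
    return int(any(all(x == want for x, want in zip(inputs, term))
                   for term in terms))

def format_sop(terms, names):
    """Render the terms as text, marking an inverted literal with a quote mark."""
    if not terms:
        return "0"
    return " + ".join(
        "".join(name if bit else name + "'" for name, bit in zip(names, term))
        for term in terms)

# Two-input XOR expressed as a sum of two product terms.
xor_terms = minterms(lambda a, b: a ^ b, 2)
print(format_sop(xor_terms, "ab"))  # prints a'b + ab'
```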
In the FPGA art, there is a continuing challenge to increase speed (performance) of FPGA-implemented functions, or circuits. Circuit performance, or speed, is increased when circuit delay is decreased. Circuit delay includes two main components: logic delay and routing delay.
Using logical axioms and Boolean algebraic rules, it is possible to partially collapse a circuit design to reduce the number of logic levels, thus reducing the routing delay. However, this creates wide fan-in nodes. In FPGAs having four-input LUTs, wide fan-in nodes require use of several levels of LUTs for implementation. Therefore, to implement wide fan-in nodes, multiple levels of CLBs must be used. The requirement to use multiple levels of CLBs increases the logic delay and introduces additional routing delay. These negative effects cancel out the benefits from the routing delay reduction provided by the partial collapse of the circuit design.
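The growth in LUT levels with fan-in can be estimated with a short calculation. The sketch below assumes the wide node is an associative function (such as a wide AND or OR) built as a tree of k-input LUTs; the function name lut_tree is hypothetical, and the LUT count reflects one simple packing rather than an optimized mapping.

```python
def lut_tree(fan_in, k=4):
    """Return (levels, lut_count) for reducing fan_in signals to one output
    with a tree of k-input LUTs, assuming an associative function."""
    levels = 0
    luts = 0
    signals = fan_in
    while signals > 1:
        full, rem = divmod(signals, k)
        # A group holding a single leftover signal passes through without a LUT.
        luts += full + (1 if rem > 1 else 0)
        signals = full + (1 if rem else 0)
        levels += 1
    return levels, luts

for n in (4, 9, 16, 64):
    print(n, lut_tree(n))
```

Under these assumptions a 16-input node needs two levels of four-input LUTs and a 64-input node needs three, which is the additional logic and routing delay described above.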
Accordingly, there is a need for a method to implement wide fan-in nodes in FPGAs while avoiding the negative effects described above. Additionally, there is a need for CLB and CLB slice designs that allow for fast implementation of wide fan-in SOP functions.
According to one aspect of the invention, a CLB has two or more slices, each slice having an output. The CLB also includes a second-level circuit for combining the outputs from the slices.
According to another aspect of the invention, a CLB has at least one slice. The slice has at least two configurable function generators receiving a plurality of inputs and generating, together, a first output. The slice also includes a combining gate for combining the first output with a combining gate input to generate a combining gate output, wherein the combining gate input is an input to the slice and wherein the combining gate output is an output of the slice.
According to a further aspect of the invention, a CLB has at least one slice. The slice has a first configurable function generator generating a first output, a second configurable function generator generating a second output, and a dedicated function generator for receiving the first output and the second output to generate a dedicated output. The dedicated function generator includes a first logic gate with an output, a second logic gate with an output, and a multiplexer allowing selection between the two logic gate outputs.
According to yet another aspect of the invention, a CLB has two or more slices. Each of the slices has a first configurable function generator generating a first output, a second configurable function generator generating a second output, and a dedicated function generator for receiving the first output and the second output to generate a dedicated output. The dedicated function generator includes a first logic gate and a second logic gate. The CLB also has a second-level circuit for combining the dedicated outputs from its slices.
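The structure described in the preceding two aspects can be sketched in software as follows. This is a sketch under explicit assumptions, not the disclosed circuit: the summary does not name the gate types, so the first logic gate is assumed here to be an AND gate, the second an OR gate, and the second-level circuit an OR of the dedicated outputs, which would suit the SOP functions discussed above. All function and parameter names are hypothetical.

```python
def dedicated_function_generator(first_output, second_output, use_second_gate):
    """Combine the two function generator outputs of one slice."""
    and_out = first_output & second_output  # assumed first logic gate (AND)
    or_out = first_output | second_output   # assumed second logic gate (OR)
    # Multiplexer selecting between the two logic gate outputs.
    return or_out if use_second_gate else and_out

def clb_output(slice_outputs, use_second_gate=False):
    """Combine the dedicated outputs of all slices in a CLB.

    slice_outputs is a list of (first_output, second_output) pairs, one per
    slice. The second-level circuit is assumed to be an OR gate.
    """
    dedicated = [dedicated_function_generator(g, f, use_second_gate)
                 for g, f in slice_outputs]
    return int(any(dedicated))
```

With the AND gate selected in each slice, every slice contributes one product term of up to eight literals, and the assumed second-level OR forms their sum without passing through another level of LUTs.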
Other aspects and advantages of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.