Programmable logic devices such as field programmable gate arrays (FPGAs) include a number of programmable logic blocks that are interconnected by a programmable interconnect, also referred to as a routing structure. Each programmable logic block generally includes a number of lookup tables (LUTs). During FPGA configuration for logic use, a user programs the truth table for each lookup table to implement a desired logical function. The core unit of a programmable logic block is a LUT-register combination often denoted as a “logic cell” as seen in FIG. 1. A four-input (16 bit) LUT 100 receives LUT inputs A through D from the routing structure (not illustrated). Based upon the truth table programmed into LUT 100 during configuration of the corresponding FPGA, a combinatorial output 105 is “looked up” as determined by the state of inputs A through D. Output 105 may also be provided as a sequential output 110 after registration in a register 120. Register 120 may also register a data input 125 through appropriate selection in a multiplexer 130.
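By way of illustration only, the lookup operation of the logic cell of FIG. 1 may be modeled in software as follows. This is a minimal behavioral sketch rather than a description of any particular device: the names `lut4` and `LogicCell`, the treatment of input A as the least significant address bit, and the `select_data` argument standing in for the selection made by multiplexer 130 are all assumptions introduced here.

```python
def lut4(truth_table: int, a: int, b: int, c: int, d: int) -> int:
    """Return the bit of a 16-bit truth table addressed by inputs A through D."""
    # Assumed bit ordering: A is the least significant address bit, D the most.
    index = a | (b << 1) | (c << 2) | (d << 3)
    return (truth_table >> index) & 1


class LogicCell:
    """Behavioral model of one LUT-register pair (hypothetical names)."""

    def __init__(self, truth_table: int) -> None:
        self.truth_table = truth_table & 0xFFFF  # programmed at configuration
        self.q = 0                               # register state (sequential output)

    def combinatorial(self, a: int, b: int, c: int, d: int) -> int:
        # Combinatorial output: a pure lookup on the current inputs.
        return lut4(self.truth_table, a, b, c, d)

    def clock(self, a: int, b: int, c: int, d: int,
              data_in: int = 0, select_data: bool = False) -> int:
        # On a clock edge the register captures either the LUT output or the
        # separate data input, depending on the multiplexer selection.
        self.q = data_in if select_data else self.combinatorial(a, b, c, d)
        return self.q
```

Under the assumed bit ordering, a truth table of 0x8000 corresponds to a four-input AND, since only the entry addressed by A = B = C = D = 1 is set.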
The core LUT/register logic cell combination discussed with respect to FIG. 1 may be organized into what is commonly referred to as a “slice.” The bit size of a slice depends upon the number of LUT/register combinations it contains. For example, slice 200 illustrated in FIG. 2 contains two LUT/register combinations 205a and 205b and is thus a two-bit slice. Within slice 200, registers 120a and 120b share common set and reset signals 126. Clock and clock enable signals 210 are likewise common to both registers. A multiplexer 140 selects from combinatorial output signals 105a and 105b (from LUTs 100a and 100b, respectively) to provide an output signal FX0 160. Because output signals 105a and 105b are both “LUT4” outputs, LUTs 100a and 100b being 4-input LUTs, and because the select signal for multiplexer 140 acts as a fifth input, FX0 160 may represent a 5-input LUT (LUT5) output signal.
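The formation of a LUT5 output from two LUT4 outputs and a multiplexer may be sketched as follows. The sketch assumes that both LUTs receive the same inputs A through D and that the multiplexer select signal serves as the fifth input; the function names and the choice of which LUT is selected when the select signal is low are assumptions made for illustration.

```python
def lut4(truth_table: int, a: int, b: int, c: int, d: int) -> int:
    """Return the bit of a 16-bit truth table addressed by inputs A through D."""
    index = a | (b << 1) | (c << 2) | (d << 3)  # assumed ordering: A is the LSB
    return (truth_table >> index) & 1


def lut5_via_mux(table_low: int, table_high: int,
                 a: int, b: int, c: int, d: int, sel: int) -> int:
    """Model an FX0-style output: a 2:1 multiplexer over two LUT4 outputs."""
    out_a = lut4(table_low, a, b, c, d)    # analogous to output 105a
    out_b = lut4(table_high, a, b, c, d)   # analogous to output 105b
    # Assumed polarity: sel = 0 passes the first LUT, sel = 1 the second.
    return out_b if sel else out_a


# Example: five-input parity. The first LUT holds four-input parity (0x6996)
# and the second holds its complement (0x9669).
PARITY4 = 0x6996
PARITY4_INVERTED = 0x9669
```

In this example each LUT4 holds one half of the 32-entry truth table of the five-input function, and the select signal chooses between the two halves.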
As seen in FIG. 3, a programmable logic block 300 may include a plurality of slices 200 such as slice-0 through slice-3. The bit size of the slices is arbitrary—for example, rather than use two-bit slices, programmable logic block 300 may include four-bit slices. As known in the art, various interconnections exist amongst slices 200 within programmable logic block 300. For example, as seen in FIG. 2, slice 200 may include a multiplexer 150 that selects from input signals FXA and FXB to provide an output signal FX1 170. In addition, a carry chain 180 couples across LUTs 100a and 100b. Carry chain 180 extends across all the LUTs (not illustrated) within slices 200 of programmable logic block 300 as well. To allow the formation of LUT6, LUT7, and LUT8 output signals (corresponding to the output signal of a 6-input LUT, a 7-input LUT, and an 8-input LUT, respectively), output signals FX1 and FX0 from each slice may couple back as inputs FXA and FXB in various fashions.
For example, in slice 0 (FIG. 3), input signal FXA is received as the FX0 output signal from slice 1 whereas input signal FXB is received as the slice 0 FX0 output signal. Because each FX0 output signal may be a LUT5 output signal as discussed with regard to FIG. 2, the FX1 output signal from slice 0 may thus be a LUT6 output signal. An analogous situation exists for slice 2 in that its FXA input signal is received as the FX0 output signal from slice 3 whereas its FXB input signal may be the FX0 output signal from slice 2. Thus, the FX1 output signal from slice 2 may also be a LUT6 output signal. Note, however, that the FXA input signal for slice 1 is received as the FX1 output signal from slice 2. Similarly, the FXB input signal for slice 1 is received as the FX1 output signal from slice 0. Because both of these input signals may be LUT6 signals, the FX1 output signal from slice 1 may be a LUT7 signal. To allow for the formation of a LUT8 output signal, slice 3 receives as its FXA input signal a LUT7 signal cascaded from another programmable logic block (not illustrated). Slice 3 also receives as its FXB input signal the FX1 output signal (LUT7) from slice 1. Thus, the FX1 output signal from slice 3 may be a LUT8 output signal. It will be appreciated that other types of interconnections exist between slices but are not shown for illustration clarity.
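The cascading described above, in which each multiplexer stage adds one input to the effective LUT width, may be sketched in software as a binary tree of 2:1 multiplexers over LUT4 outputs. The sketch abstracts away the specific slice-to-slice wiring of FIG. 3. It assumes that every LUT4 receives the same four inputs and that each additional input drives the select signals of one multiplexer level; the names `wide_lut` and `mux2` and the ordering of tables and select inputs are hypothetical.

```python
def lut4(truth_table: int, inputs) -> int:
    """Look up one bit of a 16-bit truth table from four input bits."""
    a, b, c, d = inputs
    index = a | (b << 1) | (c << 2) | (d << 3)  # assumed ordering: A is the LSB
    return (truth_table >> index) & 1


def mux2(in0: int, in1: int, sel: int) -> int:
    """2:1 multiplexer, as used for FX0 and FX1 style outputs."""
    return in1 if sel else in0


def wide_lut(tables, inputs) -> int:
    """Evaluate an n-input function built from 2**(n-4) LUT4 truth tables.

    The first four inputs feed every LUT4. Each further input selects
    between adjacent pairs at one multiplexer level:
    LUT4 -> LUT5 -> LUT6 -> LUT7 -> LUT8.
    """
    n = len(inputs)
    if n < 4 or len(tables) != 2 ** (n - 4):
        raise ValueError("need 2**(n-4) truth tables for n >= 4 inputs")
    level = [lut4(t, inputs[:4]) for t in tables]
    for sel in inputs[4:]:
        # Each pass halves the number of signals and widens the function by one input.
        level = [mux2(level[i], level[i + 1], sel)
                 for i in range(0, len(level), 2)]
    return level[0]
```

Under these assumptions, a LUT8 output draws on sixteen LUT4 truth tables and four multiplexer levels, which is consistent with slice 3 requiring a LUT7 signal cascaded from another programmable logic block.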
In certain FPGA designs, the bit size of the slice encompasses the entire programmable logic block in what may be denoted as a block-based approach, such that all registers in a block-based programmable logic block receive common control signals. Regardless of the bit size used for the slices, it may be seen from examination of FIG. 2 that a one-to-one correspondence exists between LUTs 100 and registers 120 within each logic cell. The symmetry resulting from this one-to-one arrangement has advantages such as ease of use, and synthesis, mapping, and place-and-route tools have been optimized in view of it. However, it has been observed that the register-to-LUT usage ratio for the vast majority of user designs ranges from 40% to 60%. A fixed one-to-one LUT-register ratio thus often leaves a substantial fraction of the registers unused. This waste of register resources leads to silicon die inefficiency and thus higher manufacturing costs.
Accordingly, there is a need in the art for improved programmable logic block architectures that provide a more efficient use of die area.