This invention relates to programmable logic devices, and in particular to configurable logic blocks of field programmable gate arrays.
FIG. 1(A) is a simplified diagram showing a basic Field Programmable Gate Array (FPGA) 100, which is a type of Programmable Logic Device (PLD). FPGA 100 includes an array of configurable logic blocks (CLBs) CLB-1,1 through CLB-4,4 surrounded by input/output blocks (IOBs) IOB-1 through IOB-16, and programmable interconnect resources that include vertical interconnect segments 120 and horizontal interconnect segments 121 extending between the rows and columns of CLBs and IOBs. Each CLB includes configurable combinational circuitry and optional output registers programmed to implement a portion of a user's logic function. The interconnect segments of the programmable interconnect resources are configured using various switches to generate signal paths between the CLBs that link the logic function portions. Each IOB is similarly configured to selectively utilize an associated pin (not shown) of FPGA 100 either as a device input pin, a device output pin, or an input/output pin. Although greatly simplified, FPGA 100 is generally consistent with FPGAs that are produced by Xilinx, Inc. of San Jose, Calif.
FIGS. 1(B) through 1(D) are simplified diagrams showing examples of the various switches associated with the programmable interconnect resources of FPGA 100. FIG. 1(B) shows an example of a six-way segment-to-segment switch 122 that selectively connects vertical wiring segments 120(1) and 120(2) and horizontal wiring segments 121(1) and 121(2) in accordance with configuration data stored in memory cells M1 through M6. Alternatively, if vertical and horizontal wiring segments 120 and 121 do not break at an intersection, a single transistor makes the connection. FIG. 1(C) shows an example of a segment-to-CLB/IOB input switch 123 that selectively connects an input wire 110(1) of a CLB (or IOB) to one or more interconnect wiring segments in accordance with configuration data stored in memory cells M7 and M8. FIG. 1(D) shows an example of a CLB/IOB-to-segment output switch 124 that selectively connects an output wire 115(1) of a CLB (or IOB) to one or more interconnect wiring segments in accordance with configuration data stored in memory cells M9 through M11.
Since the first FPGA was invented in the 1980s, variations on the basic FPGA circuitry have been devised that allow FPGAs to implement specialized functions more efficiently. For example, special interconnect lines have been added to allow adjacent CLBs to be connected at high speed and without taking up general interconnect lines. In addition, hardware has been placed between adjacent CLBs that allows fast carry signal transmissions when an FPGA is configured to implement an arithmetic function or certain wide logic functions. Finally, the circuitry associated with the CLBs has undergone several changes that allow each CLB to implement specialized functions more efficiently. Such CLB modifications are particularly relevant to the present invention.
FIG. 2(A) is a simplified schematic diagram showing a prior art CLB 200 used in the XC4000™ series of FPGAs produced by Xilinx, Inc. CLB 200 includes a first four-input lookup table (LUT) F, a second four-input LUT G, a three-input LUT H, a set of LUT output multiplexers (MUXes) 210, optional output registers FF-1 and FF-2, and additional circuitry for routing signals within CLB 200. LUT F receives data input signals F1 through F4 that are transmitted from the interconnect resources of the FPGA. Similarly, LUT G receives data input signals G1 through G4. The operation of LUTs F and G is described in detail below. In addition to the eight data input signals F1 through F4 and G1 through G4, CLB 200 receives a clock signal CLK, and data/control signals H1, DIN/H2, SR/H0, and EC. By selectively configuring the various programmable elements associated with CLB 200, CLB 200 generates output signals in response to the data and control signals that are consistent with an assigned portion of a user's logic function.
FIG. 2(B) is a diagram showing a circuit that can implement four-input LUTs F and G in CLB 200. Each four-input LUT includes a memory circuit 230 having sixteen memory bits M0 through M15 and a MUX structure 240. The programmed state of each of memory bits M0 through M15 is transmitted to MUX structure 240 on lines 235. MUX structure 240 selectively passes the programmed state of one of the memory bits to output terminal 245 in response to the four input signals (either F1 through F4 or G1 through G4). Functionally described, MUX structure 240 includes a series of two-input MUXes controlled by the four input signals. Each combination of four input signals produces a unique address that causes the LUT to output the contents of one of memory bits M0 through M15 of memory circuit 230.
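The addressing behavior described above can be illustrated with a short behavioral sketch. It is a software model only, not the circuit of FIG. 2(B): the function name lut4, the choice of the first input as the least significant address bit, and the example truth tables are assumptions made for illustration.

```python
def lut4(memory_bits, in1, in2, in3, in4):
    """Behavioral model of a four-input LUT built from two-input MUX stages.

    memory_bits holds the programmed states of M0 through M15.
    Assumption: in1 is the least significant address bit.
    """
    if len(memory_bits) != 16:
        raise ValueError("a four-input LUT needs exactly 16 memory bits")
    candidates = list(memory_bits)
    for select in (in1, in2, in3, in4):
        # One MUX stage: each two-input MUX keeps one bit of a neighboring pair,
        # so the number of candidate bits is halved at every stage.
        candidates = [
            candidates[i + 1] if select else candidates[i]
            for i in range(0, len(candidates), 2)
        ]
    return candidates[0]


# Example configurations (hypothetical): a four-input AND and a four-input parity.
AND_BITS = [0] * 15 + [1]
PARITY_BITS = [bin(address).count("1") % 2 for address in range(16)]
```

After four stages a single bit remains, namely the memory bit whose index equals the address formed by the four input signals, which is why the signal passes through four MUXes on its way to the output terminal.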
FIG. 2(C) is a simplified circuit diagram showing memory bit M0 of memory circuit 230 (see FIG. 2(B)). Memory bit M0 includes first and second inverters connected end-to-end to form a latch 231 that is connected to BIT and BIT_b (inverted bit) lines via pass transistors 232 and 233, respectively, and a third inverter 234 that is connected between latch 231 and the output line 235-1. Pass transistors 232 and 233 are controlled by a WRITE control line. During a configuration mode, the WRITE line is pulled high and data is transmitted to the latch via the BIT and BIT_b lines. During subsequent operation, the data bit stored by the latch is transmitted through the third inverter 234 and applied to output line 235-1, which transmits the data bit to MUX structure 240.
Four-input LUTs F and G of CLB 200 have proven extremely useful for implementing many logic functions. However, a problem arises when certain large logic functions are implemented that require signal transmission through four or more CLBs.
FIG. 3 is a simplified diagram showing a portion 300 of an FPGA that includes six CLBs. The interconnect resources associated with portion 300 are programmed to provide a signal path 310 for transmitting data signals between selected CLBs. Specifically, signal path 310 defines the transmission path of an input signal transmitted to LUT F of CLB-1,1, the output signal from LUT F of CLB-1,1 that is transmitted to LUT F of CLB-1,2, the output signal from LUT F of CLB-1,2 that is transmitted to LUT G of CLB-2,2, the output signal from LUT G of CLB-2,2 that is transmitted to LUTs G and H of CLB-2,3, and the output signal from LUT H of CLB-2,3.
Signal path 310 represents one of many signal paths typically associated with a user's logic function. Other signal paths are used, for example, to transmit additional input signals to LUT F of CLB-1,1. (These additional signal paths are indicated in an abbreviated manner by the short lines extending from CLB-1,1.) The interconnect resources used by these additional signal paths are not shown, so that signal path 310 is clearly identified.
The various components of the CLBs, IOBs, and interconnect resources of a PLD introduce delays in the signals propagating through the PLD. For example, delays are introduced as the signal passes through the various switches associated with an FPGA (see FIGS. 1(B) through 1(D), discussed above). Even larger delays are typically produced by the propagation of signals through the CLBs of an FPGA. As mentioned above, an output signal from each four-input LUT F/G is passed through four MUXes from a selected memory cell that is addressed by the four input signals. The delay associated with the transmission through the four MUXes of each four-input LUT is approximately 1 nanosecond (ns). Additional delays are subsequently produced by the LUT output MUXes 210.
PLD users often impose timing restrictions on one or more signal paths in a logic function implemented in a target PLD. These timing restrictions, or "constraints", define a maximum period allowed for a signal to propagate along a particular path. A signal path is referred to as a "critical" path if it limits the maximum clock rate of a circuit. Some signals may be transmitted through relatively few CLBs, thereby experiencing a relatively short propagation delay. Conversely, other signals may be transmitted through a relatively large number of CLBs, thereby experiencing a relatively large delay, and one of these signals is often on the critical path. Therefore, it is important to minimize the number of CLBs through which a signal travels along a critical path.
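The relationship between the number of LUTs on a path, its propagation delay, and a timing constraint can be illustrated with a rough calculation. In the sketch below, only the approximately 1 ns LUT delay comes from the description above; the per-stage overhead for switches and LUT output MUXes, the path names, and the function names are hypothetical values chosen for illustration.

```python
LUT_DELAY_NS = 1.0  # approximate delay through the four MUXes of one four-input LUT


def path_delay_ns(num_luts, overhead_per_stage_ns):
    """Estimate path delay as LUT delay plus an assumed per-stage overhead."""
    return num_luts * (LUT_DELAY_NS + overhead_per_stage_ns)


def meets_constraint(num_luts, overhead_per_stage_ns, max_delay_ns):
    """Return True when the estimated delay is within the timing constraint."""
    return path_delay_ns(num_luts, overhead_per_stage_ns) <= max_delay_ns


def find_critical_path(lut_counts, overhead_per_stage_ns):
    """Return the name of the path with the largest estimated delay."""
    return max(
        lut_counts,
        key=lambda name: path_delay_ns(lut_counts[name], overhead_per_stage_ns),
    )


# Hypothetical design: two paths that differ only in the number of LUTs traversed.
EXAMPLE_PATHS = {"short": 2, "long": 5}
EXAMPLE_OVERHEAD_NS = 0.5  # assumed switch and output-MUX delay per stage
```

Under these assumed numbers the five-LUT path is the critical path, and reducing the number of LUTs it traverses is the most direct way to bring it within a tighter constraint.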
One approach to minimizing the propagation delay associated with signal transmission through multiple CLBs is to provide large general-purpose logic circuits that can implement large portions of a user's logic function. As mentioned above, when the CLBs of an FPGA include small logic circuits (e.g., four-input LUTs), a user's logic function must be partitioned into relatively small logic portions that can be implemented in these small logic circuits. Partitioning a large logic function into multiple small logic portions can cause the failure of one or more paths of the logic function to meet the user's timing constraints. By providing large logic circuits, it is possible for place-and-route software to partition the user's logic function into larger logic portions that can be efficiently implemented in the large logic circuits such that propagation delays are minimized.
Large general-purpose logic circuits have been provided in some PLDs in the form of programmable logic array (PLA) or programmable array logic (PAL) circuits. Unlike LUTs, PLA and PAL circuits utilize AND/OR logic arrangements to implement logic functions. While PLA and PAL circuits typically implement wide logic functions faster than LUTs, they are restricted by this AND/OR logic arrangement. In general, a LUT is capable of implementing more complex logic functions than a PLA or PAL circuit having a comparable size.
What is needed is a CLB for an FPGA that allows the implementation of large logic functions using a LUT logic arrangement while utilizing a limited amount of space. What is also needed is a logic/memory circuit for an FPGA that can be operated as either a LUT or a PLA/PAL, thereby allowing a user to selectively implement portions of a logic function in either of these logic circuit types.
The present invention is directed to a multi-purpose logic/memory circuit (LMC) utilized in a configurable logic block (CLB) of a programmable logic device (PLD) that can implement high-capacity lookup table (LUT) operations, RAM operations, and high-speed programmable array logic (PAL) operations using the same array of programmable elements (memory cells). Because the same array of programmable elements is selectively used for LUT, RAM or PAL operations, the LMC of the present invention provides a highly versatile logic circuit that can implement a user's logic function in a highly efficient manner.
In accordance with an aspect of the invention, an LMC implements either an eight-input lookup table (LUT) or a 256-bit RAM using the same array of programmable elements. A first subset of the eight input signals is used to address a word (i.e., 16 programmable elements) stored in one column of the array, and a second subset of input signals is used to pass one or more bits from the selected word to a set of output terminals. The resulting eight-input LUT provides substantially greater capacity than prior art 16-bit LUTs and, therefore, is capable of implementing substantially larger portions of a user's logic function while taking up minimal additional space. Further, because larger logic functions can be implemented in a single eight-input LUT, the propagation delays associated with signal transmissions between multiple 16-bit LUTs can be avoided. Moreover, independent read bit lines are utilized to minimize capacitance during read operations, thereby providing faster operating speeds.
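The two-step addressing described above (one subset of inputs selects a word, another subset selects a bit within it) can be sketched as a behavioral model. The sketch assumes an organization of 16 words of 16 bits, which follows from the 256-bit capacity and 16-element word size but is not otherwise specified here. The class and method names, and the assignment of the upper four inputs to word selection, are hypothetical.

```python
WORD_BITS = 16   # programmable elements per word (from the description)
NUM_WORDS = 16   # assumed: 256 bits / 16 bits per word


def to_index(bits):
    """Pack a sequence of 0/1 values into an integer, first element as LSB."""
    return sum(bit << position for position, bit in enumerate(bits))


class LogicMemoryArray:
    """One array of programmable elements used as a 256-bit RAM or an 8-input LUT."""

    def __init__(self):
        self.words = [[0] * WORD_BITS for _ in range(NUM_WORDS)]

    def ram_write(self, address, value):
        word, bit = divmod(address, WORD_BITS)
        self.words[word][bit] = value & 1

    def ram_read(self, address):
        word, bit = divmod(address, WORD_BITS)
        return self.words[word][bit]

    def lut8(self, inputs):
        if len(inputs) != 8:
            raise ValueError("an eight-input LUT needs exactly 8 input signals")
        selected_word = self.words[to_index(inputs[4:])]  # first subset: word select
        return selected_word[to_index(inputs[:4])]        # second subset: bit select

    def program(self, function):
        """Store the truth table of an eight-input function in the array."""
        for address in range(NUM_WORDS * WORD_BITS):
            inputs = [(address >> i) & 1 for i in range(8)]
            self.ram_write(address, function(inputs))
```

For example, calling program with a function that returns 1 only when all eight inputs are 1 configures the array as an eight-input AND gate, a function that would otherwise have to be split across several four-input LUTs.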
In accordance with another aspect of the invention, a PAL input signal control circuit is used to transmit input signals directly to the write bit lines of the array. These input signals, along with the bit values stored in the programmable elements, are transmitted to product term generation circuitry that generates product terms. The LMC is also provided with a macrocell that generates a sum-of-products term in response to the product terms. The sum-of-products term is selectively transmitted during PAL operations, thereby allowing a user the option of implementing speed-sensitive logic using the high-speed PAL circuitry.
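The product-term and sum-of-products behavior can be illustrated with a conventional PAL-style model. How the input signals on the write bit lines combine with the stored bit values is not detailed in this section, so the sketch assumes the usual arrangement in which stored bits select which input literals (true or complemented) participate in each product term. All names below are hypothetical.

```python
def product_term(inputs, use_true, use_complement):
    """AND together the input literals enabled by the stored bits.

    use_true[i] = 1 includes inputs[i]; use_complement[i] = 1 includes NOT inputs[i].
    Assumption: an input with neither bit set does not affect the term.
    """
    for value, include_true, include_complement in zip(inputs, use_true, use_complement):
        if include_true and not value:
            return 0
        if include_complement and value:
            return 0
    return 1


def sum_of_products(inputs, terms):
    """Macrocell model: OR together the product terms."""
    return int(any(product_term(inputs, t, c) for t, c in terms))


# Example configuration (hypothetical): XOR of two inputs as (a AND NOT b) OR (NOT a AND b).
XOR_TERMS = [
    ([1, 0], [0, 1]),  # a AND NOT b
    ([0, 1], [1, 0]),  # NOT a AND b
]
```

In this model every product term is evaluated in a single AND level followed by one OR level, which reflects why an AND/OR arrangement tends to be faster for wide functions than a chain of LUTs.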
In accordance with another aspect of the invention, an LMC includes a logic/memory array including four columns of programmable elements that are addressed by a hard-wired decoder during write operations. Bit values are read from each programmable element through series pass transistors that are controlled by read address signals generated by the hard-wired decoder, thereby increasing operating speeds during LUT and RAM operations. Further, by limiting the number of programmable elements connected to each read bit line to four, minimal capacitance is applied to the read bit lines, thereby further increasing operating speeds.