The invention relates to Programmable Logic Devices (PLDs) such as Field Programmable Gate Arrays (FPGAs). More particularly, the invention relates to an FPGA logic element having variable-length shift register capability.
Programmable logic devices (PLDs) are a well-known type of digital integrated circuit that may be programmed by a user to perform specified logic functions. One type of PLD, the field programmable gate array (FPGA), typically includes an array of configurable logic blocks (CLBs) surrounded by a ring of programmable input/output blocks (IOBs). The CLBs and IOBs are interconnected by a programmable interconnect structure. The CLBs, IOBs, and interconnect lines are typically programmed by loading a stream of configuration data (bitstream) into internal configuration memory cells that define how the CLBs, IOBs, and interconnect are configured. The configuration data may be read from memory (e.g., an external PROM) or written into the FPGA by an external device. The collective states of the individual memory cells then determine the function of the FPGA.
CLBs typically include both logic elements and storage elements (e.g., flip-flops). Each logic element implements a logic function of the n inputs to the logic element according to how the logic element has been configured. Logic functions may use all n inputs to the logic element or may use only a subset thereof. A few of the possible logic functions that a logic element can be configured to implement are: AND, OR, XOR, NAND, NOR, XNOR and mixed combinations of these functions.
One known implementation of the logic element includes a configurable lookup table that is internal to the logic element. This lookup table includes 2^n individual memory cells, where n is the number of input signals the lookup table can handle. In this architecture, at configuration a bitstream programs the individual memory cells of the lookup table by writing the truth table of the desired function to them.
One memory cell architecture appropriate for use in the lookup tables is shown in FIG. 1 and described by Hsieh in U.S. Pat. No. 4,821,233. A memory cell of this architecture is programmed by applying the value to be written to the memory cell on the data input line DATA and strobing the corresponding address line ADDR. Further, although memory cell M is implemented using five transistors, other known configurations, e.g., six transistor static memory cells, also are appropriate choices for implementing the memory cells of the lookup table. As shown in FIG. 1, inverter 726 may be included to increase the drive of memory cell 700, and avoid affecting the value stored in memory cell 700 unintentionally via charge sharing with the read decoder.
After configuration, to use a lookup table, the input lines of the configured logic element act as address lines that select a corresponding memory cell in the lookup table. For example, a logic element configured to implement a two-input NAND gate provides the corresponding value {1,1,1,0} contained in the one of the four memory cells corresponding to the current input pair {00, 01, 10, 11}, respectively. The selection of the memory cell to be read is performed by a decoding multiplexer, which selects a memory cell from the lookup table on the basis of the logic levels on the input lines.
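By way of illustration only, the read operation described above can be sketched in software. The function name make_lut and the choice of the first input as the most significant address bit are assumptions made for this sketch and are not taken from the figures.

```python
def make_lut(truth_table):
    """Return a reader for a lookup table whose inputs act as address lines."""
    n = len(truth_table).bit_length() - 1
    if not truth_table or len(truth_table) != 2 ** n:
        raise ValueError("truth table length must be a power of two")

    def read(*inputs):
        if len(inputs) != n:
            raise ValueError(f"expected {n} inputs")
        address = 0
        for bit in inputs:
            # Assumption: the first input is the most significant address bit.
            address = address * 2 + (bit & 1)
        # The decoding multiplexer passes exactly one stored value to the output.
        return truth_table[address]

    return read


# Two-input NAND: cells addressed by 00, 01, 10, 11 hold 1, 1, 1, 0.
nand2 = make_lut([1, 1, 1, 0])
```

Changing the stored list changes the logic function without changing the reader, which mirrors how the same hardware implements different functions after configuration.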
FIG. 2 shows a block diagram of an exemplary 4-input lookup table including 16 memory cells 700-1 through 700-16 and a decoding multiplexer 200. Multiplexer 200 propagates a value stored in exactly one of the memory cells 700-1 through 700-16 of the lookup table to an output X of the lookup table as selected by the four input signals F0-F3. (In the present specification, the same reference characters are used to refer to terminals, signal lines, and their corresponding signals.)
FIG. 3 is a schematic diagram of a known 2-input lookup table. This lookup table is implemented using four memory cells 700-1 through 700-4 and a two-input decoding multiplexer 200 with two input signals, F0 and F1. The two-input decoding multiplexer 200 is shown in detail as being implemented by a hierarchy of pass transistors, which propagate the value stored in the selected memory cell to the output X of the logic element. In FIG. 3, the memory cells can be implemented, for example, as shown in FIG. 1.
The above architecture was later augmented to enhance the functionality of the lookup tables. Freeman et al., in U.S. Pat. No. 5,343,406, describe how additional circuitry can enable lookup tables to behave as random access memories (RAMs) that can be both read and written after configuration of the logic device. When the option of allowing the user to write data to memory cells is available, there also must be provision for entering the user's data into these memory cells and reading from the memory cells. This capability is provided by including two means for accessing each dual function memory cell, one of which is used to supply the configuration bitstream from off the chip, and the other of which is used during operation to store values from signals that are routed from the interconnect lines of the FPGA.
FIG. 4 shows the memory cell architecture described by Freeman et al. in U.S. Pat. No. 5,343,406, which allows memory cell 750 to be programmed both during and after configuration. During configuration, memory cell 750 is programmed using the same process for programming the memory cell of FIG. 1. After configuration, memory cell 750 is programmed differently. A value to be written to memory cell 750 is applied through the interconnect structure of the FPGA to the second data line 705, and then the corresponding write-strobe line WS for the memory cell is pulsed. This pulse latches the value on line 705 into memory cell 750. Like the lookup table of FIG. 2, which uses a series of memory cells from FIG. 1, a series of memory cells from FIG. 4 are combinable into a lookup table. The resulting lookup table can also be optionally used as a RAM after the conclusion of the configuration process.
FIG. 5 is a block diagram showing a 4-input lookup table with synchronous write capability. The lookup table of FIG. 5 includes a write strobe generator 504 that receives a clock signal CK and a write enable signal WE, and creates a single write strobe signal WS for the lookup table. To write a value to a desired memory cell, for example memory cell 750-5, the value is applied on line Din and the address of the desired memory cell 750-5 is applied to the input lines F0-F3 of demultiplexer 500. The value is then latched into the desired memory cell 750-5 by pulsing the write strobe signal WS. Conversely, to read a value stored in a different desired memory cell 750-3, the address of the memory cell 750-3 is applied to the input lines F0-F3 of decoding multiplexer 200 (without pulsing the write strobe), as was described with reference to FIGS. 2 and 3.
FIG. 6 is a schematic illustration of a 2-input lookup table with synchronous write capability. The lookup table of FIG. 6 includes four memory cells 750-1 through 750-4. Details of demultiplexer 500 and multiplexer 200 are shown in FIG. 6.
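As a hedged behavioral sketch of the synchronous write just described, the following model treats the write strobe as a one-shot event produced on a rising clock edge while write enable is active. The class name LutRam, the rising-edge assumption, and the integer addressing are illustrative choices, not details given for write strobe generator 504.

```python
class LutRam:
    """Toy model of a 16-cell lookup table with synchronous write."""

    def __init__(self, init=None):
        self.cells = list(init) if init is not None else [0] * 16
        self._prev_ck = 0

    def read(self, addr):
        # Reading uses only the address lines; no write strobe is involved.
        return self.cells[addr]

    def clock(self, ck, we, addr, din):
        # Assumption: WS is generated on a rising edge of CK while WE is active.
        ws = bool(we and ck and not self._prev_ck)
        if ws:
            # The demultiplexer steers Din to the addressed cell only.
            self.cells[addr] = din & 1
        self._prev_ck = ck
        return ws
```

Holding the clock high produces no further strobes in this model, so a second write requires the clock to fall and rise again.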
One or more 4-input lookup tables, such as those illustrated in FIGS. 2 and 5, are typically used to implement combinatorial function generators in a CLB. Because a 4-input lookup table is only capable of storing 16 bits of data, CLE architectures have been designed that allow the combination of two lookup tables to form larger structures. For example, some CLBs include a third function generator selecting between the outputs of two 4-input lookup tables, which enables the CLB to implement any 5-input function. One such CLB, implemented in the Xilinx XC4000-Series FPGAs, is described in pages 4-9 through 4-21 of the Xilinx 1998 Data Book entitled “The Programmable Logic Data Book 1998”, published in 1998 and available from Xilinx, Inc., 2100 Logic Drive, San Jose, Calif. 95124. (Xilinx, Inc., owner of the copyright, has no objection to copying these and other pages referenced herein but otherwise reserves all copyright rights whatsoever.)
The third function generator can be replaced by a 2-to-1 multiplexer with a signal selecting between the outputs of the two 4-input lookup tables, as disclosed in U.S. Pat. No. 5,349,250 entitled “Logic Structure and Circuit for Fast Carry” by Bernard J. New. Replacing the third function generator with a 2-to-1 multiplexer still provides any function of up to five inputs, and uses less silicon area than a third function generator. One FPGA using two 4-input lookup tables and a 2-to-1 multiplexer to implement a 5-input function generator is the XC5200™ family of FPGAs from Xilinx, Inc. The XC5200 CLB is described in pages 4-188 through 4-190 of the Xilinx 1996 Data Book entitled “The Programmable Logic Data Book”, published in July of 1996 and available from Xilinx, Inc.
A configurable logic block (CLB) capable of generating 6-input functions is described by Young et al. in U.S. Pat. No. 5,920,202 and implemented in the Virtex® family of FPGAs from Xilinx, Inc. The outputs of four 4-input function generators are combined in pairs using two 2-input multiplexers, then the outputs of the two 2-input multiplexers are combined using a third 2-input multiplexer. The Virtex CLB is described in pages 3-79 to 3-82 of the Xilinx 2000 Data Book entitled “The Programmable Logic Data Book 2000”, published in 2000 and available from Xilinx, Inc.
While 6-input functions are useful, it is even more desirable to have the ability to efficiently implement functions with any number of inputs. Bauer et al. describe a lookup table having such abilities in U.S. Pat. No. 6,118,298, entitled “Structure for Optionally Cascading Shift Registers”. Bauer's lookup table, shown in FIGS. 7-9, is configurable as both a (log2 n)-input lookup table and an n-bit cascadable shift register.
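The reason a 2-to-1 multiplexer suffices can be illustrated with a short sketch: any 5-input function can be split into two 4-input truth tables, one for each value of the fifth input, with the fifth input then selecting between them. The function names and the bit ordering below are assumptions made for this illustration.

```python
from itertools import product


def split_function(f5):
    """Build two 16-entry tables for f5 with the fifth input fixed at 0 and at 1."""
    rows = list(product((0, 1), repeat=4))
    lut0 = [f5(a, b, c, d, 0) for a, b, c, d in rows]
    lut1 = [f5(a, b, c, d, 1) for a, b, c, d in rows]
    return lut0, lut1


def evaluate(lut0, lut1, a, b, c, d, sel):
    # Assumption: input a is the most significant address bit of each table.
    addr = 8 * a + 4 * b + 2 * c + d
    # The 2-to-1 multiplexer selects between the two lookup table outputs.
    return lut1[addr] if sel else lut0[addr]


def parity5(a, b, c, d, e):
    """Example 5-input function used only to exercise the sketch."""
    return (a + b + c + d + e) % 2
```

The same decomposition applies to any 5-input function, since each table simply stores the function's values for one setting of the select input.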
FIG. 7 shows a schematic illustration of a memory cell 770-2 of Bauer's lookup table. When configured in shift register mode, a value can be shifted from a preceding memory cell 770-1 into memory cell 770-2. Memory cell 770-2 includes a pass transistor 706. The configuration value is written into memory cell 770-2 by pulsing configuration control line 702 of transistor 706, while applying the configuration value to the data line 704. The output of memory cell 770-2 is programmably connected to the input of a next memory cell 770-3 by pass transistor 720-2, inverter 726-2, and a next pass transistor 708-3 not shown in FIG. 7. As explained in detail by Bauer, by using non-overlapping two-phase clocking on clock lines PHI1 and PHI2, the memory cells shift one bit from left to right for every clock cycle.
FIG. 8 shows a logic element that implements a 16-bit shift register and 4-input lookup table as shown and described by Bauer. For simplicity, in FIG. 8 the structures within memory cells 770 of FIG. 7 have not been explicitly illustrated. In FIG. 8, when in shift register mode, a first memory cell 770-1 of the memory is programmed with an initial value. The memory cell's value may be overwritten with a new value by applying the new value to the Din terminal of the first memory cell 770-1 and strobing the clock line, CK. The strobing of CK in turn invokes a two-phase clocking cycle on non-overlapping two-phase clock signals PHI1 and PHI2 (generated by clock generator 800). As data is moved synchronously from left to right in the shift register, i.e., from the first memory cell 770-1 to a last memory cell 770-16, the logic element can continue to act as a lookup table, although the function changes with every clock cycle. The decoding multiplexer 200 provides on output line X the contents of the memory cell selected by the user inputs F0-F3.
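The observation that the logic element remains a lookup table while its function changes on every clock cycle can be sketched as follows. The class name ShiftLut and the list-based storage are hypothetical, and the two-phase clocking is abstracted into a single clock call.

```python
class ShiftLut:
    """Toy model of a lookup table whose cells also form a shift register."""

    def __init__(self, init):
        self.cells = list(init)

    def clock(self, din):
        # One strobe of CK: every cell takes the value of the cell to its left,
        # the first cell takes Din, and the last value is shifted out.
        shifted_out = self.cells[-1]
        self.cells = [din & 1] + self.cells[:-1]
        return shifted_out

    def read(self, addr):
        # The decoding multiplexer still selects one cell by address.
        return self.cells[addr]


lut = ShiftLut([1, 1, 1, 0] + [0] * 12)
before = lut.read(3)    # value at address 3 under the initial contents
lut.clock(1)            # shift in a 1
after = lut.read(3)     # same address, different stored function
```

In this sketch the value read at a fixed address differs before and after the shift, because the shift has moved a different bit into the addressed cell.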
FIG. 9 shows a structure for implementing a 2-input lookup table or a 4-bit shift register, and shows the internal structure of multiplexer 200 and memory cells 770-1 through 770-4. FIG. 9 is oriented on the page the same way as FIG. 8, and thus assists in understanding the relationship between the elements that make up the lookup table/shift register logic element.
Bauer also showed and described a logic element configurable as an n-bit shift register, an n-bit random access memory, and a (log2 n)-input lookup table. FIGS. 10-12 illustrate this logic element. FIG. 10 illustrates the memory cell. The memory cell of FIG. 10 can be loaded from three different sources. During configuration, memory cell 790-2 is loaded by applying configuration data to line 704 and strobing control line 702 of transistor 706. When memory cell 790-2 is in shift register mode, it is loaded through transistor 708, as discussed above. When memory cell 790-2 is in RAM mode, it is loaded through demultiplexer 500 on line 705-2. Write strobe line WS is pulsed, turning on transistor 707, and thus applying a data signal to node 730.
FIG. 11 shows a logic element that implements any one of a 16-bit shift register, a 16-bit random access memory, and a 4-input lookup table, as shown and described by Bauer. In this logic element, a memory cell 790-5 of the lookup table is programmed with an initial value during configuration, as discussed above. Subsequently, the initial value may be replaced in either of two ways, depending on the mode of the logic element: shift or RAM. When the lookup table including memory cells 790 is being used in RAM mode, each memory cell 790 receives its data input on RAM input line 705. To write to any memory cell 790, the write strobe line WS pulses, thereby driving the value of Din through demultiplexer 500 into the addressed memory cell via input line 730.
The operation of the logic element in each of these modes is controlled by control logic 1000. Control bits that specify whether the logic element is in RAM mode, shift mode, or neither (RAM, Shift) are provided to control logic unit 1000. Control logic unit 1000 also receives the user clock signal CK and the write enable signal WE. From these inputs, control logic unit 1000 outputs PHI1, PHI2 and write strobe signal WS to either shift data between memory cells, to write to a particular memory cell, or to leave the memory cell data untouched. When in shift register mode, as in the logic element of FIG. 8, data is moved synchronously from left to right in the shift register, i.e., from the first memory cell 790-1 to a last memory cell 790-16, as described above, by invoking a two-phase clocking cycle when CK is strobed. On the other hand, when the logic element is configured as a random access memory (RAM), the addressing lines F0-F3 select one of the memory cells (790-1 through 790-16) to be written to and read from by using the demultiplexer 500 and the decoding multiplexer 200, respectively. When in shift register mode, the first memory cell 790-1 receives as its input the signal applied to line Din. When in RAM mode, memory cell 790-1 receives an input signal on line 705-1 from demultiplexer 500.
In RAM mode, to write to a given memory cell, say 790-5, the write enable line WE must be active. When the user clock signal CK is asserted in conjunction with the active WE signal, control logic unit 1000 generates a write strobe WS. When the write strobe WS is high, memory cell 790-5 addressed by address lines F0-F3 of the demultiplexer 500 receives the value from data input line Din. This value overwrites the previous contents of the memory cell 790-5. No other memory cells receive the value applied to Din since they are not addressed and therefore are separated from Din by high impedance connections from the demultiplexer 500.
FIG. 12 is a schematic illustration showing more detail of a lookup table/shift/RAM logic element as shown and described by Bauer. Collectively, demultiplexer 500, decoding multiplexer 200, pass transistors 708 and 720, inverters 726, and RAM mode pass transistors 707 form an interconnection network and are combined with memory cells (790-1 through 790-4) and control logic unit 1000 to implement the logic element. If the logic element is not configured as a shift register, then the logic element acts as either a random access memory or a lookup table. In either non-shift register mode, PHI2 is maintained at a low level, deactivating pass transistors 708, thereby blocking data from one memory cell 790-i from affecting the next memory cell 790-(i+1). Also, in the non-shift register modes PHI1 is maintained at a high logic level, thereby feeding the outputs of the memory cells (790-1 to 790-4) through to the decoding multiplexer 200. As in the previous examples, the output of the logic element is selected by the decoding multiplexer 200 according to the user inputs F0 and F1.
When the logic element of FIG. 12 is configured as a shift register, the RAM mode pass transistors 707 are turned off because WS is held low, isolating the memory cells from the outputs of demultiplexer 500. Memory cell 790-1 is programmably connected to Din through transistor 708-1. To shift values, control logic unit 1000 produces control signals PHI1 and PHI2, triggered while the write enable signal WE is active by a rising edge of the user clock signal CK applied to control logic unit 1000, such that values are shifted from one memory cell to the next memory cell, i.e., from memory cell 790-(i−1) to memory cell 790-i, and from memory cell 790-i to memory cell 790-(i+1). When control logic unit 1000 receives a rising edge of the user clock signal CK, control logic unit 1000 first pulls PHI1 low, then pulses PHI2 high long enough to overwrite the contents of the memory cells (790-1 to 790-4), and lastly reasserts PHI1 after PHI2 has fallen. It is important for extremely low clocking frequencies that PHI2 be only a pulse since PHI1 must be off while PHI2 is on. To accomplish this goal, the control logic is designed so that PHI1 and PHI2 do not rely on the falling edge of the user clock signal CK, but are self-timed.
Bauer also shows various circuits that can be implemented using the configurable logic elements shown in FIGS. 7-12. For example, Bauer creates shift registers larger than 16 bits by concatenating the 16-bit shift registers of FIGS. 8 and 11. To implement these larger shift registers, Bauer's structure provides a configurable connection between the output of a shift register in one logic element and the input to the shift register in the next logic element.
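The need for PHI1 to be off while PHI2 is on can be illustrated with a simplified software model. The sketch below ignores signal inversion and transistor-level behavior, and both function names are hypothetical. The second function shows an assumed failure mode, in which new data races through every stage when the two phases overlap.

```python
def shift_non_overlapping(cells, din):
    """One shift cycle with PHI1 pulled low before PHI2 is pulsed."""
    held = list(cells)               # PHI1 low: each stage output holds its old value
    cells = [din] + held[:-1]        # PHI2 pulse: each cell loads its predecessor's held value
    return cells                     # PHI1 reasserted: outputs follow the new contents


def shift_overlapping(cells, din):
    """Assumed behavior if PHI1 stayed high while PHI2 was high."""
    cells = list(cells)
    value = din
    for i in range(len(cells)):
        cells[i] = value             # the cell is overwritten ...
        value = cells[i]             # ... and its new value is passed straight on
    return cells
```

With contents [1, 0, 0, 0] and an input of 0, the first function moves the stored 1 by exactly one position, whereas the second overwrites every cell with the input value.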
FIG. 13 is a block diagram of a logic element having such a configurable connection. Sixteen memory cells 0-15 are serially coupled to form a 16-bit shift register shifting from bit 0 to bit 15, with the shift register output being provided on a logic element output terminal S_OUT. The shift function timing is controlled by control circuit 800, which is similar to control circuit 800 of FIG. 8. Multiplexer 201 provides either the shift register output from the previous logic element (S_IN) or a user input value (FEED) to the input terminal of memory cell 0. Multiplexer 201 is controlled by a configurable memory cell 202. Each memory cell 0-15 provides one bit to a multiplexer 200, which selects one of these bits under control of user input signals F0-F3, and provides the selected bit value to output terminal X of the logic element.
As noted by Bauer, a shift register having fewer stages than the number of memory cells in a lookup table can be formed by directing a bit other than the last bit to output terminal X. For example, the dotted line in FIG. 13 shows the output of memory cell 7 being directed to output terminal X, by appropriate selection of user input signals F0-F3. In this configuration, the logic element forms an 8-bit shift register. If desired, the signal on output terminal X (e.g., bit 7) can then be directed to user input FEED of another logic element, to extend the shift register. However, note that when Bauer's structure is used, only one shift register can be implemented in a single logic element. Therefore, two 16-bit logic elements are required to implement two 8-bit shift registers, with 8 bits in each logic element being unused.
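A brief sketch may help quantify the limitation noted above. The first function models reading a shorter shift register from a 16-cell element by tapping an intermediate cell, and the second counts cells left idle when each shift register must occupy its own element. Both function names and the list-based model are assumptions for illustration.

```python
def run_tapped_shift_register(stream, tap, size=16):
    """Shift a bit stream through `size` cells and record the tapped cell after each shift."""
    cells = [0] * size
    seen = []
    for bit in stream:
        cells = [bit & 1] + cells[:-1]   # shift toward the highest-numbered cell
        seen.append(cells[tap])          # the multiplexer reads the tapped cell
    return seen


def unused_cells(lengths, size=16):
    """Cells left idle when each shift register occupies its own logic element."""
    return sum(size - n for n in lengths)


stream = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
tapped = run_tapped_shift_register(stream, tap=7)   # cell 7 closes an 8-cell register
wasted = unused_cells([8, 8])                       # two 8-bit registers, two elements
```

In this model, a bit first appears at cell 7 on the eighth shift after it was presented, and the two 8-bit shift registers together leave 16 cells unused.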
Therefore, it is desirable to provide a logic element having configurable shift register capability, wherein shift registers of variable length are easily and efficiently implemented. It is further desirable to provide a logic element that can be used to implement two or more shift registers.
Shift registers are often used when implementing filters and cyclic redundancy check (CRC) circuits. In such designs, it is often necessary to “tap” certain bits in the shift register. A shift register “tap” is a path by which a single bit from the shift register can be read without affecting the flow of data through the shift register. Using Bauer's shift register, each logic element provides only one tap, although any bit in the shift register can be tapped. For example, in the shift register of FIG. 13, multiplexer 200 can be controlled to provide any single bit in the shift register. However, shift registers in many user circuits require more than one tap. Therefore, it is desirable to provide a logic element that can provide more than one tap.
The invention provides a logic element for a programmable logic device (PLD) that can be configured as a shift register of variable length. An array of memory cells in the logic element is divided into two or more portions. The memory cells of each portion supply values to a corresponding output multiplexing circuit, thereby enabling the logic element to function as a lookup table by combining the outputs of the multiplexing circuits. However, each portion is also configurable as a shift register. The portions can function as separate shift registers, or can be concatenated to function as a single shift register. In some embodiments, the portions can also be concatenated with shift registers in other logic elements. One embodiment includes two portions. Other embodiments include more than two portions, and are therefore configurable as more than two shift registers.
Because two or more multiplexing circuits are available, two or more taps are provided, one from each portion of the memory array.
In one embodiment having a memory cell array divided into two portions, each portion has a configurable source for the shift-in input. The first portion can be configured to accept a value from any of: a shift-out value from the second portion of a previous logic element; a tap value read from the second portion of the previous logic element; a value supplied by a source external to the logic element; and a value created by selecting between the values read from the first and second portions of the previous logic element. The second portion can be configured to accept a value from any of: a shift-out value from the first portion of the same logic element; a tap value read from the first portion of the same logic element; and a value supplied by a source external to the logic element.
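The two-portion arrangement can be sketched in software under stated assumptions: a 16-cell array split into two 8-cell portions, a single shared tap address, and a second portion whose shift-in source is chosen at construction time. The class name SplitElement, the source labels, and the portion sizes are hypothetical, and the first portion is simplified here to accept only an external value.

```python
class SplitElement:
    """Toy model of a logic element whose memory array is split into two portions."""

    SOURCES = ("shift_out", "tap", "external")

    def __init__(self, second_source="shift_out"):
        if second_source not in self.SOURCES:
            raise ValueError("unknown source for the second portion")
        self.first = [0] * 8
        self.second = [0] * 8
        self.second_source = second_source

    def taps(self, addr):
        # Each portion feeds its own multiplexer, so one address yields two taps.
        return self.first[addr], self.second[addr]

    def lut_read(self, addr, sel):
        # Lookup table mode: a select bit combines the two multiplexer outputs.
        t0, t1 = self.taps(addr)
        return t1 if sel else t0

    def clock(self, din_first, addr=0, din_second=0):
        # Choose the shift-in value for the second portion before shifting.
        if self.second_source == "shift_out":
            nxt = self.first[-1]         # concatenated: one 16-bit shift register
        elif self.second_source == "tap":
            nxt = self.first[addr]       # shorter register: tap feeds the second portion
        else:
            nxt = din_second             # two independent 8-bit shift registers
        self.first = [din_first & 1] + self.first[:-1]
        self.second = [nxt & 1] + self.second[:-1]
```

In this sketch, the "shift_out" setting yields a 16-stage register, the "tap" setting with address 3 yields a 12-stage register, and the "external" setting yields two independent 8-stage registers in a single element.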
In one embodiment, the logic element is configurable as a lookup table and a shift register of configurable length. In another embodiment, the logic element is further configurable as a RAM. In another embodiment, the logic element is further configurable as a product term generator.
In some embodiments, the array of memory cells includes only one column, with that column being configurable as a 1-bit-wide shift register. In other embodiments, the array includes two or more columns of memory cells, along with decode logic to select a single column from the array. In one such embodiment, only one column is used to implement the shift register, i.e., the shift register is one bit wide. In another embodiment, more than one or all columns of memory cells participate in the shift function, i.e., the shift register is more than one bit wide. In another embodiment, the columns of memory cells can be configurably concatenated to form a larger shift register within a single logic element.