Programmable logic devices (PLDs) are a well-known type of integrated circuit that can be programmed to perform specified logic functions. One type of PLD, the field programmable gate array (FPGA), typically includes an array of programmable tiles. These programmable tiles can include, for example, input/output blocks (IOBs), configurable logic blocks (CLBs), dedicated random access memory blocks (BRAM), multipliers, digital signal processing blocks (DSPs), processors, clock managers, delay lock loops (DLLs), and so forth.
Each programmable tile typically includes both programmable interconnect and programmable logic. The programmable interconnect typically includes a large number of interconnect lines of varying lengths interconnected by programmable interconnect points (PIPs). The programmable logic implements the logic of a user design using programmable elements that can include, for example, function generators, registers, arithmetic logic, and so forth.
The programmable interconnect and programmable logic are typically programmed by loading a stream of configuration data into internal configuration memory cells that define how the programmable elements are configured. The configuration data can be read from memory (e.g., from an external PROM) or written into the FPGA by an external device. The collective states of the individual memory cells then determine the function of the FPGA.
Another type of PLD is the Complex Programmable Logic Device, or CPLD. A CPLD includes two or more “function blocks” connected together and to input/output (I/O) resources by an interconnect switch matrix. Each function block of the CPLD includes a two-level AND/OR structure similar to those used in Programmable Logic Arrays (PLAs) and Programmable Array Logic (PAL) devices. In CPLDs, configuration data is typically stored on-chip in non-volatile memory. In some CPLDs, configuration data is stored on-chip in non-volatile memory, then downloaded to volatile memory as part of an initial configuration sequence.
For all of these programmable logic devices (PLDs), the functionality of the device is controlled by data bits provided to the device for that purpose. The data bits can be stored in volatile memory (e.g., static memory cells, as in FPGAs and some CPLDs), in non-volatile memory (e.g., FLASH memory, as in some CPLDs), or in any other type of memory cell. The terms “PLD”, “programmable logic device”, and “programmable integrated circuit” include, but are not limited to, these exemplary devices, and also encompass devices that are only partially programmable. For example, one type of programmable IC includes a combination of hard-coded transistor logic and a programmable switch fabric that programmably interconnects the hard-coded transistor logic.
As noted above, advanced FPGAs can include several different types of programmable logic blocks in the array. For example, FIG. 1 illustrates an FPGA architecture 100 that includes a large number of different programmable tiles including multi-gigabit transceivers (MGTs 101), configurable logic blocks (CLBs 102), random access memory (RAM) blocks (BRAMs 103), input/output blocks (IOBs 104), configuration and clocking logic (CONFIG/CLOCKS 105), digital signal processing blocks (DSPs 106), specialized input/output blocks (I/O 107) (e.g., configuration ports and clock ports), and other programmable logic 108 such as digital clock managers, analog-to-digital converters, system monitoring logic, and so forth. Some FPGAs also include dedicated processor blocks (PROC 110).
In some FPGAs, each programmable tile includes a programmable interconnect element (INT 111) having standardized connections to and from a corresponding interconnect element in each adjacent tile. Therefore, the programmable interconnect elements taken together implement the programmable interconnect structure for the illustrated FPGA. The programmable interconnect element (INT 111) also includes the connections to and from the programmable logic element within the same tile, as shown by the examples included at the top of FIG. 1.
For example, a CLB 102 can include a configurable logic element (CLE 112) that can be programmed to implement user logic plus a single programmable interconnect element (INT 111). A BRAM 103 can include a BRAM logic element (BRL 113) in addition to one or more programmable interconnect elements. Typically, the number of interconnect elements included in a tile depends on the height of the tile. In the pictured embodiment, a BRAM tile has the same height as four CLBs, but other numbers (e.g., five) can also be used. A DSP tile 106 can include a DSP logic element (DSPL 114) in addition to an appropriate number of programmable interconnect elements. An IOB 104 can include, for example, two instances of an input/output logic element (IOL 115) in addition to one instance of the programmable interconnect element (INT 111). As will be clear to those of skill in the art, the actual I/O pads connected, for example, to the I/O logic element 115 are manufactured using metal layered above the various illustrated logic blocks, and typically are not confined to the area of the input/output logic element 115.
In the pictured embodiment, a columnar area near the center of the die (shown shaded in FIG. 1) is used for configuration, clock, and other control logic. Horizontal areas 109 extending from this column are used to distribute the clocks and configuration signals across the breadth of the FPGA.
Some FPGAs utilizing the architecture illustrated in FIG. 1 include additional logic blocks that disrupt the regular columnar structure making up a large part of the FPGA. The additional logic blocks can be programmable blocks and/or dedicated logic. For example, the processor block PROC 110 shown in FIG. 1 spans several columns of CLBs and BRAMs.
Note that FIG. 1 is intended to illustrate only an exemplary FPGA architecture. For example, the numbers of logic blocks in a column, the relative width of the columns, the number and order of columns, the types of logic blocks included in the columns, the relative sizes of the logic blocks, and the interconnect/logic implementations included at the top of FIG. 1 are purely exemplary. For example, in an actual FPGA more than one adjacent column of CLBs is typically included wherever the CLBs appear, to facilitate the efficient implementation of user logic, but the number of adjacent CLB columns varies with the overall size of the FPGA.
As noted above, one of the dedicated logic elements that can be included in an FPGA or other programmable IC is a BRAM, or block RAM. In some programmable ICs, the block RAM can be configured as a first-in-first-out memory circuit (FIFO). A block RAM can typically be configured to have any of several predetermined aspect ratios. For example, an 18K block RAM in the Virtex-4™ FPGA from Xilinx, Inc. can be configured to implement a FIFO 512, 1024, 2048, or 4096 words deep. However, because the block RAM has a fixed size (e.g., 18K bits), there will always be a maximum size for a FIFO implemented using the block RAM.
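As a rough illustration of why a fixed-size block RAM bounds the FIFO depth, the following Python sketch computes the widest word that fits at each depth, and the number of maximum-depth FIFOs needed to reach a target depth. The names `BLOCK_BITS`, `max_width`, and `fifos_needed` are hypothetical, and the sketch assumes that word width is limited only by total bit capacity; an actual device supports only specific predetermined configurations.

```python
# Illustrative only: an 18K-bit block RAM, as in the example above.
BLOCK_BITS = 18 * 1024  # 18,432 bits (assumed total capacity)


def max_width(depth, block_bits=BLOCK_BITS):
    """Widest whole-bit word that fits in the block at the given depth."""
    return block_bits // depth


def fifos_needed(target_depth, max_depth=4096):
    """Number of maximum-depth FIFOs to chain to reach target_depth."""
    return (target_depth + max_depth - 1) // max_depth  # ceiling division


if __name__ == "__main__":
    for depth in (512, 1024, 2048, 4096):
        print(f"{depth} words deep -> up to {max_width(depth)} bits wide")
    print("FIFOs needed for 8192 words:", fifos_needed(8192))
```

Under these assumptions, deeper configurations trade away width, and any depth beyond the deepest configuration requires more than one block RAM.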
In order to increase the size of a FIFO over and above the predetermined maximum, it is common to concatenate (“chain together”) multiple FIFOs. Concatenated FIFO 200 of FIG. 2 demonstrates one way in which two FIFOs can be concatenated to produce a FIFO having twice the depth of a single FIFO (e.g., a FIFO that can be implemented using only one block RAM).
In FIG. 2, both FIFOs 201, 202 operate with free-running clocks. Operation of the concatenated FIFO is controlled through write enable (WEN) and read enable (REN) signals, which drive the WREN input terminal of the leftmost FIFO 201 and the RDEN input terminal of the rightmost FIFO 202, respectively. The free-running system write clock signal WCLK and the free-running system read clock signal RCLK are used to clock the leftmost FIFO (input terminals WRCLK and RDCLK of FIFO 201), while all downstream FIFOs (202 and any additional concatenated FIFOs, not shown) use the system read clock RCLK as both read clock and write clock signals. The empty (EMPTY), almost empty (AEMPTY), full (FULL), and almost full (AFULL) output signals are well-known status signals commonly provided by FIFOs. (Note that in the present specification, the same reference characters are used to refer to terminals, signal lines, and their corresponding signals.)
The input data flows from the leftmost FIFO 201 downstream to FIFO 202, i.e., from circuit input terminals DI<3:0> to input terminals DIN<3:0> of FIFO 201, from output terminals DOUT<3:0> of FIFO 201 to input terminals DIN<3:0> of FIFO 202, and finally from output terminals DOUT<3:0> of FIFO 202 to circuit output terminals DO<3:0>. Note that in the pictured example, two 4K×4 FIFOs (201, 202) are concatenated. Therefore, the input and output data busses are 4-bit busses. However, these data widths are purely exemplary, and it is well known that this concatenation method can be applied equally well to FIFOs of other sizes. Further, it will be clear to those of skill in the art that more than two FIFOs can be concatenated using this method, by adding one or more additional FIFOs after the rightmost FIFO 202. An additional NOR gate is needed between the EMPTY output terminal of each upstream FIFO and the WREN input terminal of the succeeding FIFO in the chain, with the output of the NOR gate also fed back to the RDEN input terminal of the upstream FIFO. The REN signal always drives the RDEN input terminal of the last FIFO in the chain.
The concatenated FIFO of FIG. 2 includes two FIFOs 201, 202 implemented using block RAM in an exemplary PLD. The two FIFOs are connected using a NOR gate 203, which is implemented using logic in the programmable logic fabric of the PLD. NOR gate 203 ensures that no data is written to the second FIFO 202 when either the first FIFO is empty (signal EMPTY from FIFO 201 is high, signaling that no valid data is available from FIFO 201) or the second FIFO is full (signal FULL from FIFO 202 is high, signaling that there is no more room in the second FIFO).
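The gating performed by NOR gate 203 can be sketched behaviorally in Python. This is a simplified model under stated assumptions, not a description of the actual circuit: both FIFOs share a single clock (FIG. 2 uses separate write and read clocks for FIFO 201), the first FIFO presents its head word without a read cycle, and the names `Fifo`, `nor`, and `clock` are hypothetical.

```python
from collections import deque


class Fifo:
    """Minimal FIFO model: status flags plus a head word visible at dout."""

    def __init__(self, depth):
        self.depth = depth
        self.q = deque()

    @property
    def empty(self):
        return len(self.q) == 0

    @property
    def full(self):
        return len(self.q) >= self.depth

    @property
    def dout(self):
        # Head word is visible without a read (assumed for this model).
        return self.q[0] if self.q else None


def nor(a, b):
    return not (a or b)


def clock(f1, f2, wen, din, ren):
    """Advance the two-FIFO chain by one clock edge.

    Returns the word read from f2, or None if nothing was read.
    All control signals are sampled before any state changes.
    """
    # NOR of EMPTY (first FIFO) and FULL (second FIFO) drives both
    # RDEN of the first FIFO and WREN of the second FIFO.
    xfer = nor(f1.empty, f2.full)
    do_write = wen and not f1.full
    do_read = ren and not f2.empty

    out = f2.q.popleft() if do_read else None
    if xfer:
        f2.q.append(f1.q.popleft())
    if do_write:
        f1.q.append(din)
    return out


if __name__ == "__main__":
    a, b = Fifo(2), Fifo(2)
    for word in (1, 2, 3, 4):
        clock(a, b, wen=True, din=word, ren=False)
    drained = [clock(a, b, wen=False, din=None, ren=True) for _ in range(5)]
    print([w for w in drained if w is not None])
```

In this model, a word moves downstream only when the first FIFO has data and the second has room, so the pair behaves as one FIFO with the combined depth of both.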
The feedback path from the FULL output terminal of FIFO 202, through the programmable logic, and then to the RDEN input terminal of FIFO 201 and the WREN input terminal of FIFO 202, can have a significant delay. In practice, it has been found that this feedback path can limit the operating frequency of the entire concatenated FIFO. Therefore, it is desirable to provide circuits and methods of concatenating FIFOs in which the maximum clock frequency is not adversely affected by the concatenation.
Additionally, in order for the concatenated FIFO of FIG. 2 to function properly, the leftmost FIFO 201 must be designed or programmed to operate in a first-word-fall-through (FWFT) mode. A FIFO in FWFT mode provides valid read output data (e.g., at output terminals DOUT<3:0> for the FIFOs in FIG. 2) without the need for performing a read cycle from the FIFO. In contrast, in standard mode the first word written to an empty FIFO does not appear on the data output lines until a specific read operation is performed. Not all FIFOs have the capability of operating in FWFT mode, because FWFT mode requires increased internal circuit complexity and resources. Therefore, it is further desirable to provide circuits and methods of concatenating FIFOs that do not require FWFT capability.
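The difference between standard mode and FWFT mode can be illustrated with a minimal behavioral sketch in Python. The classes `StandardFifo` and `FwftFifo` are hypothetical models written for this illustration; they ignore clocking, depth limits, and status flags.

```python
from collections import deque


class StandardFifo:
    """Standard mode: dout changes only when a read is performed."""

    def __init__(self):
        self.q = deque()
        self.dout = None  # output register, initially holds no valid word

    def write(self, word):
        self.q.append(word)

    def read(self):
        if self.q:
            self.dout = self.q.popleft()
        return self.dout


class FwftFifo:
    """FWFT mode: the oldest word is visible at dout without a read."""

    def __init__(self):
        self.q = deque()

    def write(self, word):
        self.q.append(word)

    @property
    def dout(self):
        return self.q[0] if self.q else None

    def read(self):
        # A read consumes the word currently shown and exposes the next one.
        if self.q:
            self.q.popleft()


if __name__ == "__main__":
    std, fwft = StandardFifo(), FwftFifo()
    std.write(7)
    fwft.write(7)
    print("standard dout before read:", std.dout)
    print("FWFT dout before read:", fwft.dout)
    std.read()
    print("standard dout after read:", std.dout)
```

In the arrangement of FIG. 2, the same gate output enables a read of the upstream FIFO and a write of the downstream FIFO, so the word to be written must already be present at the upstream data outputs; this is the behavior the FWFT model shows.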