Field Programmable Gate Arrays (FPGAs) include programmable circuits. These programmable circuits are constructed with programmable look-up tables (LUTs) and registers (or Flip-Flops) to implement logic as shown in FIG. 1. LUTs provide the means to program a logic function of two or more inputs, and registers provide the means to store either input or output values for subsequent use. A K-LUT 104 can implement any K-input function. In addition to LUTs, NAND gates, MUXes and many other programmable logic elements can also implement logic. A programmable interconnect matrix (101, 102, 103, 108, 109) provides the means of coupling inputs and outputs as required by the logic function implemented in programmable logic element 107. In the prior-art logic element of FIG. 1, the output of LUT 104 is fed to the Flip-Flop (FF) 105. The user can decide to store the LUT output in the FF for synchronous logic implementations, or bypass the FF for asynchronous logic implementations. A basic logic element (BLE) 106 comprises a LUT circuit 104 and a Flip-Flop 105. One or more BLEs may be combined to form a complex logic block (CLB) 107. Inputs to LUT 104 are received via the routing wires 101, input MUX 102, and local MUX 103. The output of LUT 104 or of FF 105 is routed through programmable points 108 back to the routing wires 109.
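The behavior of a K-LUT followed by an optional flip-flop can be illustrated with a minimal software sketch. The class names `LUT` and `BLE`, the bit ordering of the LUT address, and the `use_ff` flag are assumptions made for illustration only; they do not describe any particular FPGA device.

```python
from typing import Sequence


class LUT:
    """K-input look-up table holding 2**K configuration bits (hypothetical model)."""

    def __init__(self, k: int, config_bits: Sequence[int]):
        if len(config_bits) != 2 ** k:
            raise ValueError("a K-LUT needs exactly 2**K configuration bits")
        self.k = k
        self.config_bits = list(config_bits)

    def evaluate(self, inputs: Sequence[int]) -> int:
        if len(inputs) != self.k:
            raise ValueError("expected exactly K input bits")
        # Assumption: inputs[0] is the most significant address bit.
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.config_bits[address]


class BLE:
    """A LUT whose output is either registered in a flip-flop or bypasses it."""

    def __init__(self, lut: LUT, use_ff: bool):
        self.lut = lut
        self.use_ff = use_ff
        self.ff_state = 0  # flip-flop assumed to power up at 0

    def clock(self, inputs: Sequence[int]) -> None:
        # On a clock edge the flip-flop captures the current LUT output.
        self.ff_state = self.lut.evaluate(inputs)

    def output(self, inputs: Sequence[int]) -> int:
        # Registered mode returns the stored value; bypass mode returns the LUT output.
        return self.ff_state if self.use_ff else self.lut.evaluate(inputs)


# A 2-LUT programmed as AND: only address 0b11 yields 1.
and_lut = LUT(2, [0, 0, 0, 1])
print(and_lut.evaluate([1, 1]))
```

In this sketch the same LUT hardware becomes an AND, an XOR, or any other K-input function simply by changing `config_bits`, which is the sense in which a K-LUT is programmable.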
A plurality of logic elements are combined by FPGA tools to generate larger logic functions. When larger logic functions are implemented, unused logic within BLEs adds to the inefficiency of silicon utilization and to the cost borne by end users. When larger logic functions are implemented, wires are used to connect the logical components. Wire congestion leads to sparse utilization of available logic, further adding to the inefficiency in silicon utilization. One logic function frequently required by users within the FPGA fabric is the shift register.
In a shift register, shown in FIG. 2A, data is presented at the input IN (shown at the extreme left) and is shifted right by one position each time the clock goes high. At each clock, the bit at IN appears at the output of the first flip-flop 201 (MSB), and the bit held in each flip-flop moves to the next flip-flop on its right. The bit at the extreme right (LSB), held in register 204, is shifted out and lost.
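For example, in the four-bit shift register in FIG. 2A, with the first register 201 storing the MSB and the fourth register 204 storing the LSB, an exemplary shift pattern for an input string of “1010” provides “0101” at the output of the shift register as shown below. Each row shows the input bit presented before a clock and the register outputs at that time; x denotes an input whose value does not matter.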
IN    OUT1    OUT2    OUT3    OUT4
1     0       0       0       0
0     1       0       0       0
1     0       1       0       0
0     1       0       1       0
x     0       1       0       1
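The shift pattern of this example can be reproduced with a short simulation. This is a sketch only: the function names `shift` and `trace` are hypothetical, and the register is assumed to start with all outputs at 0.

```python
from typing import List, Sequence, Tuple


def shift(state: List[int], bit_in: int) -> List[int]:
    """One clock: the new bit enters at OUT1 and the OUT4 bit is discarded."""
    return [bit_in] + state[:-1]


def trace(bits: Sequence[int], width: int = 4) -> List[Tuple[str, str]]:
    """Return (IN, OUT1..OUTn) pairs, one row per input bit plus a final row."""
    state = [0] * width  # assumption: all flip-flops start at 0
    rows = []
    for bit in bits:
        # Record the input and the outputs as they stand before the clock.
        rows.append((str(bit), "".join(str(b) for b in state)))
        state = shift(state, bit)
    # After the last clock the input no longer matters, so it is shown as "x".
    rows.append(("x", "".join(str(b) for b in state)))
    return rows


for in_bit, outputs in trace([1, 0, 1, 0]):
    print(in_bit, outputs)
```

The last row printed is `x 0101`, matching the final state given for the input string “1010”.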
In prior-art FPGAs, shift registers are implemented by connecting a plurality of FFs provided in the logic elements as shown in FIG. 2B. Data is fed as an input to a first LUT in a first logic element 211, and the first LUT output is latched to a first FF. The output of the first FF is routed through the global interconnect matrix as an input to a second LUT in a second logic element 212, and the second LUT output is latched to a second FF. Extending these connections in the same manner allows users to construct longer shift-register chains. FPGA tools construct shift-register chains as described. In a realistic implementation of the shift register, the stages are not necessarily placed adjacent to each other, but are rather placed in random locations and routed by global interconnects. A more constrained placement is cumbersome to a user and rarely used. In such constructions, the entire LUT logic block simply acts as a wire to connect to the input of the FF, wasting valuable LUT logic resources that could have been used to implement logic. Wasted LUTs add to the silicon cost when implementing shift registers. An alternative scheme to save the extra cost is to provide dedicated shift registers to the user at pre-determined positions. However, the user's requirements and the locations at which shift registers are needed are not known a priori, and thus pre-positioned additions do not provide the most desirable user solution. In constrained or random placements of shift registers, the output of the register in 211 is routed as an input to the register in 212 by using global interconnects. These global wires tie up horizontal and vertical routing tracks, valuable resources that could otherwise be used to connect other logical structures. Thus wire congestion is a significant challenge to automated place and route tools that must determine how these shift registers are placed and routed within the FPGA.
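The resource overhead described above can be made concrete with a simplified model in which every shift-register stage consumes one LUT programmed as a pass-through and every stage-to-stage connection consumes one globally routed net. The names `passthrough_lut_bits`, `Stage`, `clock_chain` and `resource_cost`, and the one-net-per-connection accounting, are illustrative assumptions rather than a description of any specific device or tool.

```python
from typing import Dict, List


def passthrough_lut_bits(k: int) -> List[int]:
    """Configuration bits for a K-LUT that copies its first input to its output."""
    # Assumption: the first input is the most significant address bit.
    return [(address >> (k - 1)) & 1 for address in range(2 ** k)]


class Stage:
    """One logic element used as a shift-register stage: pass-through LUT plus FF."""

    def __init__(self, k: int):
        self.k = k
        self.lut_bits = passthrough_lut_bits(k)
        self.ff = 0  # flip-flop assumed to start at 0

    def lut_out(self, data_in: int) -> int:
        # Unused LUT inputs are assumed tied to 0; data arrives on the first input.
        return self.lut_bits[(data_in & 1) << (self.k - 1)]


def clock_chain(stages: List[Stage], data_in: int) -> List[int]:
    """Apply one clock to the chain and return the FF outputs."""
    # Sample every LUT output first so all FFs update from pre-clock values.
    next_values = []
    source = data_in
    for stage in stages:
        next_values.append(stage.lut_out(source))
        source = stage.ff
    for stage, value in zip(stages, next_values):
        stage.ff = value
    return [stage.ff for stage in stages]


def resource_cost(n_stages: int) -> Dict[str, int]:
    """Count LUTs acting only as wires and nets routed between stages."""
    return {"luts_used_as_wires": n_stages, "inter_stage_nets": n_stages - 1}


chain = [Stage(4) for _ in range(4)]
for bit in [1, 0, 1, 0]:
    outputs = clock_chain(chain, bit)
print(outputs, resource_cost(len(chain)))
```

Under these assumptions, a four-stage chain built from 4-LUT logic elements spends four LUTs, each holding 16 configuration bits, purely on forwarding a single bit, and occupies three nets in the global interconnect.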