Discussion of Related Art
Field programmable devices (FPDs) have grown rapidly because integrated circuits for a wide variety of product applications in a competitive environment require fast time-to-market for new designs, low (or zero) non-recurring engineering (NRE) cost, and low fabrication cost. Low power is a requirement for most applications, as is portability; conservation of battery power is therefore required and nonvolatile operation is advantageous. Also, integration levels (more function) are increasing rapidly, as is the requirement for high performance chips with large logic capacity that are in-circuit programmable (programmable in place in the package without requiring sockets). Field programmable devices (FPDs) are also sometimes referred to as programmable logic devices (PLDs), and the terms FPD and PLD are used interchangeably throughout the application.
What is needed in logic design is fast time to market. Lower costs are also important, hence the need for more function in smaller chips. Higher performance and lower power are especially important in battery powered applications. Field programmable logic chips are required for fast time to market. What is needed are configurable (programmable) logic functions and efficient programmable wiring that can be configured (programmed) multiple times in chips mounted on a board. Programmable switches must be small in size and nonvolatile to enable efficient wiring architectures for implementing configurable (programmable) logic functions, and must be compatible with and easily integrated in CMOS technologies. Programmable switches must be easy to use and compatible with high performance applications. Programmable switches must enable fine-tuning of logic timing for optimum performance.
Overview of Field Programmable Devices
Block diagram 100 illustrated in FIG. 1 shows simple programmable logic devices (SPLDs) with a smaller number of equivalent logic gates, such as thousands or tens of thousands of equivalent logic gates; complex programmable logic devices (CPLDs) that combine multiple SPLDs with programmable wiring (routing) for a higher number of equivalent logic gates, such as tens to hundreds of thousands of equivalent logic gates; and field programmable gate arrays (FPGAs) with a large number of equivalent logic gates, for example in the range of millions to tens of millions of equivalent logic gates, and into the hundreds of millions of equivalent logic gates for denser scaled future FPGA chips. A brief discussion of field programmable devices is provided in the sections that follow.
Simple Programmable Logic Devices (SPLDs)
Programmable read-only memories (PROMs) were the first chips to enable user-programmability of the bits in an array. Such chips were used to store code for system startup (BIOS), algorithms, and other functions for example. Simple logic functions can also be performed using PROMs in which address lines can be used as logic circuit inputs and data lines as outputs. However, logic functions typically do not require many product terms but a PROM contains a full decoder for its address inputs. Thus, PROMs are an inefficient architecture for programmable logic function and are rarely used for this purpose and are therefore not included in block diagram 100.
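By way of illustration only, the inefficiency of a PROM used as a logic element can be seen in the following minimal Python sketch. The names prom_program and prom_read are hypothetical and are introduced solely for this example. The sketch assumes a PROM with one data line; it shows that a bit must be stored for every one of the 2^n addresses even when the logic function contains a single product term.

```python
from itertools import product

def prom_program(n_inputs, func):
    """Build the full PROM contents: one stored bit per address (2**n entries)."""
    return [func(*bits) for bits in product((0, 1), repeat=n_inputs)]

def prom_read(contents, *inputs):
    """Address lines act as logic inputs; the stored bit is the logic output."""
    address = 0
    for bit in inputs:
        address = (address << 1) | bit  # first input is the most significant bit
    return contents[address]

# Hypothetical example function with a single product term: A AND B AND C AND D.
and4 = prom_program(4, lambda a, b, c, d: a & b & c & d)
```

In this example, sixteen bits are stored to realize one product term, and each additional input doubles the number of stored bits.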
The first SPLD device developed, the field-programmable logic array (FPLA), or PLA for short, consisted of two arrays for storing two levels of equivalent logic gates. A first AND array (or AND-plane) is structured such that any of the AND array inputs or complements of the inputs can be AND'ed together, so that each AND-array output can correspond to any product term of the inputs to the AND array. These product term outputs of the AND array become inputs to a second OR array. OR array outputs can be configured to produce any logical sum of any of the product terms (AND-array outputs), thereby implementing logic functions in sum-of-products form. The PLA architecture is far better for generating logic functions than a PROM because both the AND and OR array terms can have many inputs.
FIG. 2 illustrates a schematic of PLA 200 including programmable AND array 210 and programmable OR array 220. Inputs 225 to input drivers 230 result in logic inputs A, B·C, . . . , DC to programmable AND array 210. Programmable AND array 210 forms product terms based on the inputs and on the state of nonvolatile bits at the intersection of input lines A, B·C, . . . , DC and provides product terms PT1, PT2, . . . , PTM as inputs to programmable OR array 220. Programmable OR array 220 forms sum-of-products (or sum-of-product-terms) outputs O1, O2, . . . , ON based on product term inputs and the state of nonvolatile bits at the intersection of product terms PT1, PT2, . . . , PTM and OR array output lines O1, O2, . . . , ON, which are sent to output drivers 240. Output drivers 240 may be conventional drivers, or may include additional logic functions such as XOR, and may also include flip flops such as D-flip flops for example. Output drivers 240 drive outputs 245, which are the logic response to inputs 225 based on the ON or OFF bit states of individual nonvolatile bits in the AND and OR arrays. Also, output drivers 240 drive feedback loop 250, which supplies the output logic response to input drivers 230. Note that some of the output lines 245 may be included in feedback loop 250.
In operation, inputs 225 of PLA 200 result in logic outputs 245 based on the ON and OFF states of devices, such as EPROMs for example, located at the intersection of input lines such as A, B·C, . . . , DC and product term lines PT1, PT2, . . . , PTM in electrically programmable AND array 210 and the intersection of PT1, PT2, . . . , PTM and outputs O1, O2, . . . , ON in programmable OR array 220. Details of PLA operation are well known in the literature, for example, C. Mead and Lynn Conway, “Introduction to VLSI Systems,” Addison-Wesley Publishing Company, 1980, pages 79-82.
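By way of illustration only, the two-level sum-of-products evaluation performed by a PLA can be sketched in Python as follows. The function name pla_eval, the list-based representation of ON bits, and the example programming are assumptions made for this sketch and are not taken from FIG. 2; feedback loop 250 and the output drivers are omitted.

```python
def pla_eval(inputs, and_plane, or_plane):
    """Evaluate a two-level PLA (hypothetical model, no feedback).

    inputs:    dict of input name -> 0/1
    and_plane: list of product terms; each is a list of literals such as
               "A" (true input) or "A'" (complement) whose AND-array bits are ON
    or_plane:  list of outputs; each is a list of product-term indices
               whose OR-array bits are ON
    """
    def literal(name):
        if name.endswith("'"):
            return 1 - inputs[name[:-1]]
        return inputs[name]

    # AND array: each product term is the AND of its programmed literals.
    product_terms = [int(all(literal(l) for l in term)) for term in and_plane]
    # OR array: each output is the OR of its programmed product terms.
    return [int(any(product_terms[i] for i in out)) for out in or_plane]

# Assumed programming: O1 = A·B' + A'·B (exclusive OR), O2 = A·B.
AND_PLANE = [["A", "B'"], ["A'", "B"], ["A", "B"]]
OR_PLANE = [[0, 1], [2]]
```

For example, pla_eval({"A": 1, "B": 0}, AND_PLANE, OR_PLANE) evaluates product terms PT1 through PT3 and then sums them according to the OR-array programming.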
PLAs such as PLA 200 described further above are the earliest examples of SPLDs and were introduced in the early 1970's. PLAs using mask programmable AND arrays, OR arrays, and feedback loops, programmed in a fabricator, were successfully used by IBM in many applications for over a decade. However, field programmable PLAs, with two memory arrays (memory planes) requiring electrically programmable AND and OR arrays, were difficult to manufacture and introduced significant propagation delays. To address these problems, simpler programmable array logic (PAL) devices were developed which use a programmable AND array to realize product terms and then provide said product terms to fixed (non-programmable) OR-gates. To compensate for the loss of OR array flexibility, product variations were introduced with different numbers of inputs and outputs and various sizes of OR-gates. Field programmable PALs were widely used in digital hardware immediately after their introduction and form the basis for more recent and more sophisticated architectures. All small programmable logic devices (PLDs) such as PLAs and PALs are grouped together and referred to as simple programmable logic devices (SPLDs) and are typically low cost with high pin-to-pin speed performance as illustrated by block diagram 100 in FIG. 1.
FIG. 3 illustrates PAL 300 schematic implementation with an electrically programmable AND array 310 that includes nonvolatile nodes 320 and 325 programmed to an ON state, wherein essentially orthogonal programmable AND array lines are electrically coupled, or fused, together (said electrical coupling indicated by an open circle). Intersections of essentially orthogonal programmable AND array lines without circles are in a nonvolatile OFF state, wherein said lines are electrically isolated. Programmable AND array 310 may be formed using one-time-programmable EPROM devices for example. Programmable AND array 310 may be programmed once in the field. If the logic function needs to be changed, a new PAL chip is programmed in the field.
PAL 300 inputs A and B form column logic inputs A, AC, B, and BC to programmable AND array 310, where AC indicates the complement of logic variable A and BC indicates the complement of logic variable B. In this specification, the complement of a logic variable such as logic variable A may be indicated symbolically by AC or A′. Both symbolical representations for the complement of a logic variable are used interchangeably throughout the specification. Feedback loop 330 provides inputs C and D which form programmable AND array column logic inputs C, CC, D, and DC. Product terms 335-1 and 335-2 form two outputs of programmable AND array 310 and provide inputs to OR logic gate 340. The OR logic gates are not programmable. Product terms 335-3 and 335-4 form another two outputs of programmable AND array 310 and provide inputs to OR logic gate 345. OR-gate 340 provides a sum-of-products (or sum-of-product-terms) output to the input of D-flip flop 350 and OR-gate 345 provides a sum-of-products output to the input of D-flip flop 355. D-flip flop 350 provides output O1 which is connected to input C by feedback loop 330 and D-flip flop 355 provides output O2 which is connected to input D by feedback loop 330.
In operation, inputs A and B to PAL 300 result in logic outputs O1 and O2 based on the ON and OFF states of devices, such as EPROMs for example, located at the intersection of input lines and product term lines in electrically programmable AND array 310. Details of PAL operation are well known in the literature and are available in product specifications.
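By way of illustration only, the behavior of a PAL organized like PAL 300 (a programmable AND array, two fixed OR gates, and two D-flip flops whose outputs are fed back as inputs C and D) can be sketched in Python as follows. The class name Pal and the example programming are hypothetical; the actual ON states of the nonvolatile nodes in FIG. 3 are not reproduced here.

```python
class Pal:
    """Sketch of a PAL: programmable AND array, fixed OR gates, D flip-flops."""

    def __init__(self, and_array):
        # and_array: four product terms, each a list of literals that are ON.
        # Terms 0-1 feed the first fixed OR gate and terms 2-3 feed the second;
        # this OR wiring is fixed and cannot be programmed.
        self.and_array = and_array
        self.o1 = 0  # first flip-flop state, fed back as input C
        self.o2 = 0  # second flip-flop state, fed back as input D

    def clock(self, a, b):
        """Apply inputs A and B, then latch both OR outputs on one clock edge."""
        values = {"A": a, "B": b, "C": self.o1, "D": self.o2}

        def literal(name):
            return 1 - values[name[:-1]] if name.endswith("'") else values[name]

        pt = [int(all(literal(l) for l in term)) for term in self.and_array]
        d1 = pt[0] | pt[1]  # fixed OR gate driving the first flip-flop
        d2 = pt[2] | pt[3]  # fixed OR gate driving the second flip-flop
        self.o1, self.o2 = d1, d2
        return self.o1, self.o2

# Assumed programming: next O1 = A·B, next O2 = C (the previous O1).
# The term ["A", "A'"] is always 0 and stands in for an unused product term.
pal = Pal([["A", "B"], ["A", "A'"], ["C"], ["A", "A'"]])
```

With this assumed programming, O2 follows O1 one clock cycle later, which illustrates how the feedback loop allows simple sequential functions to be realized.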
Complex Programmable Logic Devices (CPLDs)
CPLDs consist of multiple SPLD-like blocks interconnected on a single chip, typically by a programmable global interconnect matrix, resulting in a field programmable logic function that is substantially more powerful than is possible with even large individual SPLD functions. CPLDs represent a category of programmable logic devices (PLDs) as shown in FIG. 1. The difficulty of increasing the capacity of a single SPLD architecture is that the array size of the programmable logic arrays is driven to large dimensions as the number of inputs increases. Therefore, as technologies are scaled to smaller dimensions and the number of transistors available on chips increases, it becomes more efficient to limit the size of SPLDs and to interconnect multiple SPLDs with a programmable global interconnect matrix.
FIG. 4 illustrates a schematic of CPLD 400 architecture formed using four SPLD functions, SPLD 410, SPLD 420, SPLD 430, and SPLD 440. In one implementation, for example, electronically programmable SPLD functions may be formed using electronically programmable PALs similar to PAL 300 illustrated in FIG. 3. While four interconnected electronically programmable SPLD functions are illustrated in FIG. 4, dozens of interconnected SPLDs may be used to form a large flexible in-circuit programmable logic function. All connections between SPLDs, in this example PALs similar to PAL 300 described further above with respect to FIG. 3, are routed (wired) through global interconnect matrix 450.
In operation, all communication between SPLD 410 and all other SPLDs used to form CPLD 400 is routed to global interconnect matrix 450 using wire(s) 410-1 and received from global interconnect matrix 450 using wire(s) 410-2. All communication between SPLD 420 and all other SPLDs used to form CPLD 400 is routed to global interconnect matrix 450 using wire(s) 420-1 and received from global interconnect matrix 450 using wire(s) 420-2. All communication between SPLD 430 and all other SPLDs used to form CPLD 400 is routed to global interconnect matrix 450 using wire(s) 430-1 and received from global interconnect matrix 450 using wire(s) 430-2. And all communication between SPLD 440 and all other SPLDs used to form CPLD 400 is routed to global interconnect matrix 450 using wire(s) 440-1 and received from global interconnect matrix 450 using wire(s) 440-2. Multiple inputs and outputs (I/Os) communicate between CPLD 400 and other circuit functions. Since all connections are routed through similar paths, time delays can be predicted, which simplifies CPLD design. Buffer circuits (not shown) may be used as well.
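By way of illustration only, the routing role of a global interconnect matrix can be sketched in Python as follows. The class name GlobalInterconnectMatrix and its methods are hypothetical, and signals are modeled as single bits. The sketch shows that every connection takes the same kind of path (source SPLD, to the matrix, to the destination SPLD), which is why delays are predictable.

```python
class GlobalInterconnectMatrix:
    """Hypothetical model of a programmable global interconnect matrix."""

    def __init__(self):
        self.routes = {}  # destination SPLD -> list of source SPLDs

    def connect(self, source, destination):
        """Program one connection from a source SPLD to a destination SPLD."""
        self.routes.setdefault(destination, []).append(source)

    def deliver(self, outputs):
        """outputs: SPLD name -> bit driven onto its wire(s) to the matrix.

        Returns destination name -> list of bits received from the matrix.
        """
        return {dst: [outputs[src] for src in srcs]
                for dst, srcs in self.routes.items()}

gim = GlobalInterconnectMatrix()
gim.connect("SPLD 410", "SPLD 430")
gim.connect("SPLD 420", "SPLD 430")
received = gim.deliver({"SPLD 410": 1, "SPLD 420": 0,
                        "SPLD 430": 0, "SPLD 440": 1})
```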
Applications that can exploit wide AND/OR gates and do not require a large number of flip flops are good candidates for mapping into CPLDs. Control functions such as graphics controllers and some communication circuit functions map well into CPLD architectures. In-system re-programmability and reasonably predictable speed performance are significant advantages offered by CPLDs.
Field Programmable Gate Array (FPGA) Logic
FPGAs were invented by Ross Freeman, cofounder of the Xilinx Corporation, in 1984 to overcome the limitations of CPLDs. The primary differences between CPLDs and FPGAs are due to differences in chip architecture. As described further above, CPLD architecture consists primarily of programmable sum-of-products logic arrays with a relatively small number of clocked registers (D-flip flops for example) interconnected by a global interconnect matrix as illustrated further above by CPLD 400 shown in FIG. 4. CPLDs typically have relatively high logic-to-interconnect ratios. The result is less architectural flexibility and smaller logic functions (typically limited to tens to hundreds of thousands of equivalent logic gates) but more predictable timing delays and greater ease of programming.
FPGA architectures are dominated by interconnects. FPGAs are therefore much more flexible in terms of the range of designs that can be implemented, and logic functions in the millions and tens of millions and eventually in the hundreds of millions of equivalent logic gates may be realized. In addition, the added flexibility enables inclusion of higher-level embedded functions such as adders, multipliers, CPUs, and memory. The added interconnect (routing) flexibility of FPGAs also enables partial reconfiguration such that one portion of an FPGA chip may be reprogrammed while other portions are running. FPGAs that can be reprogrammed while running may enable reconfigurable computing (reconfigurable systems) that reconfigure chip architecture to better implement logic tasks. Because of their flexibility, ability to support a large number of equivalent logic gates, and ability to accommodate embedded memory and logic functions, FPGAs are displacing ASICs in many applications, owing to lower non-recurring engineering (NRE) design costs and faster time-to-market. FPGA architecture is shown in FIG. 1 alongside SPLD and CPLD as a stand-alone category of programmable logic device architecture.
FPGA architecture and circuit implementations are described in U.S. Pat. No. Re. 34,363 to Freeman, filed on Jun. 24, 1991, and SRAM memory controlled routing switch circuit implementations are described in U.S. Pat. No. 4,670,749 to Freeman, filed on Apr. 13, 1984, the contents of which are incorporated herein by reference in their entirety. FPGA 500 (as shown in FIG. 5) schematically illustrates basic concepts taught by Freeman in the above referenced patents.
Referring now to FIG. 5, FPGA 500 includes an array of configurable (programmable) logic blocks (CLBs) such as CLB 510 and programmable switch matrices (PSMs) such as PSM 520. Interconnections between CLBs and PSMs may be relatively short to provide local wiring (such as interconnect 530) or relatively long to provide global wiring (not shown). A programmable switch (routing) matrix PSM1 interconnecting four CLB blocks CLB1, CLB2, CLB3, and CLB4 is illustrated in FIG. 5. In this example, switch 540, one of the switches in PSM1, may be used to interconnect CLB1, CLB2, CLB3, and CLB4 in any combination.
CLBs are typically formed by combining look up tables (LUTs) with flip flops and multiplexers as illustrated schematically by CLB 600 in FIG. 6. Alternatively, CLBs may be formed by combining combinatorial logic with flip flops and multiplexers as illustrated by CLB 700 in FIG. 7.
Referring now to FIG. 6, CLB 600 comprises LUT 610 with inputs I1, I2, . . . , IN. LUT 610 may be a random access memory (RAM) such as an SRAM, an EPROM, an EEPROM, or a flash memory. A typical LUT configuration may be a RAM organized in a 4×4×1 configuration with four inputs and one output. In this example, the LUT 610 output drives the input of clocked D-flip flop 620 which in turn drives an input of multiplexer (MUX) 630. The LUT 610 output may also drive an input of MUX 630 directly. MUX 630 drives (provides) CLB 600 output to terminal O.
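By way of illustration only, a CLB of the kind shown in FIG. 6 can be sketched in Python as follows. The class name LutClb, the method names, and the parity configuration are hypothetical. The sketch assumes a four-input LUT, in which the four inputs address sixteen stored one-bit entries, and a single configuration bit that sets the MUX to pass either the registered or the direct LUT output.

```python
class LutClb:
    """Sketch of a CLB: four-input LUT, D flip-flop, and output MUX."""

    def __init__(self, lut_bits, select_registered):
        if len(lut_bits) != 16:  # four inputs address 2**4 one-bit entries
            raise ValueError("a four-input LUT stores 16 bits")
        self.lut_bits = list(lut_bits)
        self.select_registered = select_registered  # MUX configuration bit
        self.q = 0  # D flip-flop state

    def lut(self, i1, i2, i3, i4):
        """Inputs form the address; the stored bit is the LUT output."""
        return self.lut_bits[(i1 << 3) | (i2 << 2) | (i3 << 1) | i4]

    def clock(self, i1, i2, i3, i4):
        """On a clock edge the flip-flop captures the current LUT output."""
        self.q = self.lut(i1, i2, i3, i4)

    def output(self, i1, i2, i3, i4):
        """The MUX selects the registered or the direct LUT output."""
        return self.q if self.select_registered else self.lut(i1, i2, i3, i4)

# Assumed configuration: LUT programmed as a four-input XOR (parity).
PARITY = [bin(addr).count("1") & 1 for addr in range(16)]
combinational = LutClb(PARITY, select_registered=False)
registered = LutClb(PARITY, select_registered=True)
```

Changing the sixteen stored bits changes the logic function without changing any wiring, which is what makes a LUT-based CLB configurable.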
Referring now to FIG. 7, CLB 700 includes configurable combinatorial logic function 710 with inputs I1, I2, . . . , IN. Configurable combinatorial logic function 710 may be formed using cascaded transfer devices or random logic blocks such as NAND and NOR functions for example. Configurable combinatorial logic function 710 formed using NanoLogic™ functions may also be used as described further below in FIGS. 12 and 14. Typical configurable combinatorial logic function 710 may be formed using cascaded transfer devices and configuration control bits or random logic blocks and configuration control bits. In this example, the configurable combinatorial logic function 710 output drives the input of clocked D-flip flop 720 which in turn drives an input of MUX 730. The configurable combinatorial logic function 710 output may also drive an input of MUX 730 directly. MUX 730 drives (provides) CLB 700 output to terminal O.
The routing flexibility of FPGAs enables a wide variety of functions to be realized. FIG. 8 illustrates FPGA 800 and shows an example of static RAM (SRAM) controlled routing of signals between various CLBs enabling an in-circuit programmable logic function. CLB 810 includes an AND gate with inputs I1 and I2 and an output O1 which is provided to PSM 812 which includes FET 815 whose ON or OFF states are controlled by SRAM 820. FET 815 terminal 1 is connected to output O1, gate terminal 2 is connected to SRAM 820, and terminal 3 is connected to wire 825. Wire 825 is in turn connected to PSM 828 which includes FET 830 whose ON and OFF states are controlled by SRAM 820. FET 830 terminal 4 is connected to wire 825, gate terminal 5 is connected to SRAM 820, and terminal 6 is connected to wiring 835. Wiring 835 is also connected to an input of MUX 840 which is controlled by SRAM 820. Output O2 of MUX 840 is connected to wire 850 which is connected to an input of an AND gate in CLB 855 providing an output O3. A global wire 860 is shown which is not part of local wiring.
In operation, output O1 is applied to terminal 1 of FET 815 with the logic state (high or low voltage) of gate terminal 2 controlled by SRAM 820. If FET 815 is OFF, low gate voltage in this example, then O1 does not propagate along wire 825. If however, FET 815 is ON, high gate voltage (typically 2.5 volts) in this example, then O1 propagates through the channel region of FET 815 to terminal 3, and then along wire 825 to terminal 4 of FET 830 which is also controlled by SRAM 820. If FET 830 is in an OFF state, then O1 does not propagate to terminal 6. However, if FET 830 is in an ON state, then O1 propagates along wiring 835 to an input terminal of MUX 840. If MUX 840 is enabled by SRAM 820, then MUX output O2 is applied to an input terminal of the AND gate in CLB 855 by wire 850. The AND gate output O3 is also the output of CLB 855.
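By way of illustration only, the SRAM-controlled signal path of FIG. 8 can be sketched in Python as follows. The function name route_o1_to_o3, the configuration-bit names, and the use of None to represent an undriven wire are assumptions made for this sketch. The second input of the AND gate in CLB 855 is not specified above and is modeled here as a hypothetical input i3.

```python
def route_o1_to_o3(i1, i2, i3, sram):
    """Follow O1 from CLB 810 through two pass FETs and a MUX to CLB 855.

    sram: dict of configuration bits "fet_815", "fet_830", and "mux_840".
    Returns O3, or None if the programmed route does not deliver O1.
    """
    o1 = i1 & i2                                        # AND gate in CLB 810
    wire_825 = o1 if sram["fet_815"] else None          # FET 815 ON passes O1
    wiring_835 = wire_825 if sram["fet_830"] else None  # FET 830 ON passes it on
    o2 = wiring_835 if sram["mux_840"] else None        # MUX 840 enabled
    if o2 is None:
        return None                                     # route is open
    return o2 & i3                                      # AND gate in CLB 855

ALL_ON = {"fet_815": True, "fet_830": True, "mux_840": True}
FET_830_OFF = {"fet_815": True, "fet_830": False, "mux_840": True}
```

Rewriting the three configuration bits reroutes the signal without any physical change to the chip, which is the basis of in-circuit reprogrammability in this type of FPGA.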
The use of SRAMs to control wiring in FPGAs, as illustrated above with respect to FIG. 8 and described in U.S. Pat. No. 4,670,749, has the advantages of compatibility with leading edge CMOS logic processes, reprogrammability, and support for in-circuit programmability. However, the SRAM cell is the largest-area switch element, having 5 to 6 transistors per cell, and requires external loading of bits to define the logic function. Further, in such SRAM based designs the FPGA is nonfunctional until loading is complete, is volatile, and has relatively low radiation tolerance. In addition, the large SRAM cell size also requires a large number of wiring layers and impacts architecture because the size of the switch is a key factor in determining FPGA architecture.
A very small switch such as a cross point antifuse may also be used for wiring. Such a small switch results in a different architecture and can reduce chip size by approximately 10× relative to an SRAM-based FPGA implementation. A cross point antifuse is nonvolatile, has very low capacitance (1 fF per node for example), is radiation hard, and does not require external loading of bits to operate. However, programming such antifuse based FPGA devices (such as is depicted in FIG. 9) requires relatively high voltages such as 5 to 10 volts to ensure breakdown and currents in the 5 to 10 mA range. Further, such devices are one-time-programmable (OTP) and are difficult to program in-circuit.
FIG. 9 illustrates a schematic of FPGA 900 which includes logic cells such as logic cell 910, vertical wiring 920, horizontal wiring 930, and antifuses such as antifuse 940 at each intersection of vertical and horizontal wires. Such antifuses are typically formed using ONO dielectric-based antifuses or metal-to-metal antifuses. While wiring is shown in channel regions between logic cells, wiring over logic cells (not shown) may be used to further increase density. I/O circuits such as I/O 950 interface circuits internal to FPGA 900 with output connections on the chip. FPGA 900 with dense wiring is somewhat similar to ASIC-type layouts although antifuse ON resistance may be in the range of 25 ohms to several hundred ohms depending on the antifuses used. Also, high voltage circuits (not shown) are included to switch selected cross point antifuse switches from an OFF to an ON state.
In operation, high voltages typically in the 5-10 volt range with high currents in the milliampere range are used to program (change) the cross point antifuses from an OFF-to-ON state. Then the logic function can be tested. A new chip is required for each logic function and OTP in-circuit programming is difficult. A socket approach can facilitate programming of FPGA 900.
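By way of illustration only, the one-time-programmable nature of cross point antifuse wiring can be sketched in Python as follows. The class name AntifuseArray and its methods are hypothetical, and the programming voltages and currents are not modeled. The sketch shows only that a cross point can be switched from OFF to ON and that no operation returns it to OFF.

```python
class AntifuseArray:
    """Hypothetical model of cross point antifuses between wire pairs."""

    def __init__(self):
        self._on = set()  # programmed (vertical, horizontal) intersections

    def program(self, vertical, horizontal):
        """Switch one cross point from OFF to ON; this cannot be undone."""
        self._on.add((vertical, horizontal))

    def is_connected(self, vertical, horizontal):
        """Report whether the two wires are coupled at their intersection."""
        return (vertical, horizontal) in self._on

    def reprogram_off(self, vertical, horizontal):
        """An ON antifuse cannot be returned to OFF; a new chip is needed."""
        raise RuntimeError("antifuse is one-time-programmable")

array = AntifuseArray()
array.program(vertical=2, horizontal=5)
```

Because the programmed state is held by the antifuse itself, no configuration bits need to be loaded at power-up, in contrast to an SRAM-controlled switch.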