Programmable logic devices (PLDs) typically include a plurality of logic elements and associated interconnect resources that are programmed by a user to implement user-defined logic operations (that is, a user's circuit). PLDs are programmed using a personal computer or workstation, appropriate software and a device programmer. Therefore, unlike application specific integrated circuits (ASICs) that require a protracted layout process and an expensive fabrication process to implement a user's logic operation, a PLD may be utilized to implement the logic operation in a relatively quick and inexpensive manner.
FIG. 1(A) shows a portion of a field programmable gate array (FPGA) 100, which is one type of PLD. Although greatly simplified, FPGA 100 is generally consistent with XC3000™ series FPGAs, which are produced by Xilinx, Inc. of San Jose, Calif. FPGA 100 includes an array of configurable logic blocks CLB-1 through CLB-8, and programmable interconnect resources which include interconnect lines 120 extending between the rows and columns of CLBs. Each CLB includes configurable combinational circuitry and optional output registers, and conductive wires 110 which extend from the CLB for selective connection to the interconnect lines 120. All of the CLBs of an FPGA are typically identical. Each interconnect line 120 includes a series of wiring segments 121 that span the length of one CLB and are programmably coupled at their respective ends via programmable multi-way segment-to-segment switches 122. As shown in FIG. 1(B), each multi-way segment-to-segment switch 122 selectively connects a wiring segment 121(1) to any of three adjacent segments 121(2), 121(3) and 121(4) via pass transistors that are controlled by memory cells (not shown). In addition, each horizontal wiring segment 121 is connectable to the conductive wires 110 of associated CLBs via segment-to-CLB input switches 123. As shown in FIG. 1(C), each segment-to-CLB input switch 123 includes a multiplexer (MUX) 124 having inputs connected to wiring segments 121(5) through 121(7) through buffers, and an output that is connected to CLB input wire 110. A memory device (not shown) transmits control signals on select lines 125 such that the MUX 124 passes a signal from one of the wiring segments 121(5) through 121(7) to the associated CLB. Finally, signals are output from each CLB on lines 115 to a vertical wiring segment 126 via CLB-to-segment output switches 127 (these switches may also be used to output signals to horizontal wiring segments 121). As shown in FIG. 1(D), each output switch 127 includes a series of pass transistors controlled by memory devices (not shown) that direct an output signal onto one of the vertical segments 126(1), 126(2) or 126(3).
The PLD programming processes and fault tolerance methods discussed below are specifically directed to FPGAs incorporating the general structure described above.
FIG. 2(A) shows a block diagram generally illustrating a system for programming a PROM 150 which subsequently configures an FPGA 100 (see FIG. 2(B)) to operate in accordance with a user's logic operation. The system includes a computer 200 and a programmer 210. The computer 200 has a memory 201 for storing software tools referred to herein as place-and-route software 202 and device programming software 203. The memory 201 also stores a computer-readable FPGA description 204, which identifies all of the interconnect resources and CLBs of the target FPGA 100.
The FPGA programming process typically begins when a user 230 enters a logic operation 205 into memory 201 via an input device 220 (for example, a keyboard, mouse or a floppy disk drive). The user then instructs the place-and-route software 202 to generate configuration data 206 (place-and-route solution) which, when entered into the target FPGA 100, programs the FPGA 100 to implement the logic operation. The place-and-route software begins this process by accessing the FPGA description 204 and the logic operation 205. The place-and-route software 202 then divides the logic operation 205 into inter-related logic portions that can be implemented in the individual CLBs of the target FPGA 100 based on the description provided in the FPGA description 204. The place-and-route software 202 then assigns the logic portions to specific CLBs of the FPGA description 204. Routing data is then generated by identifying specific interconnect resources of the FPGA description 204 that can be linked to form the necessary signal paths between the inter-related logic portions assigned to the CLBs in a manner consistent with the logic operation 205. The placement and routing data is then combined to form configuration data 206. If the place-and-route software 202 fails to generate, for example, a routing solution for the target FPGA 100, then the user may repeat the place-and-route operation using a different FPGA (e.g., one having more interconnect resources). Place-and-route operations and associated software are well known by those in the FPGA field.
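The place-and-route flow described above can be illustrated with a deliberately simplified sketch. It is not the algorithm of any actual place-and-route tool: the function names, the dictionary-based device description, the first-fit placement, and the one-segment-per-net routing are all assumptions made for illustration only.

```python
def place(logic_portions, clbs):
    """Assign each logic portion to the next free CLB (first-fit, hypothetical)."""
    if len(logic_portions) > len(clbs):
        raise ValueError("logic operation does not fit the target FPGA")
    return dict(zip(logic_portions, clbs))


def route(nets, placement, free_segments):
    """Give each net (source portion, sink portion) one free segment.

    A real router links several segments through switches; one segment
    per net is an assumption that keeps the sketch short.
    """
    free = list(free_segments)
    routing = {}
    for src, dst in nets:
        if not free:
            raise ValueError("no routing solution for the target FPGA")
        routing[(placement[src], placement[dst])] = free.pop(0)
    return routing


def place_and_route(logic_portions, nets, description):
    """Combine placement and routing data into configuration data."""
    placement = place(logic_portions, description["clbs"])
    routing = route(nets, placement, description["segments"])
    return {"placement": placement, "routing": routing}


# Hypothetical device description and logic operation.
description = {"clbs": ["CLB-1", "CLB-2", "CLB-3"],
               "segments": ["121(1)", "121(2)"]}
config = place_and_route(["A", "B"], [("A", "B")], description)
```

When either step raises an error in this sketch, the situation corresponds to the case in which the user repeats the operation with a different FPGA.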
After the place-and-route software 202 generates the configuration data 206, the configuration data is loaded into a PROM 150 via the programming software 203 and the programmer 210. After programming, the PROM 150 is connected to the target FPGA 100, as shown in FIG. 2(B). When power is subsequently applied to the target FPGA 100, the configuration logic of the FPGA 100 reads the configuration data from PROM 150, thereby configuring the FPGA 100 to implement the logic operation.
Fault tolerance in FPGAs is desirable for FPGA manufacturers because it typically increases the yield of usable FPGAs.
One approach to providing fault tolerance in FPGAs is to relocate the logic portion assigned to a defective CLB into a spare or other unused CLB, and to reroute signals from the defective CLB to the spare/unused CLB. A problem with this approach is that a user typically does not know which CLB of a particular FPGA will be defective, and usually generates configuration data intended for a defect-free device. If the user then attempts to program an FPGA 100 containing a defective CLB, the programmed FPGA will not perform as intended. Therefore, the user must either discard the FPGA or repeat the place-and-route process with a modified device description indicating the location of the defective CLB. In other words, the user must generate new configuration data that does not assign a logic portion to the defective CLB. Because each target FPGA may have a defective CLB in a different location, this approach potentially requires different configuration data for every device implementing the user's logic operation. This approach puts a heavy burden on the user, who must potentially repeat the place-and-route operation for each FPGA and supply many PROM configurations.
A second approach to providing fault tolerance in FPGAs is disclosed in U.S. Pat. No. 5,434,514 (Cliff et al.). According to this approach, a spare group (row or column) of CLBs and associated interconnect resources is provided in an FPGA for use if the FPGA contains a defective CLB. This approach makes fault tolerance transparent to the user by also providing extra wiring and factory-configured switches on the interconnect lines to direct signals around the row containing the faulty CLB while maintaining the same logical routing configuration. To reconfigure around a row containing a faulty logic element, fuses are burned at the factory that shift connections from the faulty row to an adjacent non-faulty row, whose connections are also shifted, and so on until the spare row is utilized. A problem with this approach is that, for the faulty row to be transparent (i.e., capable of utilizing the user's "fault-free device" configuration data), it is necessary to maintain the original connectivity between the rows on either side of the faulty row. This connectivity requires extra wiring resources that greatly increase the area overhead of the FPGA, thereby making this solution undesirable.
A third approach to implementing redundant circuitry in an FPGA is taught in "Node-Covering Based Defect and Fault Tolerance Methods for Increased Yield in FPGAs", by Fran Hanchek and Shantanu Dutt, Proceedings of the Ninth International Conference on VLSI Design, January 1996. This approach utilizes the principle of "node-covering" that, unlike the first and second approaches, is implemented in the place-and-route software and the device programming software. That is, during the place-and-route process each primary CLB (node) in the FPGA is assigned a cover (i.e., replacement) CLB that can be reconfigured to replace the primary CLB if the primary CLB is defective in a particular target FPGA. The cover CLB of a particular primary CLB may be an unused CLB or another primary CLB, thereby forming a chain of primary/cover CLBs. A spare row (or column) of CLBs is provided for covering the last primary CLB in the row (or column). The place-and-route software 202 reserves the segments 121 required to complete routes to the cover CLBs, thereby avoiding the need to connect all CLBs identically. If a defective primary CLB is identified in a target FPGA, the target FPGA can be reconfigured by the device programming software using a simple process such that the defective CLB is replaced by its cover CLB, which in turn is replaced by its own cover, and so on until a spare CLB in the chain is reached. The reserved segments are then used to complete the signal paths to the cover CLBs.
For a given FPGA, placement and routing solutions obtained by the node-covering method are able to tolerate one faulty CLB in each row. In order for a cover CLB to replace a primary CLB, two conditions must be met. First, the cover CLB must be able to duplicate the functionality of the primary CLB. This is easy in a typical FPGA, because all of the CLBs are identical. Configuration data designated for the primary CLB is simply transposed to the cover CLB. Second, the cover CLB must be able to duplicate the connectivity of the primary CLB with respect to the rest of the CLBs. This requires that the associated routing solution reserve certain interconnect line segments for maintaining the signal paths when configuration data is shifted from a defective primary CLB to its cover CLB along the row. This concept is better understood with reference to FIG. 1(A), in which CLB-4 and CLB-8 are designated as spare CLBs in their respective rows. The segment-to-CLB connections are associated with the CLB configurations. Therefore, when primary CLB configuration data is transposed to a cover CLB, all of the segment-to-CLB connection data is transposed as well.
FIGS. 3(A) and 3(B) illustrate the fault tolerance method utilizing this node-covering principle.
FIG. 3(A) is a graphic representation showing a primary configuration in which CLB configuration data portions A through H are respectively stored in primary CLBs CLB-1 through CLB-4, and CLB-6 through CLB-9. No configuration data is stored in CLB-5 and CLB-10, which are reserved as "spare" CLBs. Each CLB is "covered" by an immediately adjacent CLB. Specifically, CLB-1 through CLB-4 are respectively covered by CLB-2 through CLB-5, and CLB-6 through CLB-9 are respectively covered by CLB-7 through CLB-10.
FIG. 3(B) shows an alternative configuration in which CLB-1 and/or CLB-6 is defective. CLB configuration data portions A through H are shifted to the "cover" CLBs (CLB-2 through CLB-5 and CLB-7 through CLB-10) to avoid two defective primary CLBs (CLB-1 and CLB-6). To meet the connectivity requirement, each signal path programmably coupled to a primary CLB through a primary segment must also include a corresponding cover segment for transmitting signals to the cover CLB. Cover segments are incorporated into the routing solution in one of two ways. First, cover segments may be adjacent primary segments associated with adjacent primary CLBs. For example, as shown in FIG. 3(A), primary segment 121(1) programmably coupled to CLB-1 is covered by primary segment 121(2) programmably coupled to CLB-2. Alternatively, additional reserved segments are incorporated into the routing solution. For example, the primary segment 121(2) programmably coupled to CLB-2 is covered by reserved segment 121(3) (indicated by a broken line) that is programmably coupled to CLB-3. Therefore, with fault tolerant CLB groups defined along rows, covering a segment-to-CLB connection to a horizontal interconnect segment requires one reserved segment.
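The shift of configuration data from FIG. 3(A) to FIG. 3(B) can be illustrated with a minimal sketch. It assumes, for illustration only, that one row of CLBs is modeled as a list of configuration data portions whose last entry is the unused spare (None). The function name shift_to_covers is hypothetical and does not appear in Hanchek and Dutt, and the sketch covers only the placement half of the method, not the reserved segments.

```python
def shift_to_covers(row, defective):
    """Move configuration data off a defective CLB and onto the cover CLBs.

    row       -- configuration data per CLB position; the last entry must
                 be None (the spare CLB of the row)
    defective -- index of the defective CLB within the row
    Every portion at or to the right of the defective CLB moves one
    position toward the spare; portions to the left stay in place.
    """
    if row[-1] is not None:
        raise ValueError("last CLB in the row must be an unused spare")
    if not 0 <= defective < len(row):
        raise IndexError("defective CLB index is outside the row")
    return row[:defective] + [None] + row[defective:-1]


# Top row of FIG. 3(A): portions A-D in CLB-1..CLB-4, CLB-5 is the spare.
top_row = ["A", "B", "C", "D", None]

# CLB-1 (index 0) defective, as in FIG. 3(B): A-D land in CLB-2..CLB-5.
shifted = shift_to_covers(top_row, 0)
```

Because only one spare is available per row in this model, a second defect in the same row cannot be covered, which matches the one-faulty-CLB-per-row limit noted above.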
A problem associated with the node-covering method is that the reserved segments constitute unusable portions of the FPGA in the primary routing solution, thereby reducing the effective size of the FPGA. Hanchek and Dutt (supra) estimate this overhead to be approximately 30%. In effect, a routing solution in accordance with the node-covering method that "fully" utilizes the interconnect resources of an FPGA actually uses only 70% of the available interconnect segments. As a result, some logic operations that would otherwise route on a particular FPGA using conventional routing methods may not route using the node-covering method. When this occurs, a user must either utilize a "normal" place-and-route process (which then precludes fault tolerance), or purchase a larger (and typically more expensive) FPGA to implement the user-defined logic operation.
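The effect of this overhead can be shown with simple arithmetic. The sketch below uses the approximately 30% figure attributed to Hanchek and Dutt above; the segment counts (1000 total, 800 required) and the function names are hypothetical values chosen only to illustrate the point.

```python
def usable_segments(total_segments, reserved_percent=30):
    """Segments left for primary routes once cover segments are reserved."""
    return total_segments * (100 - reserved_percent) // 100


def can_route(required_segments, total_segments, reserved_percent=0):
    """True if the primary routing solution fits in the usable segments."""
    return required_segments <= usable_segments(total_segments, reserved_percent)


# Hypothetical device with 1000 segments and a design that needs 800.
conventional = can_route(800, 1000)                         # no reservation
fault_tolerant = can_route(800, 1000, reserved_percent=30)  # 30% reserved
```

In this example the design fits under conventional routing but not once 30% of the segments are reserved, which is the situation that forces the user toward a larger device.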
A second problem with the node-covering method is the signal delay introduced when a reserved segment is utilized, because signals must then pass through additional segment-to-segment switches. For example, configuration data B is placed in CLB-2 in the primary placement solution (FIG. 3(A)), and is relocated to CLB-3 in the alternative solution (FIG. 3(B)) if CLB-1 is defective. Referring to FIG. 3(A), a signal passed along the primary signal path from segment 121(1) to segment 121(2) to CLB-2 encounters a switching delay caused by segment-to-segment switch 122(1). However, in the alternative solution, the same signal directed to CLB-3 must pass through a second segment-to-segment switch 122(2) before reaching segment 121(3), thereby changing the timing specification of the signal path. This delay may cause problems in an actual FPGA if the timing specifications for a particular logic function are particularly tight. Further, in practice, an FPGA vendor can only guarantee the slower timing of the alternative solution, thereby making the FPGA appear slower to customers.
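The timing penalty can be sketched by counting the segment-to-segment switches on each path. The model below is an assumption made for illustration: it treats every switch as contributing the same delay, expressed in an arbitrary unit, and ignores wire and buffer delays. The function names are hypothetical.

```python
def path_delay(switches_on_path, switch_delay):
    """Delay of a path that crosses the given number of identical switches."""
    return switches_on_path * switch_delay


def guaranteed_delay(primary_switches, alternative_switches, switch_delay):
    """Worst case over the primary and alternative paths.

    A vendor that cannot know in advance which path a given device will
    use can only guarantee the slower of the two.
    """
    return max(path_delay(primary_switches, switch_delay),
               path_delay(alternative_switches, switch_delay))


# Primary path to CLB-2 crosses switch 122(1); the alternative path to
# CLB-3 crosses switches 122(1) and 122(2). One delay unit per switch.
primary = path_delay(1, 1)
alternative = path_delay(2, 1)
worst_case = guaranteed_delay(1, 2, 1)
```

Under these assumptions the guaranteed figure equals the delay of the alternative path, even for devices that contain no defect and use only the primary path.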
Therefore, a need arises for a fault tolerance method that requires little overhead, and provides good guaranteed performance.