The present invention relates generally to the field of integrated circuit devices and, more particularly, to on-line fault tolerant operation of a field programmable gate array.
A field programmable gate array (FPGA) is a type of integrated circuit consisting of an array of programmable logic blocks interconnected by a programmable routing network and programmable input/output cells. Programming of the logic blocks, the routing network and the input/output cells is selectively completed to make the necessary interconnections that establish one configuration thereof to provide the desired system operation/function for a particular application.
The present inventors have recently developed methods of built-in self-testing the array of programmable logic blocks and the programmable routing network in FPGAs at the device, board and system levels. These methods are set out in detail in U.S. Pat. Nos. 5,991,907 and 6,003,150 and pending U.S. application Ser. Nos. 09/059,552 and 09/109,123. The full disclosures of these patents and patent applications are incorporated herein by reference.
In addition to these off-line testing methods, the present inventors have also recently developed methods of on-line testing and fault tolerant operation of the programmable logic blocks and the programmable interconnect network of FPGAs. These methods are set out in detail in pending U.S. application Ser. Nos. 09/261,776, 09/405,958 and 09/406,219 and U.S. Provisional Application No. 60/156,189. The full disclosures of these patent applications are also incorporated herein by reference.
Fault tolerant operation of FPGAs is most important in high-reliability and high-availability applications, such as long-life space missions, telecommunication network routers, or remote equipment, in which adaptive computing systems often rely on reconfigurable hardware to adapt system operation to environmental changes. In such applications, the FPGA hardware must work continuously and simply cannot be taken off-line for testing, maintenance, or repair.
When a fault is detected in the FPGA hardware of these systems, the fault must be quickly isolated and the FPGA resources reconfigured to continue operation in a diminished capacity or to avoid the faulty resources. Therefore, testing and reconfiguring the FPGA resources, if necessary, must be performed concurrently with normal system operation in a dynamic fault tolerant manner.
In accordance with the present invention, the fault tolerant method of operating a field programmable gate array utilizing incremental reconfiguration is carried out during normal on-line operation. The FPGA resources are configured into a working area and an initial self-testing area. The working area maintains normal operation of the FPGA throughout testing. Within the initial and subsequent self-testing areas, however, the programmable logic blocks are each tested. It is initially presumed that all of the resources of the FPGA are fault-free as determined through manufacturing testing.
Within the self-testing areas, each programmable logic block, including those allocated as spares, is tested, preferably exhaustively. Fault status data for each programmable logic block under test is generated and stored. In accordance with an important aspect of the present inventive method, the fault status data for each programmable logic block is used either in reconfiguring the utilization of partially faulty programmable logic blocks so that they continue performing non-faulty system functions, or in reconfiguring the FPGA resources to avoid faulty programmable logic blocks altogether. By reconfiguring the utilization of partially faulty programmable logic blocks, these blocks are allowed to continue to operate in a diminished, although acceptable, capacity for specific operating modes.
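By way of illustration only, the use of stored fault status data to decide whether a partially faulty programmable logic block may keep performing a given function can be sketched as follows. The `FaultStatus` record, the `can_host` helper and the mode names (`"lut"`, `"flip_flop"`, `"ram"`) are hypothetical and are not taken from the disclosure; the sketch simply assumes that fault status is recorded per operating mode.

```python
from dataclasses import dataclass, field


@dataclass
class FaultStatus:
    """Hypothetical fault record for one programmable logic block (PLB)."""
    # Operating modes that failed during testing (assumed representation).
    faulty_modes: set = field(default_factory=set)

    def is_fault_free(self):
        return not self.faulty_modes


def can_host(status, required_modes):
    """A PLB may host a function that uses none of its faulty modes."""
    return status.faulty_modes.isdisjoint(required_modes)


# A block whose RAM mode failed can still serve as plain logic.
status = FaultStatus(faulty_modes={"ram"})
print(can_host(status, {"lut", "flip_flop"}))  # True
print(can_host(status, {"ram"}))               # False
```

Under this assumption, a block is avoided altogether only when no function placed in the working area can be served by its remaining fault-free modes.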
Upon completion of testing of each of the programmable logic blocks located within the initial self-testing area, the FPGA is reconfigured so that a portion of the working area becomes a subsequent self-testing area, and the initial self-testing area becomes a portion of the working area. In other words, the self-testing area roves around the FPGA repeating the steps of testing the programmable logic blocks and reconfiguring the FPGA until each portion of the working area, or the entire FPGA, is reconfigured as a subsequent self-testing area, tested, and the utilization of the respective programmable logic blocks reconfigured, if need be. Advantageously, normal operation of the FPGA continues within the working area throughout testing and is uninterrupted by the testing conducted within the self-testing areas.
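The roving of the self-testing area described above can be sketched, under simplifying assumptions, as a loop over the FPGA. The sketch below assumes the FPGA is divided into columns and that the self-testing area occupies exactly one column at a time; the `rove` function and the `test_column` callback are hypothetical names introduced for illustration and do not represent the actual test procedure.

```python
def rove(num_columns, test_column):
    """Move a one-column self-testing area across the FPGA once.

    test_column(col) is a hypothetical callback that returns the list of
    faulty PLBs found in that column. Returns {column: faulty PLBs}.
    """
    reports = {}
    sta = 0  # column currently configured as the self-testing area
    for _ in range(num_columns):
        # Every other column stays in the working area and keeps running.
        working_area = [c for c in range(num_columns) if c != sta]
        assert sta not in working_area
        reports[sta] = test_column(sta)
        # Relocate: the tested column rejoins the working area and the
        # next column becomes the subsequent self-testing area.
        sta = (sta + 1) % num_columns
    return reports


faults = rove(4, lambda col: ["plb_2"] if col == 1 else [])
print(faults)  # {0: [], 1: ['plb_2'], 2: [], 3: []}
```

In this model, the working area always consists of all columns except the one under test, which reflects the statement that normal operation continues uninterrupted throughout testing.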
Prior to relocating the initial and subsequent self-testing areas, the FPGA is incrementally reconfigured to advantageously place spare programmable logic blocks. In accordance with the present inventive method, a portion of the programmable logic blocks in the working area is initially allocated as spares, with fault tolerant replacement configurations predetermined or precompiled for each associated programmable logic block before the mission starts. Preferably, the initially allocated spares and their associated precompiled replacement configurations are utilized first in the incremental reconfiguration process, if possible.
If the number or location of unusable faulty programmable logic blocks is such that the initially allocated spares cannot replace the faulty programmable logic blocks, then a subsequent portion of the programmable logic blocks in the working area is allocated as spares and new fault tolerant replacement configurations are computed. Although the initial precomputed replacement configurations are preferred over the new replacement configurations due to time considerations, the additional time required to reallocate subsequent spares and compute new replacement configurations does not significantly interfere with the continuing operation of the FPGA. As noted above, normal operation of the FPGA continues within the working area throughout testing.
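The preference for precompiled replacement configurations, with newly computed configurations used only as a fallback, can be sketched as follows. The sketch assumes a simplified model in which each programmable logic block has at most one precompiled spare; the `plan_replacements` function, its arguments and the block names are hypothetical and serve only to illustrate the order of preference.

```python
def plan_replacements(faulty_plbs, precompiled, extra_spares):
    """Assign a spare to each unusable faulty PLB.

    precompiled  : {plb: spare} fixed before the mission starts (assumed).
    extra_spares : working-area PLBs that may be allocated as spares later.
    Returns {plb: (spare, "precompiled" | "computed")}.
    """
    used = set()
    plan = {}
    for plb in faulty_plbs:
        spare = precompiled.get(plb)
        if spare is not None and spare not in used and spare not in faulty_plbs:
            # Preferred path: the precompiled configuration is ready to load.
            plan[plb] = (spare, "precompiled")
        else:
            # Fallback: allocate a subsequent spare; a new replacement
            # configuration would have to be computed for it.
            spare = next(
                (s for s in extra_spares
                 if s not in used and s not in faulty_plbs),
                None,
            )
            if spare is None:
                raise RuntimeError("no usable spare remains for " + plb)
            plan[plb] = (spare, "computed")
        used.add(spare)
    return plan


# A and B share the precompiled spare S1, so B needs a newly allocated spare.
plan = plan_replacements(
    faulty_plbs=["A", "B"],
    precompiled={"A": "S1", "B": "S1", "C": "S2"},
    extra_spares=["S3"],
)
print(plan)  # {'A': ('S1', 'precompiled'), 'B': ('S3', 'computed')}
```

The example shows how the location of the faults, and not only their number, can exhaust the initially allocated spares: two faulty blocks that share one precompiled spare force the second replacement onto the slower computed path.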
As the self-testing area roves around the FPGA repeating the steps of testing and incrementally reconfiguring the utilization of programmable logic blocks, the number of available spare programmable logic blocks will inevitably diminish over time as unusable faulty programmable logic blocks are replaced. In accordance with another important aspect of the present invention, additional spare programmable logic blocks are at some point necessarily taken from the self-testing area. In this manner, the testing and roving capabilities of the self-testing area are also inevitably diminished. Eventually, roving, testing, and even operation of the FPGA will cease. Advantageously, however, the present inventive method of on-line fault tolerant operation utilizing incremental reconfiguration delays, for as long as possible, both the need for real-time replacement configurations and the subsequent use of programmable logic blocks from the self-testing area as spares, while allowing normal operation of the FPGA to continue within the working area.