This invention relates generally to programmable logic devices, and in particular to field programmable gate arrays (FPGAs), either stand-alone FPGAs or “embedded” FPGAs (which include a non-configurable “hard-core” section and a dynamically reconfigurable “soft-core” section co-located on the same chip), in which the configurable logic blocks and the programmable routing structures are reconfigured dynamically and in which virtual redundancy is created for the purpose of fault tolerance of FPGAs.
In particular, the present invention relates to dynamically reconfigurable FPGAs where redundant circuits are created from structural elements of the FPGA that are unoccupied during a particular time period. In this fashion, a plurality of functionally identical duplicates of a primary circuit are created in a time multiplexing manner from structural elements of the FPGA that are unoccupied either by the primary circuit or by any other duplicate circuit. For each application of the FPGA, the configured primary circuit and all duplicate circuits are interrogated by a voting circuit for detecting the presence of a fault, as well as for excluding a fault containing circuit (primary or any duplicate circuit) from operation.
Further, the present invention relates to an FPGA in which, for fault tolerance thereof, no additional redundancy circuits need be added to the core structure of the FPGA. Instead, the redundancy is created by dynamic reconfiguration of the structural elements of the FPGA in a time multiplexing manner. In this manner, identical circuits are formed, each in a respective time period, from unoccupied structural elements of the FPGA for further interrogation by a voting circuit for determining which of them is faulted and for excluding fault containing circuits from operation of the FPGA.
Also, the present invention relates to fault detection/diagnosis (isolation) in FPGAs by means of time multiplexing of a primary circuit and functional duplicates thereof configured from structural elements of the FPGA in respective displaced time slots.
Programmable logic devices include an array of configurable units, and by the nature of configuration of the configurable units for specific operation, may be divided into two groups, such as non-volatile and volatile programmable logic devices. The non-volatile programmable logic devices, in order to achieve a specific configuration of configurable units, require well-known “burn-in” techniques, and for this purpose they employ fusible links. Typically, these devices may only be configured once and do not allow for reconfiguration in different applications.
The volatile programmable logic devices, such as field programmable gate arrays (referred to herein as FPGAs), have found wide applicability due to the flexibility provided by their reprogrammable nature. Typically, as shown in FIG. 1, FPGA 10 has two internal chip layers, such as a foundation layer 12 where a static random access memory (SRAM) is created, and an upper layer 14 in which required circuitry is formed to perform desired logic functions. The upper layer 14 includes an array of configurable logic units that are programmably interconnected with one another, and each configurable logic unit may also be individually reprogrammed to perform a number of different logic functions. The upper layer 14 also includes a configurable routing structure for interconnecting the configurable logic units in accordance with the intended circuit application. The foundation layer 12 includes a plurality of configuration memory cells which are accessible through the application configuration port 16, through which an address of a memory cell to be accessed is input and through which data exchange with the addressed memory cell is carried out. Each bit of the static random access memory (SRAM) of the foundation layer 12 includes a flip-flop for recording a logical “1” or logical “0”, which may be set or reset an unlimited number of times. On “power down” of the FPGA 10, the state of the flip-flop is typically lost, thereby making the FPGA volatile.
The configuration memory cells of the SRAM are coupled to the upper layer 14 through the configuration data channel 18 to specify the function to be performed by each configurable logic unit as well as to specify the configurable routing structure between the logic units. Once a specific configurable circuit is formed in the upper layer 14 it is fed with user input data 20 to obtain user output data 22 at the output of the FPGA 10.
As shown in FIG. 2, the SRAM in the foundation layer 12 is divided into three basic portions: (1) the logic unit configuration portion 24, which is devoted to programming the configurable logic units in the upper layer 14; (2) the VIA configuration (interconnection) portion 26, which determines the routing structure of the FPGA; and (3) the user flip-flops, latches, and memory portion 28, which includes data storage memory cells accessible by a user during operation of the FPGA. The VIA interconnections of the FPGA are controlled by a large number of multiplexers in the upper layer 14.
Although the FPGA 10 shown in FIGS. 1 and 2 has found wide applicability, it has drawbacks, which include a relatively slow speed of reconfiguration (on the order of milliseconds). This speed has been found unsatisfactory for carrying out dynamic reconfiguration techniques, also known as “configuration on the fly”, which allow the promising concept of configuring an FPGA in stages in order to propagate a specific calculation. For example, in the prior art, if a Fast Fourier Transform is to be computed, and the entire Fast Fourier Transform network cannot be fit into an FPGA chip, the Fast Fourier Transform is partitioned into stages. Then, each stage of the Fast Fourier Transform is configured into the FPGA in a time sequence, and the results from stage to stage are stored in common memory or a buffer. The results of a previous stage serve as input to the next stage, and this process is repeated for all of the required stages of the Fast Fourier Transform. It will be clear to those skilled in the art that for such a dynamic configuration of the FPGA, a configuration time in the microsecond range would be highly advantageous.
Such dynamically reconfigurable FPGAs, capable of being reconfigured within microseconds or less, have been developed by Xilinx, Inc. and Altera Corporation. For example, U.S. Pat. Nos. 5,978,260 and 6,105,105 describe FPGAs in which the complete configuration of the FPGA may be accomplished in less than one microsecond. These advanced FPGAs support dynamic configuration and time multiplexing by employing memory slices, as shown in FIG. 3. Each memory slice 30, 32 and 34 contains a complete configuration of the FPGA 10 for a specific function to be performed. By rapidly switching between different memory slices 30, 32 and 34 through supply of “Select Slice” data 36, the array of the configurable logic units in the upper layer 14 may be reconfigured from one application to another in a time multiplexing fashion. In the dynamically reconfigurable FPGAs shown in FIG. 3, the “logic unit configuration” portion 38, “VIA configuration” portion 40 and “user flip-flops, latches and memory” portion 42 carry data in several channels, each corresponding to an active memory slice 30, 32 or 34.
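The slice-selection mechanism described above can be illustrated with a minimal software sketch. It is not the circuitry of the cited patents: the class name `SlicedFPGA`, its methods, and the modeling of each complete configuration as a Python callable are all assumptions made purely for illustration.

```python
class SlicedFPGA:
    """Toy model (hypothetical): each memory slice holds one complete configuration."""

    def __init__(self, slices):
        # Each entry stands in for a full configuration of the array,
        # modeled here as a function from user input to user output.
        self.slices = list(slices)
        self.active = 0

    def select_slice(self, index):
        # Models the supply of "Select Slice" data: only the active index changes,
        # so no configuration data has to be reloaded.
        if not 0 <= index < len(self.slices):
            raise IndexError("no such memory slice")
        self.active = index

    def run(self, user_input):
        # User input is processed by whichever configuration is currently active.
        return self.slices[self.active](user_input)


# Two assumed "applications" sharing one array in time multiplexing fashion.
fpga = SlicedFPGA([lambda x: x + 1, lambda x: x * 2])
first = fpga.run(3)      # slice 0 is active
fpga.select_slice(1)
second = fpga.run(3)     # slice 1 is active
```

In this model, switching applications costs only an index update, which mirrors why slice-based devices can change function far faster than devices that must rewrite their whole configuration memory.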
Among their many applications, FPGAs have found use in aerospace applications where they are subject to radiation and cosmic particles. All semiconductor chips, including FPGAs, aboard a craft are vulnerable since the adverse space environment potentially causes intermittent faults, a.k.a. “Single Event Upsets”, and permanent faults. Radiation and cosmic particles from space tend to inject electronic charge into the FPGA circuitry which may change the state of bistable elements or may cause an unwanted impulse on a gate. These faults are considered Single Event Upsets and represent the majority of radiation faults. These faults may upset the circuitry but may not permanently damage hardware. On occasion, the charge is large enough to cause microheating. In this case, a permanent short or open circuit may occur.
Since the semiconductor chips aboard spacecraft are vulnerable, chips may be radiation hardened through special processing techniques, such as silicon-on-insulator. However, this technique does not completely eliminate the problem. For this reason, redundancy is built into mission critical electronics systems as a means of further enhancing the tolerance thereof.
Fault free performance of programmable logic devices is of the essence not only in aerospace, but also in terrestrial applications. For example, there exist faults common to commercial FPGAs which include metal migration faults, manufacturing faults, and faults from electrostatic discharge.
Metal migration tends to cause short circuits and sometimes open circuits. It is a function of the temperature and the type of metals used in a chip, and may occur as the result of exposure of the chip to excessive temperature over an extended period of time. Migration also occurs as the chip ages.
Manufacturing faults have a variety of causes, including poor grade of materials, contamination from dust particles, and poor handling. Although, for the most part, these faults are detected at a manufacturing facility, some of them escape screening tests.
Electrostatic discharge occurs when chips are mishandled by either people or machines. If this occurs, the static discharge into chip leads causes micro-heating which may damage internal conducting traces.
In general, a primary circuit and at least one duplicate copy are needed for fault detection. Disagreement found in performance of the primary and duplicate circuits may manifest the presence of a fault in one of the circuits. However, with only two circuits it is generally impossible to judge which one has faulted. For fault diagnosis and fault isolation, therefore, the primary and at least two duplicate circuits are needed. In this scenario, the performance of all three circuits is compared, and if one of them differs, it is rejected as the faulted circuit. Similarly, the primary and at least two duplicate circuits are needed for redundancy. Voting circuits interrogate all redundant circuits, and the one showing a different performance is removed from operation according to a “majority principle” known to those skilled in the art.
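The difference between detection with two copies and isolation with three can be made concrete with a short sketch of majority voting. The function name `majority_vote` and the representation of each circuit's performance as a single comparable output value are assumptions for illustration only, not a description of any particular voting circuit.

```python
from collections import Counter


def majority_vote(outputs):
    """Return (voted_value, indices of circuits that disagree with the majority).

    outputs[0] is taken to be the primary circuit's output and the remaining
    entries are the duplicates' outputs (an assumed convention).
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        # A fault is detected, but no strict majority exists to isolate it.
        raise ValueError("fault detected but cannot be isolated")
    faulted = [i for i, out in enumerate(outputs) if out != value]
    return value, faulted
```

With three circuits, `majority_vote([5, 5, 7])` identifies the third circuit as faulted and keeps the agreed value. With only two disagreeing circuits, such as `majority_vote([1, 0])`, the function raises `ValueError`, which corresponds to the case where a fault is detected but cannot be attributed to either circuit.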
When sufficient resources are available, complete physically redundant units are the preferred means of obtaining fault tolerance. It is not uncommon to have triple or even quadruple redundancy of certain critical system elements, such as, for example, the CPU. Disadvantageously, physical redundancy is costly, and the volume, weight, power and heating constraints are all multiplied by the need for such redundancy. Such redundancy is particularly costly in spacecraft design. Therefore, designers are selective in choosing which components require redundancy.
In those exceptional cases when there are no cost, volume, weight, power, or thermal constraints, addition of independent duplicate FPGA chips is the preferred means of achieving redundancy. Additionally, a special chip is provided for majority voting between identical FPGA chips.
When multiple FPGA chips are not a practical alternative, in view of the above listed constraints, then a single large capacity FPGA chip is often utilized. Such mega FPGA chips include the Xilinx Virtex series and the Altera Apex series, which are being used for system chip design. Disadvantageously, it is often impossible or extremely difficult to fit complete circuit duplicates within a single mega FPGA, since even the mega FPGAs do not have sufficient resources to support complete multiple duplicates. Further, in practical design, the resources of the FPGA often fall short by a small percentage in meeting duplicate area requirements. For example, for triple redundancy, where 200% overhead is needed, the available overhead may be only 180% or 190%. It is clear therefore that redundancy in FPGAs created by separate FPGA chips or on one mega FPGA chip has extreme drawbacks.
Presently, NASA is developing FPGA applications for space. To help with radiation fault problems, the configuration of the FPGA is constantly read back as the FPGA is being used. This reading does not interfere with normal operation of the FPGA and will detect problems in the SRAM portion 12 of the FPGA which is generally the most sensitive area. The method does not, however, cover “faults” in the circuit section 14 of the FPGA. Additionally, this method is strictly a fault detection scheme and not a fault redundancy technique.
It is therefore highly desirable to provide a fault redundancy technique for FPGAs which would avoid physical addition of redundant circuits thus minimizing cost, volume, weight, power, and/or thermal issues associated therewith.
It is an object of the present invention to provide a fault redundancy method for FPGAs which does not require physical addition of duplicate circuits.
It is another object of the present invention to provide a fault redundancy technique in dynamically reconfigurable FPGAs in accordance with which a plurality of duplicate circuits are configured on an FPGA in time multiplexing fashion from unoccupied structural elements of the FPGA.
It is still a further object of the present invention to provide a dynamically reconfigurable FPGA with means for identifying unoccupied structural elements of the FPGA for further forming duplicate circuits in a time-multiplexing fashion.
An additional object of the present invention is to provide a method for creating virtual circuit redundancy in FPGAs by (a) configuring, at a first time slice, a first portion of the structural elements of the FPGA into a primary circuit for a predetermined application, (b) identifying a second portion of the structural elements having the least overlap (preferably close to zero overlap) with the structural elements of the first portion, (c) configuring, at a second time slice, the structural elements of the second portion into a first duplicate circuit functionally identical to the primary circuit, (d) identifying a third portion of the plurality of structural elements having the least overlap with the structural elements used in the primary circuit and in the first duplicate circuit, (e) configuring, at a third time slice, the structural elements of the third portion into a second duplicate circuit, (f) repeating the steps (b)-(e) until a required number of duplicate circuits has been created in time multiplexing fashion, and (g) comparing performance of the primary circuit and all created duplicate circuits to judge which of those circuits contains a fault, in order to exclude that circuit from the operation of the FPGA.
Another object of the present invention is to provide a fault tolerance technique for existing dynamically reconfigurable FPGAs by creating duplicate circuits from the elements of the FPGAs unoccupied by any other primary or duplicate circuit for further interrogation thereof by voting circuitry, and by providing in these FPGAs means for identifying, at each time slice, the unused FPGA resources.
The present invention may find its utility in numerous fields of application where fault tolerance is required, and particularly, in aerospace applications where semiconductor FPGAs are subject to radiation and cosmic particles, as well as in commercial applications where semiconductor FPGAs may suffer from metal migration, manufacturing shortcomings, and from poor handling.
The present invention is also applicable to different types of FPGAs, including particularly, so-called “embedded” chips which have a non-configurable “hard-core” section and a dynamically reconfigurable “soft-core” section co-located on the same chip.
In accordance with the teachings of the present invention, a method for creating virtual redundancy in FPGAs is provided. In this method, at a first time slice, a primary group of a plurality of structural elements of the FPGA is configured into a primary circuit for a predetermined application of the FPGA, and sequential time slices following the first time slice are defined.
In each of the following time slices, a respective duplicate group of the plurality of structural elements of the FPGA is identified which has the least spatial overlap (preferably, zero overlap) with the primary group and with any of the other duplicate groups. Then, each duplicate group is configured into a duplicate circuit functionally identical to the primary circuit, so that performance of the primary circuit and of the duplicate circuits may be compared in time multiplexing fashion to determine which of the circuits is faulted and to exclude the faulted circuit from the operation of the FPGA.
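The identification of least-overlap duplicate groups can be sketched as a simple greedy selection. This is only one possible reading of the method, under several assumptions that are not in the original text: structural elements are represented by integer identifiers, a list of candidate placements is supplied in advance, and the function name `plan_duplicates` is hypothetical.

```python
def plan_duplicates(primary, candidates, n_duplicates):
    """Pick one group of structural elements per time slice (greedy sketch).

    primary      -- element ids used by the primary circuit (first time slice)
    candidates   -- assumed list of possible placements for a duplicate circuit
    n_duplicates -- number of duplicate circuits required
    Returns the list of groups, with the primary group first.
    """
    used = set(primary)                  # elements occupied in earlier time slices
    plan = [frozenset(primary)]
    remaining = [frozenset(c) for c in candidates]
    for _ in range(n_duplicates):
        if not remaining:
            raise ValueError("not enough candidate placements")
        # Least spatial overlap with everything already used; zero is preferred.
        best = min(remaining, key=lambda group: len(group & used))
        remaining.remove(best)
        plan.append(best)
        used |= best
    return plan


# Example with assumed element ids 0..8.
plan = plan_duplicates({0, 1, 2}, [{2, 3, 4}, {3, 4, 5}, {6, 7, 8}], 2)
```

In this example the two selected duplicate groups, `{3, 4, 5}` and `{6, 7, 8}`, share no element with the primary group or with each other, so a single faulted element can affect at most one of the three circuits.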
The structural elements of the FPGA may include configurable logic units, interconnections formed on the FPGA chip, structural elements within each logical unit, and I/O units of the FPGA.
Preferably, when identifying the unoccupied structural elements, the search is performed for (a) unoccupied configurable logic units, and further, upon exhaustion thereof, for (b) unoccupied structural elements within occupied configurable logic units.
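The two-stage search order can be sketched as follows. The data layout (a dictionary mapping each configurable logic unit to a list of occupancy flags for its internal elements) and the name `find_free_elements` are assumptions chosen for illustration.

```python
def find_free_elements(units, needed):
    """Collect `needed` free elements, preferring wholly unoccupied logic units.

    units -- dict: unit id -> list of booleans, True meaning the element is occupied
    Returns a list of (unit id, element index) pairs.
    """
    if needed <= 0:
        return []
    picked = []
    # Stage (a): configurable logic units with no occupied element at all.
    for unit_id, elements in units.items():
        if not any(elements):
            for index in range(len(elements)):
                picked.append((unit_id, index))
                if len(picked) == needed:
                    return picked
    # Stage (b): free elements inside logic units that are already partly occupied.
    for unit_id, elements in units.items():
        if any(elements):
            for index, occupied in enumerate(elements):
                if not occupied:
                    picked.append((unit_id, index))
                    if len(picked) == needed:
                        return picked
    raise ValueError("not enough unoccupied structural elements")
```

For instance, with `{"A": [True, False], "B": [False, False], "C": [True, True]}` and three elements requested, both elements of the empty unit `"B"` are taken first, and only then the single free element of the partly occupied unit `"A"`.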
Duplicate circuits may be arranged in different mutual spatial relationships with respect to a primary circuit and to duplicate circuits created in other time slices. For example, a duplicate circuit may be linearly displaced from other circuits created in other time slices, or the duplicate circuit may be rotated with respect to other circuits created in other time slices. A requirement is that the structural elements of the redundant circuits have the least spatial overlap, preferably zero overlap, in order to eliminate possible confusion in detecting which of the circuits contains a fault.
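Linear and rotational displacement, and the resulting overlap, can be illustrated on a toy model. The assumptions here are that the array is a square grid of `size` by `size` cells, that a placement is a set of (row, column) cells, and that the helper names `translate`, `rotate90` and `overlap` are hypothetical.

```python
def translate(cells, d_row, d_col, size):
    """Linearly displace a placement; reject results that leave the array."""
    moved = {(r + d_row, c + d_col) for r, c in cells}
    if any(not (0 <= r < size and 0 <= c < size) for r, c in moved):
        raise ValueError("placement falls off the array")
    return moved


def rotate90(cells, size):
    """Rotate a placement by 90 degrees about the centre of the array."""
    return {(c, size - 1 - r) for r, c in cells}


def overlap(a, b):
    """Number of cells shared by two placements."""
    return len(set(a) & set(b))


# A 2x2 primary circuit in the top-left corner of an assumed 4x4 array.
primary = {(0, 0), (0, 1), (1, 0), (1, 1)}
shifted = translate(primary, 2, 2, 4)   # linear displacement
rotated = rotate90(primary, 4)          # rotational displacement
```

In this example both `shifted` and `rotated` have zero overlap with `primary` and with each other, whereas a one-column shift, `translate(primary, 0, 1, 4)`, would still share two cells with the primary placement.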
The spatial displacing of the duplicate circuits with respect to the primary circuit enhances the ability to detect and isolate a fault due to a fault multiplication effect associated with use of a given faulted structural element in different circuits, which are likely to generate different responses to the same given faulted structure.
The method of the present invention is particularly applicable to dynamically configurable FPGAs (stand-alone chips or “embedded” FPGAs) in which multiple memory slices are used, each of which includes the whole configuration of the FPGA for a certain time period. In this manner, when one of the memory slices is actuated for creating a primary circuit for a predetermined application, all above discussed operations of creation of virtual redundancy are carried out in order to detect and exclude a faulted structural element from the operation.
After the entire cycle for the predetermined application is completed, the logic moves to another memory slice to create a primary circuit for another application and for creating the associated virtual redundancy. Thus, in each memory slice, a primary circuit for a specific application is created, then duplicate circuits for the primary circuit for the specific application are created in time multiplexing fashion. The voting unit interrogates the primary circuit and duplicate circuits with the purpose of excluding a fault containing circuit from operation.
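One cycle for a single application can be sketched in software as follows. The sketch assumes that each configured circuit can be modeled as a Python callable, that the same user input is applied in every time slice, and that outputs are held until all slices have run; the name `run_cycle` is hypothetical.

```python
from collections import Counter


def run_cycle(circuits, user_input):
    """Run the primary and duplicate circuits in successive time slices, then vote.

    circuits -- assumed list of callables: the primary first, then the duplicates
    Returns (voted_output, circuits that agreed with the majority).
    """
    # One time slice per circuit: only one circuit is configured at a time,
    # so each output is stored for comparison after the last slice.
    outputs = [circuit(user_input) for circuit in circuits]
    voted, count = Counter(outputs).most_common(1)[0]
    if count <= len(circuits) // 2:
        raise RuntimeError("no majority among the redundant circuits")
    # Circuits that disagree with the majority are excluded from operation.
    healthy = [c for c, out in zip(circuits, outputs) if out == voted]
    return voted, healthy
```

After the cycle completes for one application, the same procedure would be repeated with the circuits configured for the next memory slice.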
Another aspect of the present invention is an FPGA having:
a plurality of configurable structural elements (which are configurable logic units, elements within configurable logic units, routing structure, and I/O units),
means for defining a plurality of time slices for each of distinct applications of the FPGA,
configuration means containing configuration data coupled to the configurable structural elements to, first, configure a primary group thereof into a primary circuit in a 1st time slice, and, second, in a (1+i)th time slice to configure a duplicate group of the configurable structural elements into an ith duplicate circuit functionally identical to the primary circuit, wherein i=1,2, . . .
The FPGA further includes means for selecting in each (1+i)th time slice the ith duplicate group of the configurable structural elements having the least overlap with structural elements occupied during the 1st through ith time slices, and
a voting circuit for interrogating the primary circuit and the duplicate circuits in time multiplexing fashion for comparing performance thereof, for further judging which circuit includes a fault, and for excluding the fault containing circuit from the operation of the FPGA.
Prior to selecting the next group of the configurable structural elements, the identification of non-overlapping structural elements is performed, preferably off-line on a computer workstation. Once suitable structural elements are found, the corresponding configuration is applied to the FPGA.
These and other novel features and advantages of this invention will be fully understood from the following detailed description taken in conjunction with the accompanying Drawings.