Field-Programmable Gate Arrays (FPGAs) are integrated circuits designed to be configurable after manufacturing, whether by an end user or by other parties to the design process. They offer significant advantages over general-purpose processing units because their circuit design can be optimised for particular purposes, such as financial modelling or signal processing. Optimised circuit design can provide orders-of-magnitude improvements in performance and power efficiency. At the same time, FPGAs offer a more flexible solution than fixed-function Application Specific Integrated Circuits (ASICs) due to their ability to be configured after manufacture.
The configuration of an FPGA is typically specified in hardware description language (HDL) code, and the resulting configuration data is stored in a configuration memory (CM). Unlike an ASIC, a single FPGA device can offer multiple functionalities depending on the configuration stored in its CM.
In practice, this ability to provide chips with particular functionality without constructing hardware fabrication facilities is often the principal driver behind the adoption of FPGA technology. Once in use, the FPGA often functions as a static circuit, operating according to the single configuration stored in an associated CM.
While this approach does bring cost advantages at the manufacturing stage, to the end user the effect is a limited integrated circuit with little or no additional flexibility beyond that provided by an ASIC. If the FPGA is intended for use in a range of tasks, then its configuration must implement all possible operations statically. This leads to redundancy: because not all operations are in use at any given time, some resources sit idle. As the range of available operations increases, the advantages over general-purpose processing units rapidly disappear. In essence, the hardware configuration is no longer optimised to a particular task but instead carries the overheads present in general-purpose processing units.
Thus a device with a single configuration is an inefficient way to provide a dynamic range of processing capabilities. Where a dynamic range of tasks is required, run-time reconfiguration techniques are used to separate a single configuration into multiple efficient configurations. For example, in “Automating Elimination of Idle Functions by Run-Time Reconfiguration”, FCCM 2013, pp. 97-104, application functions used at different times are separated into different configurations, to free resources occupied by idle functions. The major challenge of run-time reconfiguration techniques is the reconfiguration time, which is the time taken to download a configuration into a reconfigurable device before it can be used. To reduce the reconfiguration time, partial reconfiguration techniques can be applied to update only the regions that differ between successive configurations. In “An area-efficient partially reconfigurable crossbar switch with low reconfiguration delay,” FPL, 2012, pp. 400-406 and “Staticroute: A novel router for the dynamic partial reconfiguration of FPGAs,” FPL, 2013, pp. 1-7, similarities between configurations are exploited to further reduce the partial reconfiguration time. However, run-time reconfiguration techniques, whether full or partial, are still limited by the reconfiguration time, since off-chip configurations need to be loaded onto the reconfigurable device during runtime. As an example, in “A high I/O reconfigurable crossbar switch,” FCCM, 2003, pp. 3-10, it takes 220 µs to reconfigure a crossbar running at 150 MHz. For applications in which dynamic tasks change frequently, the reconfiguration time can outweigh the benefits gained from separating the dynamic tasks. Also, as reconfigurable chips such as FPGAs get larger, the time required to fully reconfigure a chip becomes longer.
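By way of illustration only, the cost of the reconfiguration time can be expressed in clock cycles. The following minimal sketch uses the 220 µs and 150 MHz figures from the crossbar example above. The function names and the simple overhead model (time lost to reconfiguration divided by total time) are assumptions introduced here for illustration and are not taken from the cited works.

```python
def reconfig_cycles(reconfig_time_s: float, clock_hz: float) -> int:
    """Clock cycles that elapse while a configuration is being loaded."""
    return round(reconfig_time_s * clock_hz)


def reconfig_overhead(reconfig_time_s: float, clock_hz: float,
                      cycles_between_switches: int) -> float:
    """Fraction of total time spent reconfiguring rather than computing.

    Assumed model: each task runs for `cycles_between_switches` useful
    cycles and is preceded by one reconfiguration.
    """
    lost = reconfig_cycles(reconfig_time_s, clock_hz)
    return lost / (lost + cycles_between_switches)


# Figures quoted above: 220 microseconds at 150 MHz.
cycles = reconfig_cycles(220e-6, 150e6)
print(cycles)  # 33000

# If tasks change every 33,000 useful cycles, half of all time is lost.
print(reconfig_overhead(220e-6, 150e6, 33_000))  # 0.5
```

Under this assumed model, a single reconfiguration of the cited crossbar occupies 33,000 clock cycles, so a task that runs for fewer cycles than that between switches spends more time waiting for reconfiguration than computing.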
Since the configuration of an FPGA is dependent upon the CM, an alternative approach has been proposed in which the FPGA is coupled to a number of different CMs. In this approach, the appropriate CM (and thus configuration) is selected according to the required task (e.g. on the basis of one or more run time variables). Accordingly, under different run-time scenarios, only the required operators are implemented.
U.S. Pat. No. 5,426,378 to Randy T. Ong entitled “Programmable Logic Device Which Stores More Than One Configuration and Means for Switching Configurations” describes a programmable logic device with a CM expanded to store two or more complete sets of configuration data. During runtime, outputs from the expanded CM sets are selected. The reconfiguration operation can be finished within a user's clock cycle.
U.S. Pat. No. 6,829,756 B1 to Stephen M. Trimberger entitled “Programmable Logic Device with Time-Multiplexed Interconnect” describes a programmable logic device whose interconnect is configured with expanded CM sets. The expanded CM sets enable an interconnect to be reconfigured within a cycle. Multiple resources can therefore use the same interconnect at different times, which reduces redundant interconnections.
U.S. Pat. No. 8,664,974 B2 to Rohe et al. entitled “Operational Time Extension” describes an integrated circuit containing reconfigurable circuits configured with multiple CM sets, each named a time-extending reconfigurable circuit. During runtime, multiple time-extending reconfigurable circuits construct a signal path. Each time-extending reconfigurable circuit maintains one of its CM sets over at least two contiguous cycles, to allow signals to propagate through the signal path within a desired amount of time.
However, in order to store a large number of configurations on chip, a large number of CMs is required. This necessitates significant on-chip memory storage, which implies a large area and power overhead and reduces the efficiency of the overall hardware design. Moreover, the additional memory area is fixed once the FPGA is fabricated. Storing configurations off-chip has been suggested, but in this scenario reconfiguration takes many cycles to complete, resulting in unacceptable delay.
There is therefore an ongoing need to address the inefficiencies and limitations of these designs and to provide an improved reconfigurable architecture.