1. Field of the Invention
This invention relates generally to programmable logic devices, and in particular, to a method of configuring a field programmable gate array for virtual hardware computation.
2. Description of the Related Art
A programmable logic device, such as a field programmable gate array (FPGA), is a well-known type of integrated circuit and is of wide applicability due to the flexibility provided by its reprogrammable nature. An FPGA typically includes an array of configurable logic blocks (CLBs), wherein each CLB is individually configured to perform any one of a number of different logic functions specified by a user (a circuit designer). A programmable interconnect routes signals between the CLBs and the input/output blocks (IOBs) (which interface between the CLBs and the device package pins) according to the desired user circuit design. The FPGA also includes configuration memory cells that are coupled to the CLBs to specify the function to be performed by each CLB, as well as to the programmable interconnect to specify the coupling of the CLBs and IOBs. The FPGA may also include data storage memory cells accessible by a user during operation of the FPGA. However, unless specified otherwise, as used herein, the term “memory cells” refers to the configuration memory cells. The “1996 Programmable Logic Data Book”, published by Xilinx, Inc., pages 4-291 to 4-302, describes these configuration memory cells and an exemplary FPGA structure in greater detail, and is incorporated by reference herein.
One approach available in the prior art to increase the complexity and size of logic circuits has been coupling multiple FPGAs by external connections. However, due to the limited number of input/output connections, i.e. pins, provided on the FPGAs, not all circuits can be implemented using this approach. Moreover, using more than one FPGA undesirably increases cost and board space to implement the user circuit design. Another known approach has been increasing the number of CLBs and interconnect resources in the FPGA. However, for any given semiconductor fabrication technology, there are practical limitations to the number of CLBs and interconnect that can be fabricated on an integrated circuit. Thus, there continues to be a need to increase CLB densities for FPGAs.
Reconfiguring an FPGA to perform different logic functions at different times is known in the art. However, this reconfiguration typically requires the step of reloading a configuration bit stream for each reconfiguration. Moreover, reconfiguration of a prior art FPGA generally requires suspending the implementation of the logic functions, saving the current state of the logic functions in a device external to the FPGA, reloading the entire array of memory cells, and inputting the states of the logic functions which have been saved off-chip along with any other needed inputs. Each of these steps requires a significant amount of time, thereby rendering reconfiguration impractical for implementing typical circuits.
U.S. Pat. No. 5,646,545, incorporated herein by reference, discloses an FPGA including CLBs having both combinational and sequential logic elements, an interconnect structure for interconnecting the CLBs, and a plurality of programmable logic elements for dynamically reconfiguring the CLBs and the interconnect structure. At least one programmable logic element includes a plurality of memory cells for configuring the combinational element and at least one programmable logic element includes a plurality of memory cells for configuring the sequential logic element. A plurality of intermediate states of the CLBs and the interconnect structure are stored. In this manner, a CLB can access values calculated by CLBs (other CLBs or itself) in other configurations.
U.S. Pat. No. 5,646,545 teaches three types of FPGA data (implying three types of memory or storage): configuration data, user data, and state data. Configuration data determines the configuration of the logic blocks or interconnect when the data is provided to those logic blocks or interconnect. User data is data typically generated by the user logic and stored in or retrieved from memory that could otherwise be used for configuration data storage. State data is data defining the logical values of nodes in user logic at any specific time. Typically, state data is stored if the values at the nodes are needed at a later time. The term “state” is used to refer to either all of the node values at a particular time, or a subset of those values. For simplicity, user data and state data are referred to herein as “user data.”
The FPGA switches between configurations (also called memory planes) by transferring bits of configuration data and user data from the inactive storage to the active storage, thereby allowing the FPGA to function in one of N configurations, wherein N is equal to the maximum number of memory cells assigned to each programmable point. In this manner, an FPGA with a number M of actual CLBs functions as if it includes M times N effective CLBs. Thus, assuming eight configurations, the FPGA implements eight times the amount of logic that it actually contains by including the additional configuration memory. By using this type of reconfiguration, the CLBs are reused dynamically, thereby reducing the number of physical CLBs needed to implement a given number of logic functions in a particular user's circuit design by a factor equal to the number of configurations.
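The M times N relationship described above can be illustrated with a short arithmetic sketch. The function names below are hypothetical and chosen only for illustration; the sketch assumes that logic divides evenly across configurations, which a real design may not achieve.

```python
def effective_clbs(physical_clbs: int, configurations: int) -> int:
    """Effective CLB count for M physical CLBs and N configurations (M * N)."""
    if physical_clbs < 0 or configurations < 1:
        raise ValueError("need physical_clbs >= 0 and configurations >= 1")
    return physical_clbs * configurations


def physical_clbs_needed(required_clbs: int, configurations: int) -> int:
    """Physical CLBs needed to host `required_clbs` worth of logic.

    Assumes (for illustration only) that the logic can be spread
    perfectly across all configurations; rounds up to a whole CLB.
    """
    if required_clbs < 0 or configurations < 1:
        raise ValueError("need required_clbs >= 0 and configurations >= 1")
    return (required_clbs + configurations - 1) // configurations


# Eight configurations, as in the example in the text.
print(effective_clbs(100, 8))
print(physical_clbs_needed(800, 8))
```

Under these assumptions, 100 physical CLBs with eight configurations behave like 800 effective CLBs, and a design needing 800 CLBs of logic fits in 100 physical CLBs.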
It is therefore a principal object of the present invention to provide a method for configuring memory planes in a dynamically reconfigurable FPGA for carrying out extremely fast computations.
It is another object of the invention to provide a method to utilize on-chip memory locations to store virtual instructions for configuration data to perform a computation in the FPGA.
It is another object of the invention to provide a method for carrying out computations in an FPGA without using external memory by employing virtual instructions to cause data stored on a memory plane of each tile of the FPGA to be copied to tile-local memory elements.
It is still another object of the invention to provide a method for providing a sequence of virtual instructions stored in the memory planes of a dynamically configured FPGA and utilizing the FPGA routing programmability to translate data stored in a first pattern of FPGA storage elements into data stored in a second pattern of FPGA storage elements.
In one embodiment of the present invention, a dynamically reconfigurable FPGA includes an array of tiles wherein each tile has a local memory. The memory address for this local memory is defined only within the tile. In one embodiment, the local memory includes memory cells and micro registers. The present invention uses this local memory to pass large amounts of configuration and user data from one FPGA configuration (memory plane) to another with no external memory access, thereby transferring data to/from the memory cells at very high speed. Typically, all the local memory can be simultaneously transferred to/from other memory planes in one cycle. In accordance with the present invention, each FPGA configuration provides a virtual instruction.
The present invention uses two different types of virtual instructions: computational instructions and pattern manipulation instructions. Computational instructions perform some computation with user data stored in some well-defined local memory pattern. Pattern manipulation instructions move the local data into different memory locations to create the pattern required by the next instruction. A virtual computation may be accomplished by a sequence of instructions that work with pre-defined input and output patterns.
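The distinction between the two instruction types can be sketched in software. This is a minimal model, not the hardware mechanism: local memory is assumed to be a dictionary keyed by (x, y) tile coordinates, and the names `computational_add` and `pattern_move` are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Coord = Tuple[int, int]          # (x, y) tile coordinate; an assumed addressing scheme
LocalMemory = Dict[Coord, int]   # one data word per tile location


def computational_add(mem: LocalMemory, in_a: Sequence[Coord],
                      in_b: Sequence[Coord], out: Sequence[Coord]) -> LocalMemory:
    """Computational instruction: element-wise add of two input patterns."""
    result = dict(mem)
    for a, b, o in zip(in_a, in_b, out):
        result[o] = mem[a] + mem[b]
    return result


def pattern_move(mem: LocalMemory, src: Sequence[Coord],
                 dst: Sequence[Coord]) -> LocalMemory:
    """Pattern manipulation instruction: relocate data, compute nothing."""
    result = dict(mem)
    values = [mem[s] for s in src]      # read everything before writing
    for s in src:
        result.pop(s, None)             # source locations are vacated
    for d, v in zip(dst, values):
        result[d] = v
    return result


mem = {(0, 0): 1, (1, 0): 2, (0, 1): 10, (1, 1): 20}
summed = computational_add(mem, [(0, 0), (1, 0)], [(0, 1), (1, 1)], [(0, 2), (1, 2)])
moved = pattern_move(summed, [(0, 2), (1, 2)], [(3, 0), (3, 1)])
```

Here the sum is produced in one row pattern and then moved into a column pattern, standing in for the layout a following instruction might expect.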
In accordance with one embodiment, a set of standard memory transfer patterns is defined to allow designers to independently design sequences of FPGA configurations for the virtual hardware with pre-defined interface patterns. These patterns allow large amounts of data to be passed from one instruction to another without external memory access, thereby speeding up the virtual computation compared to prior methods. Each pattern is defined as a spatial-temporal set of locations in the FPGA logic block array. The spatial dimension specifies the two-dimensional coordinates of the memory word in the tile array. The temporal dimension defines the relative sequence number in the sequence of FPGA configurations seen by the virtual hardware.
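One possible way to represent such a spatial-temporal pattern in software is as a set of (x, y, t) locations. The `Location` type and the `make_row_pattern` helper below are assumptions made for illustration and are not defined by the source.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Location:
    x: int  # column of the memory word in the tile array (spatial)
    y: int  # row of the memory word in the tile array (spatial)
    t: int  # relative sequence number of the FPGA configuration (temporal)


Pattern = FrozenSet[Location]


def make_row_pattern(row: int, width: int, t: int = 0) -> Pattern:
    """Hypothetical standard pattern: one word per tile along a single row."""
    return frozenset(Location(x, row, t) for x in range(width))


row2_now = make_row_pattern(row=2, width=4)
row2_next = make_row_pattern(row=2, width=4, t=1)
```

Because the patterns are plain sets, two instructions can be checked for a compatible interface by comparing their patterns for equality; the same row at a different sequence number is a different pattern.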
By limiting the number of input/output patterns, a complete library of pattern transfer and commonly used computation functions is created. These library functions or instructions are available as FPGA configuration data (i.e. bitstreams), thereby allowing long sequences of virtual hardware instructions to be created without having to enumerate all possible FPGA configurations or to wait for each FPGA configuration to be generated by the FPGA place-and-route tools with data location constraints passed from one configuration to the next. In another embodiment, the user has the option of generating custom made instructions using the FPGA place-and-route tools.
Pattern manipulation instructions configure the FPGA routing resources to move the data as required. Because this routing problem is much simpler than the problem of generating a normal FPGA configuration using place-and-route tools, this task can typically be performed much faster than prior art methods, depending on the size and patterns of the data to be moved. Thus, these instructions allow extra computational power to be packed into an FPGA by relaxing the data placement constraints and moving the routing cost to the next/previous pattern transfer configuration.
In the present invention, the virtual instructions can be created or accessed by any process. Thus, providing a virtual instruction library is only one of many ways known by those skilled in the art to access virtual instructions. For example, in one embodiment, a user can access a template in a library of templates. These templates can then be customized with the required functionality, thereby building the appropriate virtual instruction. The library can also include the parameters available for customizing the template. In other embodiments, the templates can be customized by the user directly, for example by using a graphical user interface (GUI). In accordance with another embodiment of the present invention, an input/output pattern can be added to the template. In this manner, all virtual instructions can be customized to have the same input/output patterns, thereby greatly simplifying the computation task. In yet another embodiment, the input/output pattern is alternated. Finally, in a fully customized embodiment, the user can build the virtual instruction from scratch.
Where a computation system may have different input/output patterns, the output pattern of a first instruction is compared to the input pattern of a second instruction. If the output pattern of the first instruction does not match the input pattern of the second instruction, then a pattern manipulation instruction is inserted between the first and second instructions. At this point, the patterns at each boundary between instructions should match and the computation task can be completed. In one embodiment, the first input pattern and the first output pattern are the same. In the same embodiment, or in a separate embodiment, the second input pattern and the second output pattern are the same.
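The comparison and insertion step can be sketched as a pass over a list of instructions. In this sketch, patterns are reduced to string labels and `Instruction`, `make_manipulation`, and `link_sequence` are hypothetical names; how a real pattern manipulation configuration is obtained is outside the scope of the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Instruction:
    name: str
    input_pattern: str    # label standing in for a full pattern description
    output_pattern: str


def make_manipulation(src: str, dst: str) -> Instruction:
    """Placeholder for obtaining a pattern manipulation instruction."""
    return Instruction(f"move_{src}_to_{dst}", src, dst)


def link_sequence(instructions: List[Instruction]) -> List[Instruction]:
    """Insert a pattern manipulation instruction wherever the previous
    output pattern differs from the next input pattern."""
    linked: List[Instruction] = []
    for instr in instructions:
        if linked and linked[-1].output_pattern != instr.input_pattern:
            linked.append(
                make_manipulation(linked[-1].output_pattern, instr.input_pattern)
            )
        linked.append(instr)
    return linked


program = [
    Instruction("mul", "rows", "cols"),
    Instruction("add", "rows", "rows"),
    Instruction("sub", "rows", "rows"),
]
linked_names = [i.name for i in link_sequence(program)]
```

In this example only the boundary between `mul` and `add` mismatches, so a single manipulation instruction is inserted there and the `add` to `sub` boundary is left untouched.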
The pattern manipulation instructions, like the virtual instructions, can be created or accessed by different processes. In one process, the pattern manipulation instruction is chosen from a library of such pattern manipulation instructions. The library can be created by the user, the manufacturer, or a third party. In another process, the pattern manipulation instruction is created by generating a new FPGA configuration.
Of importance, the present invention is equally applicable to any standard FPGA. In this embodiment, the data stored in the storage elements of the FPGA, such as flip-flops, is retained for the next configuration of the FPGA. In this manner, successive configurations can communicate data using the patterns of the storage elements, thereby also allowing these standard FPGAs to implement virtual instructions.
Alternatively, a standard FPGA could write out data to an external memory using a predetermined pattern of addresses. In a subsequent configuration of the FPGA, the device could read data back from this pattern of addresses in the external memory. This embodiment allows various patterns of addresses, corresponding to data, to be used in any appropriate subsequent configuration of the FPGA. In this manner, the plurality of memory planes, previously provided on the dynamically reconfigurable FPGA, can be implemented off-chip.
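The external-memory variant can be modeled as two configurations that agree on an address pattern. The sketch below assumes a simple strided pattern and models the external memory as a dictionary; the function names, the base address, and the stride are illustrative assumptions only.

```python
from typing import Dict, List, Sequence


def strided_addresses(base: int, stride: int, count: int) -> List[int]:
    """Hypothetical predetermined address pattern: `count` words, `stride` apart."""
    return [base + i * stride for i in range(count)]


def write_pattern(memory: Dict[int, int], addresses: Sequence[int],
                  values: Sequence[int]) -> None:
    """First configuration: write results out at the agreed addresses."""
    if len(addresses) != len(values):
        raise ValueError("pattern and data must have the same length")
    for addr, value in zip(addresses, values):
        memory[addr] = value


def read_pattern(memory: Dict[int, int], addresses: Sequence[int]) -> List[int]:
    """Subsequent configuration: read the data back from the same addresses."""
    return [memory[addr] for addr in addresses]


external_memory: Dict[int, int] = {}
pattern = strided_addresses(base=0x100, stride=4, count=4)
write_pattern(external_memory, pattern, [7, 8, 9, 10])
recovered = read_pattern(external_memory, pattern)
```

The only contract between the two configurations is the address pattern itself, which plays the role that the on-chip memory plane pattern plays in the dynamically reconfigurable case.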
The method of communicating to subsequent configurations by storing data in patterns can also refer to partial configurations of FPGAs that overlap in physical space, but are sequential in time. For example, if a first partial configuration occupies a first area on the FPGA and a second partial configuration occupies a second area on the FPGA which overlaps the first area, then the first configuration can store data in a pattern of storage elements restricted to the second area. In this manner, when the second configuration is activated, the second configuration can retrieve those data values from the second area.
Of importance, pattern manipulation instructions can also be extended to allow communication between partial configurations that do not overlap in physical space or time. If a first configuration occupies a first area on the FPGA and the second configuration occupies a second non-overlapping area on the FPGA, the present invention creates a pattern manipulation instruction to occupy an area comprising both first and second areas. This instruction allows data to be conveyed from the first configuration to the second configuration both spatially and temporally.
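The overlapping and non-overlapping cases for partial configurations can be sketched geometrically. The sketch assumes rectangular areas on the tile array, assumes that a configuration can only write storage elements inside its own area (so the hand-off region is the intersection of the two areas), and uses a bounding rectangle as one possible "area comprising both" areas. All names are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Area(NamedTuple):
    x0: int  # left edge (inclusive)
    y0: int  # top edge (inclusive)
    x1: int  # right edge (exclusive)
    y1: int  # bottom edge (exclusive)


def overlap(a: Area, b: Area) -> Optional[Area]:
    """Shared region of two partial configurations, or None if disjoint."""
    x0, y0 = max(a.x0, b.x0), max(a.y0, b.y0)
    x1, y1 = min(a.x1, b.x1), min(a.y1, b.y1)
    if x0 >= x1 or y0 >= y1:
        return None
    return Area(x0, y0, x1, y1)


def handoff_locations(first: Area, second: Area) -> List[Tuple[int, int]]:
    """Tile locations where the first configuration can leave data
    that the second configuration can later retrieve."""
    shared = overlap(first, second)
    if shared is None:
        return []
    return [(x, y) for y in range(shared.y0, shared.y1)
            for x in range(shared.x0, shared.x1)]


def bounding_area(a: Area, b: Area) -> Area:
    """Area a pattern manipulation instruction could occupy to bridge
    two non-overlapping configurations (bounding rectangle assumption)."""
    return Area(min(a.x0, b.x0), min(a.y0, b.y0),
                max(a.x1, b.x1), max(a.y1, b.y1))


shared_tiles = handoff_locations(Area(0, 0, 4, 4), Area(2, 2, 6, 6))
bridge = bounding_area(Area(0, 0, 2, 2), Area(4, 0, 6, 2))
```

An empty hand-off list signals the non-overlapping case, where a bridging pattern manipulation instruction covering both areas would be needed instead.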