1. Field of the Invention
The invention relates to programmable integrated circuits (ICs). More particularly, the invention relates to parametric logic modules for implementing designs in programmable ICs, and software tools and methods for creating such modules.
2. Description of the Background Art
Programmable ICs are a well-known type of digital integrated circuit that may be programmed by a user to perform specified logic functions. One type of programmable IC, the field programmable gate array (FPGA), typically includes an array of configurable logic blocks (CLBs) surrounded by a ring of programmable input/output blocks (IOBs). The CLBs and IOBs are interconnected by a programmable interconnect structure. The CLBs, IOBs, and interconnect structure are typically programmed by loading a stream of configuration data (bitstream) into internal configuration memory cells that define how the CLBs, IOBs, and interconnect structure are configured. The configuration data may be read from memory (e.g., an external PROM) or written into the FPGA by an external device. The collective states of the individual memory cells then determine the function of the FPGA.
One such FPGA, the Xilinx XC4000™ Series FPGA, is described in detail in pages 4-5 through 4-78 of the Xilinx 1996 Data Book entitled "The Programmable Logic Data Book" (hereinafter referred to as "the Xilinx Data Book"), published September, 1996, available from Xilinx, Inc., 2100 Logic Drive, San Jose, Calif. 95124, which pages are incorporated herein by reference. (Xilinx, Inc., owner of the copyright, has no objection to copying these and other pages referenced herein but otherwise reserves all copyright rights whatsoever.)
As FPGA designs increase in complexity, they reach a point at which the designer cannot deal with the entire design at the gate level. Where once a typical FPGA design comprised perhaps 5,000 gates, FPGA designs with 20,000 gates are now common, and FPGAs supporting over 100,000 gates will soon be available. To deal with this complexity, circuits are typically partitioned into smaller circuits that are more easily handled. Often, these smaller circuits are divided into yet smaller circuits, imposing on the design a multi-level hierarchy of logical blocks. The imposition of hierarchy makes it possible to implement designs of a size and complexity that would otherwise be unmanageable.
True hierarchical design is difficult to achieve for FPGAs using currently-available software. Designs can be entered hierarchically (e.g., via schematic entry or Hardware Description Languages (HDLs)), but mapping, placement and routing software typically "flattens" the design. It is desirable to retain the hierarchy as long as possible through the mapping, placement, and routing stages of implementing the design. The advantages of maintaining a hierarchy include faster software (because fewer objects at a time require manipulation), ease of changing the design (because only a discrete portion of the total logic has to be changed), and ease of doing incremental design (retaining a portion of a design while remapping, replacing, and rerouting only the part of the design that has been changed).
One result of the "hierarchical advantage" is the development of "module libraries", libraries of predeveloped blocks of logic that can be included in a hierarchy of logical blocks. A higher-level block incorporating (instantiating) a second module is called a "parent" of the instantiated module. The instantiated module is called a sub-module or "child" of the parent. Such library modules include, for example, adders, multipliers, filters, and other arithmetic and DSP functions from which complex designs can be readily constructed. The use of predeveloped logic blocks permits faster design cycles by eliminating the redesign of duplicated circuits. Further, such blocks are typically well tested, thereby making it easier to develop a reliable complex design.
To offer the best possible performance, some library modules have a fixed size and shape with relative location restrictions on each element. One such module type (now obsolete) was the "hard macro" from Xilinx, Inc. Hard macros are described in the "XC4000 Family Hard Macro Style Guide", published Sep. 3, 1991 and available from Xilinx, Inc., which is incorporated herein by reference in its entirety. A hard macro did not require schematics; instead, it included a schematic symbol that was used to include the macro in a customer design, and a netlist referenced by the schematic symbol and representing the macro circuitry. The netlist was encrypted and sent to customers in binary format, which made it difficult to edit or reverse-engineer the netlist. (A "netlist" is a description of a circuit comprising a list of lower-level circuit elements or gates and the connections (nets) between the outputs and inputs thereof.)
One disadvantage of the hard macro format was that the area of the FPGA encompassed by the hard macro was totally dedicated to the contents of the macro. A customer could not place additional logic in a CLB within the area, or access any signal inside the area unless the signal had an output port defined as part of the hard macro. Further, the area of the FPGA encompassed by the hard macro had to be rectangular. If the logic fit best, or resulted in the fastest performance, in a non-rectangular area, the extra CLBs required to make the area rectangular were wasted. Hard macros did not include routing information.
Another type of module having a fixed size and shape is the Relationally Placed Macro (RPM) from Xilinx, Inc. RPMs are described in pages 4-96 and 4-97 of the "Libraries Guide" (hereinafter referred to as the "Xilinx Libraries Guide"), published October 1995 and available from Xilinx, Inc., which pages are incorporated herein by reference. An RPM is a schematic that includes constraints defining the order and structure of the underlying circuits. The location of each element within the RPM is defined relative to other elements in the RPM, regardless of the eventual placement of the RPM in the overall design. For example, an RPM might contain 8 flip-flops constrained to be placed into four CLBs in a vertical column. The column of four CLBs can then be placed anywhere in the FPGA.
Relative CLB locations within an RPM are specified using a Relative Location Constraint called "RLOC". RLOC constraints are described in detail in pages 4-71 through 4-95 of the Xilinx Libraries Guide, which pages are incorporated herein by reference. Elements having an RLOC value of R0C0 are located in a given CLB corresponding to the (0,0) coordinate location. The next CLB "below" the (0,0) CLB is designated as R1C0, corresponding to the (0,1) coordinate location. Although the RPM has a rigid size and shape, other logic can be placed within the borders of the RPM. RPMs, like hard macros, do not include routing information.
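As an illustration of relative placement, the following Python sketch resolves RLOC-style values against an origin CLB. It is a simplified example only: the function names `parse_rloc` and `place_rpm` and the dictionary layout are assumptions made for this illustration, not part of any Xilinx tool, and only values of the plain form RnCm with non-negative n and m are handled.

```python
import re

def parse_rloc(rloc):
    """Split an RLOC value such as 'R1C0' into (row, column) offsets."""
    match = re.fullmatch(r"R(\d+)C(\d+)", rloc)
    if match is None:
        raise ValueError(f"not an RLOC value: {rloc!r}")
    return int(match.group(1)), int(match.group(2))

def place_rpm(rlocs, origin_row, origin_col):
    """Map each element name to an absolute (row, column) CLB location.

    The relative offsets stay fixed; only the origin changes when the
    macro is moved to a different part of the array.
    """
    placed = {}
    for name, rloc in rlocs.items():
        row, col = parse_rloc(rloc)
        placed[name] = (origin_row + row, origin_col + col)
    return placed

# Eight flip-flops packed two per CLB into a vertical column of four CLBs
# (hypothetical element names).
column = {f"ff{i}": f"R{i // 2}C0" for i in range(8)}
```

Calling `place_rpm(column, 5, 3)` and `place_rpm(column, 0, 0)` yields the same column shape at two different origins, which is the sense in which the macro is relocatable.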
A hard macro or RPM implementation of a logic module represents a single fixed circuit targeting a specific FPGA architecture. To accommodate even a slight difference in logic or a different FPGA architecture, a new RPM must be created. Therefore, libraries of RPMs are typically large and limited in scope.
Some flexibility has been provided by creating a different type of library module called a parametric module (i.e., a module having one or more associated variable values). One such type of parametric module is described in pages 1-1 to 2-14 of the "X-BLOX User Guide", published April, 1994 and available from Xilinx, Inc., (hereinafter the "X-BLOX User Guide"), which pages are incorporated herein by reference. The X-BLOX software tool includes a collection of library modules from Xilinx, Inc. X-BLOX modules comprise schematic symbols that can be added to a schematic representation of an FPGA design, and then parameterized to specify such variables as bit width, initial values, and so forth. The schematic including one or more X-BLOX modules is then translated into a netlist, and the netlist includes instantiations of the X-BLOX modules. Translation software then implements the X-BLOX module as lower-level circuit elements targeted for the Xilinx XC4000 Series FPGA, and includes these elements in the netlist. The netlist is then implemented in the FPGA by standard mapping, placement, and routing software which generates a configuration bitstream.
X-BLOX modules, although customizable, have a predefined shape and size based on the user-defined parameter values. For example, a counter module (called "COUNTER") is typically implemented as a column of CLBs, wherein the number of CLBs involved is dependent on the value of the parameter "COUNT_TO" (which defines the maximum value of the counter). The COUNTER module is described on pages 4-36 to 4-46 of the X-BLOX User Guide, which pages are incorporated herein by reference. While the size of the counter varies according to the value of the parameter COUNT_TO, for a given parameter value the X-BLOX module has a fixed size and shape.
A newer generation of the X-BLOX product is the LogiBLOX™ tool, also from Xilinx, Inc. A LogiBLOX module has similar capabilities to an X-BLOX module with some accompanying advantages, such as the availability of different logical implementations for a single module. For example, by setting a parameter in a graphical user interface, a user can specify that a multiplexer be implemented using tristate buffers driving horizontal interconnect lines, rather than using the default CLB implementation. The LogiBLOX product is described on pages 4-3 and 4-4 of the "CORE Solutions Data Book", copyright 1997, and available from Xilinx, Inc., (hereinafter the "CORE Solutions Data Book"), which pages are incorporated herein by reference. Another product comprising parametric modules is the LogiCORE™ library and software tool from Xilinx, Inc., which is described on pages 2-3 to 2-4 of the CORE Solutions Data Book, which pages are incorporated herein by reference. One LogiCORE product is the LogiCORE PCI interface module, described on pages 2-5 to 2-20 of the CORE Solutions Data Book, which pages are incorporated herein by reference. The LogiCORE PCI interface product includes a graphical user interface which allows the user to customize a "header" portion of the PCI design. The data entered by the user comprises memory initialization data that is stored in a predefined register portion of the PCI interface design. The data does not otherwise alter the logic of the PCI interface circuit implemented by the module, other than to adapt the module to the target application.
Another LogiCORE product is a set of DSP CORE modules, described on pages 2-21 through 2-91 of the CORE Solutions Data Book, which pages are incorporated herein by reference. The DSP CORE modules are parameterizable VHDL modules. (VHDL is one well-known type of HDL.)
Another parameterizable logic block is described by Karchmer et al. in UK Patent Publication GB 2306728 A, published May 7, 1997, entitled "Programmable Logic Array Design Using Parameterized Logic Modules" (hereinafter "Karchmer"), which is incorporated herein by reference. Karchmer describes a programmable logic module in which parameter values are passed down through a module hierarchy from a parent module to child modules placed within the parent.
It is clear that X-BLOX, LogiBLOX, and LogiCORE modules are optimized as independent blocks of logic, not in the context of the entire design. It would be desirable for library modules to be implemented in a manner leading to the optimal complete design, rather than to a conglomeration of separately optimized circuits.
Morphologic, Inc., of Bedford, N.H., reports that they have devised a type of library module which is implemented based on both user-supplied parameters (including bit widths, setup time, clock to Q time, known sizes, shapes, and latencies) and dynamically generated parameters. These dynamically generated parameters include the target FPGA architecture and the type of tile (e.g., CLB or IOB) present in the portion of the FPGA to be occupied by the logic in the module. For example, when a Morphologic register module is targeted, using the Morphologic Floorplanner tool, to a Xilinx XC4000 Series FPGA, the module may be implemented in either a CLB or an IOB. When the Morphologic module is moved within the interactive floorplanner from a CLB area to an IOB area, the register is implemented using IOB input flip-flops rather than CLB memory elements. When moved from one FPGA to another, and the FPGAs have different architectures (for example, the module is moved from a Lucent Technologies ORCA FPGA to a Xilinx XC4000 Series FPGA) the Morphologic module reportedly changes from an implementation directed to the first FPGA architecture to another implementation directed to the second FPGA architecture. When moved from place to place on a single FPGA within the Morphologic floorplanner, the Morphologic module can reportedly change its shape by rearranging sub-modules within the module, if in the new location it would overlap with a previously placed module.
Although a Morphologic module can reportedly adapt to the location in which it is placed during an interactive (i.e., manual) floorplanning step, it would also be desirable to have the ability to optimize a module not only in the context of the shapes of neighboring modules, but also based on the logical interconnections that must be formed between modules. It would be further desirable to have the ability to adapt the shape of a module at the time the design is mapped, placed, and routed, rather than adapting the shape at an earlier time (e.g., in the Morphologic interactive floorplanner or using other prior art methods) and then "freezing" the module so that it is unable to respond to later changes. It would further be desirable to be able to adapt not just the physical structure (the shape), or the target FPGA architecture, but also the logical structure (the way the circuit is logically implemented) in response to the requirements of other modules.
When hierarchy is imposed using hard macros or RPMs, locally optimized solutions are created, because placement of elements within modules is without reference to spatial or timing requirements of neighboring modules. Therefore, it would be desirable to have the ability to alter a module at the time the design is mapped, placed, and routed, based on the various requirements of the complete design.
Self Implementing Modules
The invention provides parametric modules called Self Implementing Modules (SIMs) for use in programmable logic devices such as FPGAs. The invention further provides tools and methods for generating and using SIMs. SIMs implement themselves at the time the design is elaborated, i.e., when the SIM code (now in the form of compiled code) is executed, usually when the design is mapped, placed, and routed in a specific FPGA. (Modules implemented in a programming language are first compiled into executable code, then, in a second step, executed or elaborated to produce a netlist and, optionally, placement and/or routing information.)
SIMs target a specified FPGA according to specified parameters that may, for example, include the required timing, data width, number of taps for a FIR filter, and so forth. SIMs are called "self implementing" because they encapsulate much of their own implementation information, including mapping, placement, and (optionally) routing information. Therefore, implementing a SIM-based design is significantly faster than with traditional modules, because much of the implementation is already complete and incorporated in the SIM. SIMs are relocatable anywhere in an FPGA CLB array. They are also well-suited for use in reconfigurable computing systems, wherein most of a programmed IC remains operational while a portion of the IC is reconfigured. One or more SIMs can be regenerated with new parameter values and downloaded to such a reconfigurable IC. In one embodiment, the SIMs are not parameterizable.
SIMs can be hierarchical, i.e., they can include other SIMs, thereby forming a logical hierarchy. In addition, SIMs can include a separate physical hierarchy which is imposed in addition to the logical hierarchy. SIMs therefore offer the advantages of hierarchical design. In some embodiments, SIM parameters are passed down through the hierarchy from a given "parent" SIM to its sub-SIMs, or "children".
In addition to storing logical and physical structural information (e.g., hierarchical groupings) a SIM has an embedded object (or a reference to such an object) called a "Planner". A Planner object is a floorplanner that algorithmically computes the physical locations of the SIM's constituent sub-SIMs. A floorplanner's algorithm can range from the simple algorithm "assign locations according to a single pre-defined footprint", to complex algorithms such as "perform simulated annealing on the constituent SIMs". A SIM with a context-sensitive floorplanner algorithm is able to adapt its placement to its environment by way of mutable shape and/or I/O port locations.
In addition to optimizing the physical structure, the logical structure of the SIM can also be optimized. SIMs within a design can interact with each other to change either or both of the physical and the logical structures of each SIM to reach the optimal result for the design comprising the entire group of SIMs. For example, in a SIM representing a multiplier, a chosen algorithmic implementation of the multiplier can depend on the size of the area available for the SIM (based on feedback from other SIMs) and the permissible delay through the SIM (based on the amount of time allotted to the SIM from the total available time for the complete design). When timing considerations are paramount, a pipelined multiplier implementation can be chosen. When area is the primary consideration, a non-pipelined implementation results due to the area savings of not using register elements. The shape of the implemented SIM is in each case tailored to the shape of the available space, taking into account the needs of the other SIMs in the design.
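The multiplier example can be sketched in Python as a selection among candidate implementations under area and delay budgets. This is a minimal illustration, not the method of the invention: the class `Implementation`, the function `choose_implementation`, and all area and delay figures are hypothetical values invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Implementation:
    name: str
    area_clbs: int     # hypothetical area estimate, in CLBs
    delay_ns: float    # hypothetical worst-case register-to-register delay

def choose_implementation(candidates, max_area, max_delay, prefer="area"):
    """Pick the candidate that fits both budgets and is best on `prefer`.

    `max_area` stands in for feedback from neighboring modules and
    `max_delay` for the share of the timing budget allotted to this module.
    Returns None when no candidate satisfies both budgets.
    """
    feasible = [c for c in candidates
                if c.area_clbs <= max_area and c.delay_ns <= max_delay]
    if not feasible:
        return None
    if prefer == "area":
        return min(feasible, key=lambda c: c.area_clbs)
    return min(feasible, key=lambda c: c.delay_ns)

# Illustrative figures only: pipelining adds registers (area) but
# shortens the delay between registers.
MULTIPLIERS = [
    Implementation("pipelined", area_clbs=40, delay_ns=8.0),
    Implementation("non_pipelined", area_clbs=28, delay_ns=20.0),
]
```

With a loose timing budget, `choose_implementation(MULTIPLIERS, 50, 25.0)` selects the non-pipelined candidate; tightening `max_delay` to 10.0 leaves only the pipelined one.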
Because SIMs are preferably implemented using object-oriented software, they are easy both to use and to design. The computer code devices in a SIM may, however, be any interpreted or executable code mechanism, such as scripts, interpreters, dynamic link libraries, Java™ classes, and complete executable programs. (Java is a trademark of Sun Microsystems, Inc.) In one embodiment, all information for the SIM is included in a single Java object, which may reference other Java objects.
SIMs can be transferred from the SIM designer (the person who writes the code) to a SIM user (the person designing the circuit using the SIM) in various ways, such as over a data communications link or stored on a data storage medium. (The term "data communications link" as used herein includes but is not limited to the internet, intranets, Wide Area Networks (WANs), Local Area Networks (LANs), and transducer links such as those using Modulator-Demodulators (modems). The term "internet" as used herein refers to a wide area data communications network, typically accessible by any user having appropriate software. The term "intranet" as used herein refers to a data communications network similar to an internet but typically having access restricted to a specific group of individuals, organizations, or computers. The term "data storage medium" as used herein denotes all computer-readable media such as compact disks, hard disks, floppy disks, tape, magneto-optical disks, PROMs (EPROM, EEPROM, Flash EPROM, etc.), DRAMs, SRAMs, and so forth.)
SIMs are easily delivered over a data communications link such as the internet. Therefore, in one embodiment a SIM is protected from duplication or alteration by providing an encrypted SIM that requires a key to decrypt the SIM, using well-known techniques.
Modules Parameterized by Expressions
SIMs are also advantageously adaptable to parametric inputs. In one embodiment, the SIM parameters may be symbolic expressions, which may comprise strings or string expressions, logical (Boolean) expressions (e.g., x AND y, x OR y, x<1), or arithmetic expressions (e.g., x+y), or a combination of these data types (e.g., modulename=basename+"."+((x<1) AND (y+1<5×)), where "modulename" is a string resulting from a combination of string, arithmetic and logical expressions). (The term "string expressions" as used herein means any expression including a character string; therefore every string is also a simple string expression.)
SIMs may have symbolic expressions as parameters; the variables in the expressions parameterizing a SIM are also parameters of the SIM. The variables may also be parameters of the "parent" of the SIM, passed down through the hierarchy to the child SIM.
Parametric expressions are interpreted (parsed and evaluated) at the time the SIM is elaborated. The use of parametric expressions interpreted at elaboration time allows dynamic inheritance (i.e., parameter values specified at elaboration time are passed downward through the SIM hierarchy) and synthesis of actual parameter values, rather than the static value inheritance (i.e., parameter values must be specified at compile time to pass downward through a hierarchy) described by Karchmer and commonly found in programming languages such as C++ and Java.
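Interpretation of parametric expressions at elaboration time can be sketched in Python as follows. This is an illustrative assumption, not the implementation described here: the `Sim` class, the `evaluate` helper, and the expression syntax (Python operators, with lowercase `and`/`or` in place of AND/OR) are all hypothetical. Parameters are stored as unevaluated expression strings and are parsed and evaluated only when `elaborate` runs, with each child inheriting the values computed by its parent.

```python
import ast
import operator

_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}
_COMPARE = {ast.Lt: operator.lt, ast.Gt: operator.gt, ast.Eq: operator.eq}

def evaluate(expr, env):
    """Parse `expr` and evaluate it against the variable values in `env`."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            return env[node.id]
        if isinstance(node, ast.BinOp):
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.BoolOp):
            values = [walk(v) for v in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.Compare) and len(node.ops) == 1:
            op = _COMPARE[type(node.ops[0])]
            return op(walk(node.left), walk(node.comparators[0]))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

class Sim:
    def __init__(self, name, params=None, children=()):
        self.name = name
        self.params = params or {}      # parameter name -> expression string
        self.children = list(children)

    def elaborate(self, inherited=None):
        """Evaluate this module's expressions, then pass the values down."""
        env = dict(inherited or {})
        for key, expr in self.params.items():
            env[key] = evaluate(expr, env)
        result = {self.name: env}
        for child in self.children:
            result.update(child.elaborate(env))
        return result

adder = Sim("adder", {"width": "n + 1", "label": "base + '.add'",
                      "small": "(n < 16) and (width < 10)"})
top = Sim("top", {"n": "8", "base": "'alu'"}, [adder])
```

Here the child's `width` and `label` depend on the parent's `n` and `base`, whose values are not known until `top.elaborate()` is called; changing the parent's expressions changes the child's results without touching the child.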
Including Placement Information in SIMs
Each SIM may include or reference its own implementation information, including physical layout information in the form of a floorplanner. If a SIM does not include or reference a floorplanner, its physical layout is computed by the floorplanner of one of its ancestors in the physical hierarchy (i.e., the SIM's parent or a parent of its parent, etc.). In one embodiment, each floorplanner is implemented as a Java object.
Floorplanners can vary widely in complexity. One simple type of floorplanner chooses from among a plurality of available precomputed placements, depending on the area or timing constraints imposed on the floorplanner. Precomputed placements for this floorplanner could be implemented as a list of RLOC constraints corresponding to the elements in the SIM and designating the relative positions of the elements. Precomputed placements for a given SIM may include, for example, one columnar organization (implemented as a column of elements), one row organization (implemented as a row of elements), one or more rectangular shapes, and so forth. The precomputed placements may all apply to the same logical implementation, or a SIM may include two or more logical implementations for the circuit, and different precomputed placements may apply to different logical implementations.
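A simple floorplanner of this kind can be sketched in Python as a lookup over precomputed placements. The sketch assumes each placement is a list of relative (row, column) offsets, in the spirit of RLOC constraints, and that the only constraint is the size of the available rectangle; the names `footprint`, `choose_placement`, and `PLACEMENTS` are hypothetical.

```python
def footprint(placement):
    """Return (rows, columns) of the bounding box of relative offsets."""
    rows = max(r for r, _ in placement) + 1
    cols = max(c for _, c in placement) + 1
    return rows, cols

def choose_placement(placements, avail_rows, avail_cols):
    """Return the name of the first precomputed placement that fits."""
    for name, placement in placements.items():
        rows, cols = footprint(placement)
        if rows <= avail_rows and cols <= avail_cols:
            return name
    return None   # nothing fits; the caller must handle this case

# Three precomputed organizations of the same four elements.
PLACEMENTS = {
    "column": [(i, 0) for i in range(4)],
    "row": [(0, i) for i in range(4)],
    "square": [(i // 2, i % 2) for i in range(4)],
}
```

A timing-driven variant would attach a delay estimate to each placement and filter on it as well; that extension is omitted here.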
More sophisticated floorplanners implement more versatile placement algorithms. Such placement algorithms might include, for example: a linear ordering algorithm that places datapath logic bitwise in a regular linear pattern; a rectangular mesh algorithm that implements memory in a grid pattern in distributed RAM; a columnar algorithm for counters and other arithmetic logic; or a simulated annealing algorithm for random logic such as control logic. Therefore, in a design including more than one SIM, the design can include two or more SIMs (at the same or different levels of hierarchy) using different placement algorithms. The design as a whole can therefore utilize a non-uniform global placement strategy, a technique that is not practical using prior art methods.
In one embodiment, the user can specify a particular floorplanner to be used for each SIM. The floorplanner can, for example, be specified by attaching a floorplanner object or a parameter to the SIM. In another embodiment, the user specifies a physical area of the target device (e.g., in an interactive floorplanner) where a given floorplanner is to be used, and identifies SIMs to be placed within that area. SIMs placed within that area then automatically use the floorplanner assigned to that area. In another embodiment, a SIM has a default floorplanner that is used if no other floorplanner is specified for the SIM.
In one embodiment, the user can specify a plurality of floorplanners to be applied to different portions of a single FPGA. Where more than one floorplanner is present in a design, the floorplanners may be active simultaneously, and may communicate with each other to implement the SIMs in a fashion that leads to the most desirable overall solution. The most desirable solution is determined by the user, who specifies timing and spatial requirements using known methods such as attaching parameters to a module or setting software options. The ability to run multiple floorplanners simultaneously speeds up the implementation process, due to the parallel processing of the placement task. In other words, this embodiment applies the well-known "divide-and-conquer" strategy to physical layout of an FPGA design.
This aspect of the invention is also useful in that a user has the option of writing his or her own floorplanner, thereby implementing his or her own placement algorithm, instead of (or in addition to) specifying one of an existing set of floorplanners.
In one embodiment, the floorplanner can also receive placement information (e.g., from a constraints file or an interactive floorplanner) specifying areas of the target device that are "off limits" to the floorplanner. This ability is useful, for example, when certain portions of a programmable device have been tested and found defective. By specifying and avoiding these defective areas of the device, the rest of the device can still be utilized. The same methods can be used to reserve areas of the target device to be occupied by other modules or elements. Non-SIM elements are easily integrated into a SIM environment using a "black box" interface.
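Avoidance of off-limits areas can be sketched in Python as below. The representation is an assumption made for illustration: the device is a grid of CLB sites, each off-limits area is an inclusive rectangle `(row0, col0, row1, col1)`, and the module to be placed is a vertical column. The function names are hypothetical.

```python
def free_sites(rows, cols, off_limits):
    """List the CLB sites not covered by any off-limits rectangle."""
    def blocked(r, c):
        return any(r0 <= r <= r1 and c0 <= c <= c1
                   for r0, c0, r1, c1 in off_limits)
    return [(r, c) for r in range(rows) for c in range(cols)
            if not blocked(r, c)]

def place_column(height, rows, cols, off_limits):
    """Find the first origin where a column of `height` CLBs fits."""
    free = set(free_sites(rows, cols, off_limits))
    for c in range(cols):
        for r in range(rows - height + 1):
            if all((r + i, c) in free for i in range(height)):
                return (r, c)
    return None   # no legal location on this device
```

For a 4x2 array with one defective CLB at row 1, column 0, `place_column(3, 4, 2, [(1, 0, 1, 0)])` skips the first column and returns an origin in the second. The same rectangle list can equally describe areas reserved for other modules.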
Routing by Abutment
The invention also provides a method that can be used to include routing information in SIMs. According to this method, routing information is specified in such a manner that the routing between modules, however parameterized and elaborated, occurs by abutment of the modules.
In one embodiment, a SIM automatically places and interconnects child SIMs in a mesh pattern. The mesh is a 2-dimensional object corresponding, for example, to an array of CLBs on an FPGA. In essence, this embodiment allows a SIM to reserve routing resources on a target device (e.g., an FPGA), and allocate these resources to its child SIMs. Using a defined protocol, each child SIM can request and reserve routing resources, as well as placement resources (e.g., flip-flops and function generators), through the parent SIM. The routing resources are not limited to local or nearest neighbor routing.
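The request-and-reserve exchange between a parent SIM and its children can be sketched in Python as follows. The protocol itself is not specified in this section, so everything here is a hypothetical illustration: the `ParentSim` class, the `request` method, the resource kinds, and the resource names are invented for the example.

```python
class ParentSim:
    """Holds the resources of a region and grants them to child modules."""

    def __init__(self, routing_tracks, flip_flops):
        self.available = {
            "routing": set(routing_tracks),
            "flip_flop": set(flip_flops),
        }
        self.reserved = {}   # resource name -> name of the owning child

    def request(self, child, kind, count):
        """Reserve `count` resources of `kind` for `child`.

        Returns the list of granted resources, or None (reserving
        nothing) when the pool cannot satisfy the whole request.
        """
        pool = self.available[kind]
        if len(pool) < count:
            return None
        granted = sorted(pool)[:count]
        for resource in granted:
            pool.remove(resource)
            self.reserved[resource] = child
        return granted

parent = ParentSim(["h0", "h1", "h2"], ["ff0", "ff1"])
```

Because every grant passes through the parent, two children can never be given the same track or flip-flop, and a child whose request is denied can fall back to a different shape or logical implementation.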