The invention relates generally to placement of resources in a localized area of a network, and more particularly to placement of logic blocks in a local area of an integrated circuit for enhancing dedicated routing resources.
Programmable logic devices (PLDs) exist as a well-known type of integrated circuit (IC) that may be programmed by a user to perform specified logic functions. There are different types of programmable logic devices, such as programmable logic arrays (PLAs) and complex programmable logic devices (CPLDs). One type of programmable logic device, called a field programmable gate array (FPGA), is very popular because of a superior combination of capacity, flexibility, time-to-market, and cost. An FPGA typically includes an array of configurable logic blocks (CLBs) surrounded by a ring of programmable input/output blocks (IOBs). The CLBs and IOBs are interconnected by a programmable interconnect structure. The CLBs, IOBs, and interconnect structure are typically programmed by loading a stream of configuration data (bitstream) into internal configuration memory cells that define how the CLBs, IOBs, and interconnect structure are configured. The configuration bitstream may be read from an external memory, conventionally an external integrated circuit memory such as an EEPROM, EPROM, PROM, and the like, though other types of memory may be used. The collective states of the individual memory cells then determine the function of the FPGA.
Referring to FIG. 1, there is shown a schematic diagram of a CLB architecture 10 in accordance with prior art. Other details regarding CLB architecture 10 may be found in U.S. Pat. Nos. 6,262,597, 6,118,298 and 5,889,413. CLB 10 comprises four slices, S0, S1, S2 and S3. Each slice S0, S1, S2 and S3 includes two look-up tables (LUTs), namely, LUTs F0 and G0, F1 and G1, F2 and G2, and F3 and G3, respectively. CLB 10 may be cascaded with other CLBs, where carry data inputs 11 and 12 may be from a previous CLB stage, and carry data outputs 21 and 22 may be provided to a subsequent CLB stage. Clock signals, CLK Ø0 and CLK Ø1, are out-of-phase with respect to one another, and each such clock signal is provided to LUTs F3 and G3, F2 and G2, F1 and G1, and F0 and G0. Each slice S0, S1, S2 and S3 has data inputs and data outputs X and Y, namely, D-INs BX0 and BY0, BX1 and BY1, BX2 and BY2, and BX3 and BY3, respectively, and D-OUTs X0 and Y0, X1 and Y1, X2 and Y2, and X3 and Y3, respectively. Each LUT F3 and G3, F2 and G2, F1 and G1, and F0 and G0 receives a respective set of address signals, namely, either address signals F1, F2, F3 and F4 or address signals G1, G2, G3 and G4.
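As an illustration only, the slice and LUT organization of CLB 10 described above can be modeled as a small data structure. The following Python sketch is not part of the referenced patents; the class names Lut and Slice and the function build_clb are hypothetical, and only the slice, LUT, data input, data output and address signal names are taken from the description of FIG. 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lut:
    name: str            # e.g. "F0" or "G0"
    address_pins: tuple  # the four address signals received by this LUT


@dataclass(frozen=True)
class Slice:
    index: int           # 0..3 for slices S0..S3
    luts: tuple          # the F and G LUTs of this slice
    data_inputs: tuple   # D-INs BXn and BYn
    data_outputs: tuple  # D-OUTs Xn and Yn


def build_clb(num_slices=4):
    """Build a hypothetical model of CLB 10: four slices of two LUTs each."""
    slices = []
    for i in range(num_slices):
        f_lut = Lut(f"F{i}", ("F1", "F2", "F3", "F4"))
        g_lut = Lut(f"G{i}", ("G1", "G2", "G3", "G4"))
        slices.append(
            Slice(i, (f_lut, g_lut), (f"BX{i}", f"BY{i}"), (f"X{i}", f"Y{i}"))
        )
    return tuple(slices)


clb = build_clb()
```

Carry and clock connections are omitted from this sketch for brevity.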
Conventionally, in configuring an integrated circuit, and in particular a CLB, a placer or packer tool (a well-known software tool for locating components or logic blocks for laying out an integrated circuit design) places two LUTs into specific locations in a slice and places four slices into a CLB defined area with little distinction between these four locations. Accordingly, dedicated connections or "fast connects" were utilized through slice placement rules and practically random placement within a CLB area. By "fast connect" in the context of an FPGA, it is meant a dedicated routing resource or connection allowing a LUT output in a CLB to drive one or more specific LUT inputs within the same CLB. However, as is known, fast connects or other dedicated routing resources exist in integrated circuits other than FPGAs, or even more generally PLDs. Accordingly, it should be appreciated that fast connects are not limited to any particular architecture, and thus exist outside the context of PLDs.
A conventional placer rule for placement of LUTs is that if one of two LUTs drives the other of the two LUTs, then such two LUTs are placed at specific locations that allow a fast connect to be used for that connection. Conventionally, locations of LUTs coupled for fast connection remain fixed for a remainder of a placer tool's process flow. A problem emerges due to fixing such placement because not all fast connects will have the same speed. Moreover, placement of slices, and therefore LUTs, within a CLB area is insufficiently controlled to ensure that all fast connects will have optimized speed.
In actuality, speed of fast connects can vary so dramatically that critical connections for a user-defined timing path associated with LUTs configured for fast connect can fail to get a "fast" connection. Critical connections are connections with difficult-to-meet timing targets, usually having a negative connection slack or only a marginally positive connection slack. In other words, an attempt to use a fast connect to meet an important delay target for a circuit design programmed into a PLD may not be completely realized, as fast connect speed may be insufficient. Fast connect delays conventionally range from approximately 1 picosecond (ps) to approximately 300 ps. Non-fast connect delays conventionally are approximately 380 ps or more.
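By way of a hedged illustration, connection slack may be computed as a delay target minus an actual connection delay, so that a negative value indicates a missed target. The Python sketch below assumes that definition; the function names and the 20 ps margin used to flag a marginally positive slack are hypothetical and do not come from any particular tool.

```python
def connection_slack(target_ps, delay_ps):
    """Slack in picoseconds; negative means the delay target is missed."""
    return target_ps - delay_ps


def is_critical(target_ps, delay_ps, margin_ps=20.0):
    """Flag a connection whose slack is negative or only marginally positive.

    margin_ps is an assumed threshold chosen for illustration only.
    """
    return connection_slack(target_ps, delay_ps) < margin_ps


# A hypothetical 250 ps delay target evaluated against two fast connects
# taken from the approximate 1 ps to 300 ps range noted above.
slow_fast_connect = connection_slack(250.0, 300.0)   # target missed
quick_fast_connect = connection_slack(250.0, 100.0)  # target met
```

In this example, a fast connect at the slow end of the range misses the 250 ps target even though it is a dedicated connection, which is the problem described above.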
In instances where a LUT in a slice drove another LUT in another or a same slice, use of fast connects was allowed through LUT input pin swapping, or "pin swapping," provided a fast connect existed for a current LUT placement. A router tool swapped pins during a post-placement routing phase. However, there are instances where pins are not swappable, such as certain LUT configurations, for example when configured as a random access memory or a shift register.
Moreover, though the same source address signals may be provided to a "G-" LUT and an "F-" LUT, routing of such address signals may not be equivalent. Thus, propagation delay of each of address signals F1, F2, F3, F4, G1, G2, G3 and G4 may be different within each of slices S0, S1, S2 and S3 and between such slices within a CLB.
Accordingly, it would be desirable and useful to improve placement of slices within a CLB area to enhance the number of fast connects available. Further, it would be desirable and useful to rank fast connects according to speed to facilitate utilization of faster fast connects over slower fast connects, and more particularly to facilitate taking into account speed differences of fast connects for addressing design timing constraints.
An aspect of the invention is a process that facilitates improving or maximizing use of fast connects by changing placement of logic within a CLB. Another aspect of the invention is a means for taking into account speed differences between different fast connects. Additionally, user-timing constraints may be taken into consideration in another aspect of the invention.
An aspect of the invention is a method for improving network performance where a local area of the network is obtained. Generated are signal propagation placement options for dedicated resources of the network within the local area obtained, which are then scored at least partially responsive to respective delay targets. The dedicated resources are then placed responsive to a score of one of the signal propagation placement options.
Another aspect of the invention is a method for improving placement of dedicated resources of a network of circuit blocks. A local area, where the dedicated resources are located, is obtained, and placement options are generated for placing the circuit blocks in the local area. The placement options are scored, where the scoring includes costing the placement options at least partially responsive to delays of the dedicated resources. The circuit blocks are placed in the local area responsive to the placement options costed.
Another aspect of the invention is an integrated circuit having a local area comprising dedicated resources and circuit blocks coupled one to another and positioned for improved dedicated resource usage by: obtaining the local area; generating circuit-block level placement options; scoring the circuit-block level placement options at least partially responsive to delay targets; and placing the circuit blocks responsive to a score of one of the circuit-block level placement options.
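Purely as a non-limiting sketch, the obtaining, generating, scoring and placing steps recited in the aspects above may be illustrated in Python as follows. Everything in this example is an assumption made for illustration: the fast connect delay table, the block names, the delay targets, and the choice of cost (total delay-target violation, with total delay as a tie-breaker) are hypothetical, and exhaustive enumeration is used only because a local area of four locations is small. The 380 ps fallback reflects the approximate non-fast connect delay noted earlier.

```python
from itertools import permutations

# Assumed delay when no fast connect exists between two locations.
GENERAL_ROUTING_PS = 380.0

# Hypothetical fast connect delays, keyed by (driver location, load location).
FAST_CONNECT_PS = {
    (0, 1): 50.0,
    (1, 2): 120.0,
    (2, 3): 290.0,
    (0, 3): 200.0,
}


def connection_delay(src_loc, dst_loc):
    """Delay of a connection given the locations of its driver and load."""
    return FAST_CONNECT_PS.get((src_loc, dst_loc), GENERAL_ROUTING_PS)


def generate_options(blocks, locations):
    """Yield every assignment of circuit blocks to distinct locations."""
    for perm in permutations(locations, len(blocks)):
        yield dict(zip(blocks, perm))


def score(option, connections):
    """Cost an option: (total delay-target violation, total delay); lower is better."""
    violation = 0.0
    total = 0.0
    for driver, load, target_ps in connections:
        delay = connection_delay(option[driver], option[load])
        violation += max(0.0, delay - target_ps)
        total += delay
    return (violation, total)


def place(blocks, locations, connections):
    """Return the lowest-cost placement option for the local area."""
    return min(generate_options(blocks, locations),
               key=lambda option: score(option, connections))


# Local area: four locations; circuit blocks A-D with two timed connections.
blocks = ["A", "B", "C", "D"]
connections = [("A", "B", 100.0), ("B", "C", 300.0)]  # (driver, load, target ps)
best = place(blocks, [0, 1, 2, 3], connections)
```

With these assumed values, only the placement of block A at location 0, block B at location 1 and block C at location 2 meets both delay targets, so that option is selected. A practical tool would likely use a richer cost and would not necessarily enumerate all options.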