In general, an integrated circuit such as a VLSI contains a number of driven elements (e.g., flip-flop cells, sometimes referred to as FFs) which are to be driven synchronously in response to a clock signal. The driven elements are supplied with the clock signal through a clock signal line from a clock supplying source (source).
In this case, as shown in FIG. 47, if several tens of thousands of FFs 502 are connected to a single clock supplying source 501, it is impossible for all of the elements to be driven directly by the clock supplying source because of a fan-out restriction. Here, the term fan-out restriction means a condition which must be satisfied when a driving element (clock supplying source 501) drives driven elements (FFs 502) connected to the output side of the driving element. More concretely, if the driving capability of the driving element does not satisfy the following inequality, the driven elements cannot be driven.
[driving capability of the driving element]>[wiring capacity from the driving element to driven elements on the next stage]+[load of the driven elements on the next stage]
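The inequality above can be illustrated with a minimal sketch. The function name, the load unit, and all numeric values below are assumptions introduced for illustration only; they do not appear in the description.

```python
def satisfies_fan_out(drive_capability, wiring_capacity, element_loads):
    """Return True if the driver can drive its next stage.

    Implements: drive_capability > wiring_capacity + sum(element_loads).
    All quantities are assumed to share one arbitrary load unit.
    """
    total_load = wiring_capacity + sum(element_loads)
    return drive_capability > total_load


# Hypothetical numbers: a driver rated for 50 load units.
few_ffs = [1.0] * 20       # 20 flip-flops, 1 unit each
many_ffs = [1.0] * 30000   # tens of thousands of flip-flops

ok_small = satisfies_fan_out(50.0, 10.0, few_ffs)    # 50 > 30
ok_large = satisfies_fan_out(50.0, 10.0, many_ffs)   # 50 > 30010 fails
```

With these assumed values, the small group satisfies the restriction, while the group of tens of thousands of flip-flops does not.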
As described above, if several tens of thousands of FFs 502 are arranged to be driven directly by a single clock supplying source 501, the sum of the wiring capacity and the loads of the FFs 502 becomes extremely large. For this reason, even if the clock signal has a sharply rising edge as shown in FIG. 48A when it is generated by the clock supplying source, the clock signal comes to have a dulled rising edge as shown in FIG. 48B by the time it has traveled the distance from the clock supplying source 501 to each of the FFs 502, with the result that the clock signal becomes incapable of driving each FF 502.
For this reason, in general, an arrangement of buffering has been made as shown in FIG. 49. That is, clock signal lines extending from the clock supplying source 501 to each of the FFs 502 are wired in a tree-like manner (a clock tree is synthesized), and a plurality of stages (in the case of FIG. 49, two stages are shown) of buffer elements 503 (sometimes referred to as buffer cells) are inserted and placed in the clock signal lines. If such buffering is effected, the fan-out restriction is satisfied in each of the plurality of stages of the buffer elements 503 between the clock supplying source 501 and the plurality of FFs 502.
Further, in order to satisfy the fan-out restriction, a measure known as buffer-sizing is taken, in which the driving capability of each cell provided in a clock system is adjusted or changed. A cell targeted by the buffer-sizing is, in addition to a logic gate or a selector, a buffer element which is provided on a clock signal line as a buffering element as described above. The selector is utilized for selecting a clock signal when a plurality of clock signals are supplied in the circuit under design.
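As a rough illustration of why a plurality of buffer stages is needed, the following sketch estimates the number of stages under a simplifying assumption: the fan-out restriction is reduced to a fixed maximum number of elements per driver. The description itself states the restriction in terms of capacity and load, so the count-based limit and the function name are assumptions made only for this example.

```python
def buffer_stages_needed(num_sinks, max_fan_out):
    """Smallest number of buffer stages so that every driver (the source
    and each buffer) drives at most max_fan_out elements.

    Assumes a uniform tree; a real clock tree need not be uniform.
    """
    if max_fan_out < 2:
        raise ValueError("max_fan_out must be at least 2")
    stages = 0
    reachable = max_fan_out  # sinks the source can drive directly
    while reachable < num_sinks:
        reachable *= max_fan_out  # one more stage multiplies the reach
        stages += 1
    return stages


# Hypothetical limit of 32 elements per driver.
stages_small = buffer_stages_needed(20, 32)      # source alone suffices
stages_large = buffer_stages_needed(30000, 32)   # 32 * 32 * 32 = 32768 >= 30000
```

Under the assumed limit of 32, twenty flip-flops need no buffer stage, whereas thirty thousand flip-flops need two buffer stages.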
According to a conventional method of buffering or buffer-sizing, when clock system nets (clock signal lines or a clock tree) are synthesized in designing a circuit, an initial logic is first reduced into a placement based on a netlist obtained by a logic design, and thereafter any portion of the placement in which the fan-out restriction is not satisfied is subjected to a process of buffering or buffer-sizing so that the portion satisfies the fan-out restriction. If a portion of the placement reduced from the initial logic already satisfies the fan-out restriction, the portion is not subjected to the process of buffering or buffer-sizing.
The above-described design scheme is disclosed in, for example, a reference entitled “A Methodology and Algorithms for Post-Placement Delay Optimization” at pages 327 to 332 of the Proceedings of the 31st ACM/IEEE Design Automation Conference, written by Lalgudi N. Kannan, Peter R. Suaris, and Hong-Gee Fang.
Recently, owing to the progress of fabrication technology, an integrated circuit such as a VLSI has come to have a great number of circuit components and hence a complicated structure. With this tendency, the number of elements supplied with a clock signal on the integrated circuit tends to increase, leading to difficulty in designing the clock logic of the integrated circuit.
Further, since a gate is subjected to a microfabrication technology, the integrated circuit device is further requested to process a signal at a higher speed. Thus, it is desired to establish a technology which makes it possible to synthesize a clock tree with a skew reduced. That is, it is desired for the clock signal to reach all driven elements (such as FFs) from a clock supplying source substantially at the same time (ideally exactly at the same time). A scattering of time (of delay) it takes for the clock signal to be transmitted from the clock supplying source to each of the driven elements is known as a clock skew. As described above, as a clock frequency is increased in accordance with the request for the high speed processing, the clock skew is requested to be substantially completely eliminated.
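The notion of clock skew can be made concrete with a short sketch. Here the scattering of delay is measured as the difference between the largest and smallest arrival delay, which is one common measure and is an assumption of this example; the flip-flop names and delay values are hypothetical.

```python
def clock_skew(arrival_delays):
    """Return the spread (max minus min) of clock arrival delays.

    arrival_delays maps each driven element to the delay from the
    clock supplying source to that element, in any consistent unit.
    """
    if not arrival_delays:
        raise ValueError("at least one driven element is required")
    delays = arrival_delays.values()
    return max(delays) - min(delays)


# Hypothetical delays in nanoseconds.
delays_ns = {"ff_a": 1.20, "ff_b": 1.35, "ff_c": 1.28}
skew_ns = clock_skew(delays_ns)  # difference between ff_b and ff_a
```

A skew of zero would correspond to the ideal case in which the clock signal reaches all driven elements at exactly the same time.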
However, as described above, according to the conventional way of buffering, the buffering is carried out only to satisfy the fan-out restriction. Therefore, to reduce the skew by arranging the clock tree in a balanced manner is not taken into account. That is, although the net is subjected to the buffering in such a manner that the driven elements (such as FF elements) are divided into groups so that each of the groups can be driven by a single buffer element, the conventional way of group dividing is effected only to satisfy the fan-out restriction. Therefore, the number of driven elements or the size of load capacity in each group is not taken into consideration. If scattering in the number of elements or load capacity for every group is large, then scattering of wire length from the final stage buffer element to the driven element in each group also becomes large, with the result that it becomes very difficult to adjust the clock skew. Accordingly, in order to satisfy the request of high speed processing, or reduction of the clock skew, which becomes more demanding in recent years, the clock distributing system for distributing the clock to each group shall be arranged in a balanced fashion so as to suppress the scattering in number of elements or the load capacity of every group.
Further, according to the conventional way of buffering, the buffering is not effected in connection with layout information (physical information). Therefore, a layout can result in which a single buffer element is obliged to drive a plurality of flip-flop elements which are physically remote from the buffer element. As a result, there is a fear that skew adjustment becomes difficult at the time of layout or that the clock signal line becomes long, either of which leads to deterioration in circuit quality.
The present invention is made in view of the above aspect. Thus, a first object of the present invention is to provide a method of optimizing signal lines within a circuit, an optimizing apparatus and a recording medium having an optimizing program stored therein in which delay of signal from a signal supplying source to each of elements is optimized to positively reduce a skew, and it becomes possible to realize a circuit designing which can cope with a request of high speed processing.
Incidentally, as described above, the integrated circuit such as an LSI includes a number of flip-flop cells and buffer cells for driving these flip-flop cells. In general, a single buffer cell is forced to drive a plurality of flip-flop cells through a clock signal line. When a layout of circuit elements and wiring of clock signal lines are determined in an LSI, it is requested to shorten the clock wiring length for reducing a signal propagation delay time and relieving the signal lines from crowded wiring. Further, circuit elements which are designed to be supplied with a clock signal are requested to undergo a suitable circuit element placement so as to suppress a time difference (known as clock skew).
Now a conventional LSI layout method (circuit designing method) will be described in detail with reference to FIGS. 50 to 65.
FIG. 50 is a flowchart useful for describing the conventional LSI layout method (circuit designing method). As shown in FIG. 50, the LSI layout method consists of a logic design step S27, a clock layout step S28 and an ordinary net wiring step S29 for wiring signal lines other than the clock signal lines.
The logic design step S27 is composed of step S271 of synthesizing logic and step S272 of creating a clock tree.
In step S271 of synthesizing logic, a hardware description language (HDL) described in a register transfer level (RTL) of elements constituting the LSI is formed into a logic which can be implemented into a layout. Thus, a first netlist (1) formed of modules is created. In clock tree creating step S272, a clock tree before having a buffer cell introduced into the LSI is created based on the first netlist (1).
FIG. 51 is a diagram showing one example of a clock tree before having a buffer cell introduced into the integrated circuit. In FIG. 51, reference numeral 281 depicts a module, 282 to 284 small modules, 285 a clock source as a clock supplying source, and 286 to 288 flip-flop cells. Since the circuit at this stage has no buffer cell introduced thereinto, the clock source 285 directly drives the flip-flop cells 286 to 288.
Thereafter, the first netlist (1) is subjected to a buffering processing in which buffer cells are introduced into the clock tree to create a second netlist (2).
FIG. 52 is a diagram showing a clock tree within the module of the LSI having buffer cells introduced thereinto. As shown in FIG. 52, buffer cells 291 to 296 are additionally introduced into the clock tree of FIG. 51. The number of buffer cells in each module is determined depending on the number of driven flip-flop cells. As shown in FIG. 52, since the numbers of flip-flop cells 286 and 287 in the small modules 282 and 283 are small, the group of flip-flop cells of each of these small modules is allocated a single buffer cell 292 or 293 and driven by the same. However, since the number of flip-flop cells 288 in the small module 284 is larger than the numbers of flip-flop cells 286 and 287 in the small modules 282 and 283, the small module 284 is arranged as a tree structure so that the flip-flop cells in the small module 284 are driven by three buffer cells 294, 295 and 296.
Referring back to FIG. 50, the contents of the second netlist (2) are transferred to clock layout step S28.
Clock layout step S28 is composed of steps S281 to S287.
In step S281, a floorplan is acquired. The floorplan is layout information obtained from a designer of the LSI.
In step S282, the buffer cells and the flip-flop cells are subjected to an initial placement as shown in FIG. 53 based on the floorplan so that the clock wiring length becomes as short as possible. FIG. 53 is a diagram showing a result of the initial placement of the buffer cells and the flip-flop cells in the LSI module. In FIG. 53, lines are drawn between a buffer cell 301 and each of six flip-flop cells 310 to 315 so as to indicate that the six flip-flop cells are driven by the buffer cell.
At step S283, clock buffers are placed in a bottom-up fashion. A placement in a bottom-up fashion means that a flip-flop cell farthest from the clock source is given placement with priority. According to the scheme, buffer cells on the clock bus are placed in a bottom-up fashion based on flip-flop cell placement information so that the distance from the buffer cell 301 to each of the flip-flop cells 310 to 315 becomes equal as far as possible. As a bottom-up placement scheme, there can be a scheme in which a buffer cell is placed at the center of the placement area where flip-flop cells driven by the buffer cell are located. Alternatively, there can be a scheme in which a buffer cell is placed at the center of gravity of flip-flop cells which are driven by the buffer cell.
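The two bottom-up placement schemes mentioned above can be sketched as follows. It is assumed here that "the center of the placement area" means the center of the bounding box enclosing the driven flip-flop cells, and that all cells have equal weight when the center of gravity is computed; the coordinates are hypothetical.

```python
def bounding_box_center(points):
    """Center of the smallest axis-aligned rectangle enclosing the cells."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)


def center_of_gravity(points):
    """Arithmetic mean of the cell positions (equal weights assumed)."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


# Six hypothetical flip-flop positions driven by one buffer cell.
ff_positions = [(0, 0), (4, 0), (0, 2), (4, 2), (1, 1), (1, 0)]

area_center = bounding_box_center(ff_positions)  # (2.0, 1.0)
gravity_center = center_of_gravity(ff_positions)
```

The two schemes coincide when the flip-flop cells are spread symmetrically; in this example the center of gravity is pulled toward the side where more cells are located.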
FIG. 54 is a diagram showing an example of placement within an LSI module in which the buffer cell 301 is placed at the center of the placement area where flip-flop cells 310 to 315 driven by the buffer cell 301 are located.
At step S284, in order to determine the restriction on the wiring length from the buffer cell to each of the flip-flop cells, a placement restriction area is created for the flip-flop cells. The placement restriction area is a lozenge-shaped area centered on the buffer cell driving the flip-flop cells. FIG. 55 is a diagram showing an example of a placement restriction area 322 centered on a buffer cell 321. In the next step (step S285), the placement processing is retried in consideration of the placement restriction area 322. In this case, the placement of the buffer cell 321 resulting from the bottom-up placement at step S283 is fixed, the flip-flop cells are subjected to the placement restriction based on the placement restriction area created at step S284, and cells other than the clock buffer are again subjected to placement. FIGS. 56 and 57 are diagrams showing a result of the retried cell placement. As shown in FIGS. 56 and 57, the buffer cell 321 is located at the center of the placement restriction area 322, and the flip-flop cells 331 to 336 are located within the placement restriction area 322. Rectangular shapes in FIG. 57 indicate flip-flop cells.
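A lozenge-shaped area centered on a buffer cell can be modeled, as an assumption of this sketch, as the set of points whose Manhattan (horizontal plus vertical) distance from the buffer cell does not exceed a given limit. This fits the stated purpose of bounding wiring length when wires run horizontally and vertically. The function names, the limit, and the coordinates are hypothetical.

```python
def manhattan_distance(a, b):
    """Horizontal plus vertical distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def in_restriction_area(cell, buffer_pos, max_wire_length):
    """True if the cell lies inside the lozenge centered on the buffer."""
    return manhattan_distance(cell, buffer_pos) <= max_wire_length


buffer_pos = (10, 10)   # hypothetical position of the buffer cell
limit = 5               # hypothetical wiring length limit

inside = in_restriction_area((12, 13), buffer_pos, limit)   # 2 + 3 = 5
outside = in_restriction_area((14, 13), buffer_pos, limit)  # 4 + 3 = 7
```

A cell found to be outside the area would be a candidate for being moved when the placement is retried.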
At step S286, a special arrangement of wiring is effected on a clock tree final stage net. In this case, three types of processing shown in FIGS. 58 to 60 are carried out. That is, FIG. 58 shows a setting of cluster bar information 351 to 357. FIG. 59 shows a wiring of branch lines each connecting a flip-flop cell and the cluster bar. FIG. 60 shows a route wiring connecting the driving buffer cell 321 to the cluster bars.
Thereafter, special wiring of the net connecting buffers to one another is effected. In this case, as shown in FIG. 61, signal lines are radiated in arbitrary directions from a buffer cell 381 to four buffer cells 382 to 385 which are driven by the buffer cell 381. When the above-described steps S282 to S287 have been carried out, the placement and the wiring of the clock tree are accomplished so that the signal propagation delay time from a clock source to a flip-flop and the skew value are suppressed to within a limit value. Thereafter, a wiring processing for an ordinary net is effected at step S29.
The above-described LSI layout method is a scheme mainly utilized when designing a gate array or an embedded array ASIC (Application Specific Integrated Circuit). In this scheme, when the process goes to a step after the logic design stage, the clock layout and the LSI implementation design are effected without changing the netlist. According to the above-described conventional method, clock signal lines are subjected to a layout process in such a manner that the clock tree logic is not changed and the clock signal is transmitted while satisfying conditions on the signal propagation delay time and the skew value determined by a user.
However, according to the above-described conventional method, when the processing stays in step S271 in which a logic synthesis (logic design) is effected and in step S272 in which the clock tree is created, the logic design is carried out without taking the placement position of flip-flop cells and clock signal line wiring paths into consideration. For this reason, if the clock layout is effected on the logic information so as to satisfy the clock restricting condition, then wiring becomes crowded and clock wiring and signal wiring after the clock wiring will encounter difficulties which prevent the clock wiring and the signal wiring from being carried out.
The difficulties lying in the process of the clock wiring and the signal wiring will hereinafter be described in detail.
When the clock tree is created at step S272, floorplan and placement of flip-flops are not taken into consideration. Therefore, if the placement step is completed up to the stage of placing buffer cells, as shown in FIG. 62, floorplan blocks 392 and 395 on a semiconductor chip 391 can be placed so as to be remote from each other in spite of the fact that both of the floorplan blocks 392 and 395 are driven by a buffer cell 396. Similarly, floorplan blocks 393 and 394 on the semiconductor chip 391 can be placed so as to be remote from each other in spite of the fact that both of the floorplan blocks 393 and 394 are driven by a buffer cell 397. In this case, signal lines 398 to 401 of a network (hereinafter referred to as net) for driving the floorplan blocks remote from each other become long, with the result that a propagation delay time of the clock signal becomes long and wiring is crowded (first problem).
At step S285 of FIG. 50, in which the placement of flip-flops is retried in consideration of the placement restriction areas, as shown in FIG. 63, if placement restriction areas 430 and 431 overlap one another and the flip-flop cells are subjected to the retry of placement, then a number of flip-flop cells can be concentrated in the common area of the placement restriction areas 430 and 431 as shown in FIG. 64. Such a concentration of the flip-flop placement causes crowded wiring (second problem). In FIGS. 63 and 64, reference numerals 410 and 420 depict buffer cells, 411 to 416 flip-flop cells driven by the buffer cell 410, 421 to 426 flip-flop cells driven by the buffer cell 420, and 430 and 431 placement restriction areas for the buffer cells 410 and 420, respectively. FIG. 64 is a diagram showing an example in which the flip-flop cells 411 to 416 are moved into the placement restriction area 430 while the flip-flop cells 421 to 426 are moved into the placement restriction area 431.
Further, if the flip-flop cells are moved from their initial placement positions to the placement restriction area, then a signal line of a net which is connected to any component other than a flip-flop cell supplied with the clock signal may become long, with the result that wiring other than the clock wiring also becomes crowded (third problem). FIG. 64 contains examples of signal lines 441 to 445 which become excessively long.
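Whether two placement restriction areas share a common area can be checked with a short sketch. As an assumption of this example, each area is modeled as a lozenge defined by a center and a Manhattan-distance limit; two such lozenges overlap exactly when the Manhattan distance between their centers does not exceed the sum of their limits. All positions and limits are hypothetical.

```python
def manhattan_distance(a, b):
    """Horizontal plus vertical distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def areas_overlap(center_a, limit_a, center_b, limit_b):
    """True if the two lozenge-shaped areas have at least one common point."""
    return manhattan_distance(center_a, center_b) <= limit_a + limit_b


def cells_in_common_area(cells, center_a, limit_a, center_b, limit_b):
    """Cells lying inside both areas, where crowding may occur."""
    return [
        c for c in cells
        if manhattan_distance(c, center_a) <= limit_a
        and manhattan_distance(c, center_b) <= limit_b
    ]


# Two hypothetical buffer positions with hypothetical limits.
overlap = areas_overlap((0, 0), 5, (6, 2), 4)        # distance 8 <= 9
no_overlap = areas_overlap((0, 0), 5, (20, 0), 4)    # distance 20 > 9
crowded = cells_in_common_area([(3, 1), (0, 0), (8, 2)], (0, 0), 5, (6, 2), 4)
```

In this example only the cell at (3, 1) falls inside both areas, which is where a concentration of flip-flop cells would be expected.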
Furthermore, as shown in FIG. 61, if a plurality of buffer cells are driven by a single buffer cell, wiring between the buffer cells can contain a crowded portion of wiring in signal lines between buffer cells (fourth problem). FIG. 61 shows an arrangement of a buffer cell 381 and a plurality of buffer cells 382 to 385 driven by the buffer cell 381, wherein wiring between the buffer cell 381 and each of the buffer cells 382 and 383 is crowded.
As a problem other than the problem of crowded wiring, there can be a problem of a skew value of a signal propagation delay time (fifth problem). That is, at step S285 of FIG. 50, the placement processing is retried while the placement restriction of flip-flop cells is effected, and flip-flop cells are collected, whereby the skew value is suppressed. However, a flip-flop cell, for example a flip-flop cell 371 shown in FIG. 60, can be located at a place remote from a cluster of the cell. In this case, a signal line extending to the flip-flop cell becomes long, and hence the propagation delay time of a signal reaching the cell is increased, with the result that the skew becomes large.
Further, at step S272 of FIG. 50 in which the clock tree is created, buffer cells are placed so as to satisfy the fan-out restriction. However, if a placement is effected based on the above scheme, as shown in FIG. 52, there can be formed portions of the net in which the numbers of buffer stages from the clock source 285 to a flip-flop cell differ from each other, such as the module 282 and the module 284 or the module 283 and the module 284. In this case, a clock path including two stages of buffers and a clock path including three stages of buffers have signal propagation delay times different from each other, with the result that the skew becomes large (sixth problem).
Furthermore, according to the conventional method shown in FIG. 65, at step S282 of FIG. 50 in which the initial placement is determined, flip-flop cells 451 to 454 are placed with clock paths 457 and 458 extending from clock buffer cells 455 and 456 taken into account. Therefore, wiring paths of a data path 459 connecting the flip-flop cells to one another intersect one another, resulting in complicated wiring. In addition, the data path 459 and the clock paths 457 and 458 become long, which causes a timing error (seventh problem).
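The difference in the number of buffer stages between clock paths can be counted with a short sketch. The tree below is a hypothetical topology only loosely modeled on the description of FIG. 52 (one path with two buffer stages and one with three); the node names and the exact connections are assumptions, since the figure itself is not reproduced here.

```python
def buffer_stages(sink, driver_of, buffers):
    """Count the buffer cells on the path from the clock source to sink."""
    stages = 0
    node = driver_of.get(sink)
    while node is not None:
        if node in buffers:
            stages += 1
        node = driver_of.get(node)  # the source has no driver, ending the walk
    return stages


# Hypothetical tree: each entry maps a cell to the cell that drives it.
driver_of = {
    "buf_top": "source",
    "buf_a": "buf_top", "ff_a": "buf_a",          # path with two buffers
    "buf_b": "buf_top", "buf_b1": "buf_b",
    "ff_b": "buf_b1",                             # path with three buffers
}
buffers = {"buf_top", "buf_a", "buf_b", "buf_b1"}

stages_a = buffer_stages("ff_a", driver_of, buffers)
stages_b = buffer_stages("ff_b", driver_of, buffers)
unbalanced = stages_a != stages_b
```

When the stage counts differ, as they do here, the two paths accumulate different buffer delays, which is the source of the enlarged skew described above.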
The above-introduced first problem derives from a fact that the layout information is not taken into account when the clock tree is created.
The above-introduced second and third problems derive from facts that the placement restriction areas can be overlaid on one another, and driven cells are forcibly moved into the overlapped portion of the placement restriction areas when the placement is retried.
The above-introduced fourth problem derives from a fact that a wiring between a buffer cell for driving another cell and buffer cells which are driven by that buffer cell is arranged merely as a radiated fashion.
The above-introduced fifth problem derives from a fact that the conventional method employs a scheme that flip-flop cells are merely subjected to a placement retry within the placement restricted area.
The above-introduced sixth problem derives from a fact that if the clock tree includes modules having different numbers of buffer stages, then the clock skew arising in the paths becomes large.
The above-introduced seventh problem derives from a fact that flip-flop cells are placed upon initial placement with the clock paths taken into account.
Accordingly, a second object of the present invention is to propose a method of designing a circuit and a recording medium having stored therein a program for designing a circuit in which the circuit can be designed such that a clock signal line and signal line other than the clock signal line are short, a wiring of the circuit can be prevented from being crowded, a signal propagation delay time is short, and the scattering of the signal propagation delay time can be properly suppressed.