1. Field of the Invention
The present invention relates to a layout design method of a semiconductor integrated circuit device with low power consumption, in particular, to a layout method of a semiconductor integrated circuit device having a gated clock circuit.
2. Description of Related Arts
Semiconductor integrated circuit devices, which have recently been growing in scale and in operating speed, are generally designed as clock-synchronous circuits, and the power consumed by their clock signal part has been increasing. As an effective method for reducing the power consumption of the clock signal part, a gated clock circuit method has been proposed.
As shown in FIG. 33A, a conventional clock circuit has a flip-flop FF connected directly to a clock source CS. In contrast, as shown in FIG. 33B, a gated clock circuit has a cell GC (hereinafter referred to as a "gated cell") placed between the clock source CS and the flip-flop FF. The gated cell has, as inputs, at least a clock signal and a control signal for controlling the supply of the clock signal (hereinafter referred to as an "enabling signal"), and serves to stop the supply of the clock signal to a flip-flop FF which does not operate under the intended conditions. The number of flip-flops FF which can be controlled by each gated cell GC of the gated clock circuit is not constant, since it differs depending on the circuit. Here, the gated circuit GX is a circuit in which a gated cell GC and the flip-flops FF connected to that gated cell GC are combined.
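As a minimal illustrative sketch, and not part of the circuits of FIG. 33A and FIG. 33B, the behavior of a gated cell may be modeled as follows. The sketch assumes an AND-type gated cell and uses the hypothetical function name `gated_clock`; the logic actually used in the gated cell GC is not specified in this description.

```python
def gated_clock(clock, enable):
    """Model an assumed AND-type gated cell: pass the clock only while enabled."""
    return [c & e for c, e in zip(clock, enable)]


# Hypothetical sampled waveforms (1 = high, 0 = low).
clock = [0, 1, 0, 1, 0, 1]
enable = [1, 1, 1, 0, 0, 0]

# Clock pulses reach the flip-flop only while the enabling signal is high.
gated = gated_clock(clock, enable)
```

In this model, a flip-flop behind the gated cell receives no clock transitions once the enabling signal goes low, which is the source of the power saving.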
On the other hand, in a clock-synchronous semiconductor integrated circuit device, when the difference in the delay time of the clock signal between the flip-flops to which the clock signal is supplied (hereinafter referred to as "skew") is large, the problem occurs that the semiconductor integrated circuit device does not operate or operates incorrectly.
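For illustration only, the skew can be computed from clock arrival times as in the following sketch. It assumes that skew is taken as the largest arrival time minus the smallest one; the function name `clock_skew` and the arrival times are hypothetical.

```python
def clock_skew(arrival_times):
    """Return the skew, assumed here to be max minus min clock arrival time."""
    times = list(arrival_times.values())
    return max(times) - min(times)


# Hypothetical clock arrival times (in nanoseconds) at three flip-flops.
arrivals = {"FF1": 1.20, "FF2": 1.35, "FF3": 1.28}
skew = clock_skew(arrivals)
```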
Therefore, for a conventional clock circuit as shown in FIG. 33A, a layout method of a semiconductor integrated circuit device referred to as a "clock tree synthesis system" (hereinafter abbreviated as "CTS") is widely used in order to reduce the skew of the clock signal. One example of a layout method of the semiconductor integrated circuit device of the "clock tree synthesis system" is proposed in "An Exact Zero-Skew Clock Routing Algorithm" (IEEE Trans. Computer-Aided Design, vol. 12, no. 2, pp. 242-249, February 1993).
A layout method of the semiconductor integrated circuit device of the "clock tree synthesis system" is described with reference to FIG. 34A, which shows the placement result of the clock circuit in FIG. 33A.
The layout method of the semiconductor integrated circuit device of the "clock tree synthesis system" has the purpose of implementing a circuit in which the delay time from the clock source CS to each flip-flop FF is minimized and the skew is minimized. To this end, based on the placement result of the flip-flops FF belonging to a clock net, clustering is performed to divide the flip-flops into a plurality of clusters CL, which are surrounded by thin lines in FIG. 34B. The clustering uses an evaluation function under which the sum of "the input capacitance of the flip-flops" forming each cluster and "the wire capacitance between flip-flops estimated according to the routing algorithm" is uniform across the clusters and the total capacitance of all the clusters is minimized. Then, for the purpose of reducing the delay time and the skew of the clock, a buffer cell AGO1, which is a cell that drives its load without changing the logic, is inserted at a position, for example the center or the center of gravity of the cluster CL, where the load capacitance (including the wire capacitance) of each flip-flop FF belonging to the cluster CL becomes uniform.
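The evaluation function described above may be sketched as follows. This is only an illustration under stated assumptions: the wire capacitance is estimated from the half-perimeter of the bounding box of each cluster rather than from an actual routing algorithm, the two criteria are combined as a weighted sum, and the names `FlipFlop`, `WIRE_CAP_PER_UNIT`, `cluster_load`, `evaluate` and `centroid` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlipFlop:
    name: str
    x: float
    y: float
    input_cap: float  # input capacitance, arbitrary units


WIRE_CAP_PER_UNIT = 0.1  # assumed wire capacitance per unit length


def wire_cap(cluster):
    """Assumed estimate: half-perimeter of the bounding box times unit capacitance."""
    xs = [ff.x for ff in cluster]
    ys = [ff.y for ff in cluster]
    return WIRE_CAP_PER_UNIT * ((max(xs) - min(xs)) + (max(ys) - min(ys)))


def cluster_load(cluster):
    """Input capacitance of the flip-flops plus estimated wire capacitance."""
    return sum(ff.input_cap for ff in cluster) + wire_cap(cluster)


def evaluate(clusters, balance_weight=1.0):
    """Lower is better: penalize non-uniform loads and large total capacitance."""
    loads = [cluster_load(c) for c in clusters]
    return balance_weight * (max(loads) - min(loads)) + sum(loads)


def centroid(cluster):
    """Center of gravity of a cluster, one candidate position for the buffer cell."""
    n = len(cluster)
    return (sum(ff.x for ff in cluster) / n, sum(ff.y for ff in cluster) / n)
```

Under this sketch, a clustering that groups neighboring flip-flops scores lower than one that pairs distant flip-flops, because the estimated wire capacitance of each cluster is smaller.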
Clustering is carried out, for example, as follows. Initial clusters are formed by repeatedly dividing into halves the set of flip-flops to which the clock signal is to be supplied, in the same way as in conventional clustering. Next, two arbitrary clusters are selected from the initial clusters, one arbitrary cell is selected from each of the selected clusters, and the two cells are interchanged. In the case that the clusters resulting from the interchange satisfy the evaluation function, under which the sum of "the input capacitance of the flip-flops" forming each cluster and "the wire capacitance between flip-flops estimated according to the routing algorithm" is uniform and the total capacitance of all the clusters is minimized, the interchange is kept, while in the case that the evaluation function is not satisfied, the cells are returned to their original clusters. The desired clusters can be obtained by repeating this process a sufficient number of times, using simulated annealing or the like in order to avoid the influence of the initial clusters, that is, a local minimum solution. The placement result after clustering is shown in FIG. 34B.
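The interchange procedure described above may be sketched as follows, under several assumptions that are not taken from this description: cells are represented as hypothetical `(x, y, input_capacitance)` tuples, the wire capacitance is estimated from the bounding box of each cluster, the initial clusters are supplied by the caller instead of being produced by bisection, and a standard simulated annealing acceptance rule with a geometric cooling schedule is used.

```python
import math
import random


def load(cluster, wire_cap_per_unit=0.1):
    """Input capacitance plus an assumed bounding-box wire capacitance estimate."""
    xs = [cell[0] for cell in cluster]
    ys = [cell[1] for cell in cluster]
    wire = wire_cap_per_unit * ((max(xs) - min(xs)) + (max(ys) - min(ys)))
    return sum(cell[2] for cell in cluster) + wire


def cost(clusters):
    """Lower is better: load spread between clusters plus total load."""
    loads = [load(c) for c in clusters]
    return (max(loads) - min(loads)) + sum(loads)


def anneal(clusters, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Swap one cell between two random clusters per step; return the best result seen."""
    rng = random.Random(seed)
    current = [list(c) for c in clusters]  # work on a copy of the input
    current_cost = cost(current)
    best, best_cost = [list(c) for c in current], current_cost
    for step in range(steps):
        temperature = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(len(current)), 2)
        a = rng.randrange(len(current[i]))
        b = rng.randrange(len(current[j]))
        current[i][a], current[j][b] = current[j][b], current[i][a]
        new_cost = cost(current)
        delta = new_cost - current_cost
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_cost = new_cost  # keep the interchange
            if new_cost < best_cost:
                best, best_cost = [list(c) for c in current], new_cost
        else:
            # Return the cells to their original clusters.
            current[i][a], current[j][b] = current[j][b], current[i][a]
    return best, best_cost
```

Occasionally accepting an interchange that worsens the cost is what allows the search to leave a local minimum determined by the initial clusters. The sketch requires at least two non-empty clusters.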
Next, clustering and buffer insertion processing are carried out repeatedly in the same way as above to the inserted buffer cells. The placement result of carrying out clustering to the inserted buffer cells AGO1 is shown in FIG. 34C. The reference symbol AGO2 denotes a buffer cell inserted at the time of clustering of the buffer cell AGO1.
A hierarchical tree referred to as a clock tree is generated according to the above processes. A clock circuit obtained by applying the "clock tree system" layout method of the semiconductor integrated circuit device to the clock circuit in FIG. 33A is shown in FIG. 35.
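The repeated clustering and buffer insertion that produces the hierarchical tree may be sketched as follows. The sketch is a simplification under stated assumptions: the capacitance-based clustering is replaced by grouping position-sorted cells into groups of at most `fanout` cells, each buffer cell is placed at the center of gravity of its group, and the function name `build_clock_tree` is hypothetical.

```python
def build_clock_tree(sinks, fanout=2):
    """Return the buffer positions of each tree level, from the leaves toward the root.

    sinks: list of (x, y) positions of the flip-flops.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = []
    current = sorted(sinks)
    while len(current) > 1:
        # Assumed stand-in for clustering: group neighboring cells in sorted order.
        groups = [current[i:i + fanout] for i in range(0, len(current), fanout)]
        # Insert one buffer cell at the center of gravity of each group.
        buffers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            for g in groups
        ]
        levels.append(buffers)
        current = sorted(buffers)  # the inserted buffers are clustered next
    return levels
```

For four flip-flops and a fanout of two, the sketch yields two buffer levels, corresponding to the buffer cells AGO1 and AGO2 in FIG. 34C.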
Then after routing is performed to the generated clock tree so that the wire length for each cluster becomes uniform, the entire routing except for the clock tree is completed according to the net list.
By performing the clustering processing and the routing processing described above the skew of the clock signals from the clock source CS to each of the flip-flops FF is reduced.
However, in the case that the "clock tree system" layout method of the semiconductor integrated circuit device according to the prior art is applied to the gated clock circuit, the skew from the clock source to the respective gated cells can be reduced, but the skew from the clock source via the gated cells to the respective flip-flops cannot be reduced.
Therefore, as a method for reducing the skew of the clock signal of the gated clock circuit, a method disclosed in, for example, the Japanese unexamined patent publication H10 (1998)-308450 or the Japanese unexamined patent publication H11 (1999)-119853 has been proposed. A circuit obtained by applying the method of the Japanese unexamined patent publication H10 (1998)-308450 to the gated clock circuit of FIG. 36 is shown in FIG. 37, and the result of the clock tree insertion is shown in FIG. 38. The reference symbols En1 and En2 denote enabling signals which are applied to the gated cells GC in FIGS. 36, 37 and 38. The reference symbol IC denotes an inverter cell.
In the method of the Japanese unexamined patent publication H10 (1998)-308450, a placement region is designated for each circuit GX (hereinafter referred to as a "gated circuit") comprising a gated cell GC of the gated clock circuit and the flip-flops FF connected to that gated cell GC, so that the gated cell GC and the flip-flops FF forming the gated circuit are placed together in one place. Next, each gated circuit GX is divided into a plurality of clusters by clustering such that the load capacitances of all the clusters become uniform, in the same way as the clustering in the conventional "clock tree system" layout method of the semiconductor integrated circuit device. Then the same number of gated cells GC as the number of clusters obtained through the division are inserted, one at the geometric center of each cluster. Each inserted gated cell GC is connected to the same enabling signal as that to which the gated cell GC was connected before the gated circuit was divided. At this stage, the delay value from the gated cell GC to each flip-flop FF becomes uniform. Next, for the part ranging from the clock source CS to each gated cell GC, routing is carried out by generating a hierarchical tree in the same way as in the "clock tree system" layout method of the semiconductor integrated circuit device according to the above described prior art.
Through the above processing the skew of the clock signal from the clock source CS via the gated cell GC to each flip-flop FF is reduced.
After implementing the method of the Japanese unexamined patent publication H10 (1998)-308450 the circuit has the following characteristics. That is to say, a buffer cell (an inverter cell IC) inserted into the same stage of the clock tree and the gated cell GC have the same logic and the same drive ability respectively. The gated cell GC is inserted only between the buffer cell (inverter cell IC) at the final stage of the clock tree and the flip-flop FF.
A circuit obtained by applying the method of the Japanese unexamined patent publication H11 (1999)-119853 to the gated clock circuit of FIG. 36 is shown in FIG. 39, and the result of the clock tree insertion is shown in FIG. 40.
According to the method of the Japanese unexamined patent publication H11 (1999)-119853, each gated circuit is first divided into a plurality of clusters by clustering such that the load capacitance becomes uniform between the clusters and the sum of the load capacitances of the clusters becomes minimum, in the same way as the clustering in the "clock tree system" layout method of the semiconductor integrated circuit device according to the prior art. The same number of enabling buffer cells BC1 as the number of clusters obtained through the division are then inserted, one at the geometric center of each cluster. The enabling buffer cells BC1 are dedicated cells used in the method of the Japanese unexamined patent publication H11 (1999)-119853, and have the same logic structure as that of the gated cell in the gated circuit before division.
At this time, each inserted enabling buffer cell BC1 is connected to the same enabling signal as that to which the gated cell of the gated circuit was connected before division.
At this stage, the delay time from each enabling buffer cell BC1 to the flip-flops FF becomes uniform. Routing is therefore carried out from the clock source CS to each enabling buffer cell BC1 by producing a hierarchical tree in the same way as in the "clock tree system" layout method of the semiconductor integrated circuit device according to the above described prior art. By doing this, the skew of the clock signal from the clock source CS via the gated cell GC to each flip-flop FF can be reduced. The reference symbol BC2 denotes an enabling buffer cell inserted at the time when the tree is generated.
At the time of the clock tree generation, according to the method of the Japanese unexamined patent publication H11 (1999)-119853, an enabling buffer cell is inserted instead of a buffer cell, and the input terminals of the enabling buffer cell other than the one connected to the clock signal are fixed to the power source net.
Next, the connection of the input terminal of the enabling buffer cell BC2 inserted into a cluster comprising only an enabling buffer cell BC1, which has the same input, and the connection of the input terminals of the enabling buffer cell BC1 belonging to the cluster, are switched.
By carrying out the above described processing, the skew of the clock signal from the clock source via the gated cell to each flip-flop is reduced and the power consumed by the gated circuit is reduced.
After implementing the method of the Japanese unexamined patent publication H11 (1999)-119853 the circuit has the following characteristics. That is to say, the enabling buffer cells that are inserted into the same stage of the clock tree are, respectively, the cells with the same logic and the same drive ability. There is a process for interchanging the connection parts of the enabling buffer cells and the enabling signals, wherein a dedicated cell, which is an enabling buffer cell, is employed.
According to the conventional layout methods of the gated clock circuit shown in the Japanese unexamined patent publication H10 (1998)-308450 and the Japanese unexamined patent publication H11 (1999)-119853, gated cells with the same logic and the same drive ability are inserted. In order to reduce the skew, clustering therefore has to be carried out in accordance with the cluster of which the load capacitance driven by the gated cell becomes the minimum, that is to say, in accordance with the load capacitance value of the gated circuit.
Accordingly, the division number of the gated circuits increases because the number of flip-flops differs among the gated circuits. Since the number of stages of the clock tree for the gated clock circuit, the number of inserted buffers and the number of gated cells all increase, the delay value or the skew from the clock source to the flip-flop increases.
In the method of the Japanese unexamined patent publication H10 (1998)-308450, a placement region is designated for each gated circuit, which causes an area increase and timing limit violations of the logic circuit.
In addition, since the gated cell is inserted only between the buffer cell of the last stage of the clock tree and the flip-flop, the buffer cells and the gated cell on the path from the clock source to the flip-flop always operate, even when an enabling signal for stopping the gated cell is applied, and there occurs the problem that power is always consumed.
On the other hand, in the method of the Japanese unexamined patent publication H11 (1999)-119853, an enabling buffer cell, which is a dedicated cell, is necessary. The enabling buffer cell has a circuit structure where at least a gated cell and a buffer cell are combined, therefore, power consumption increases compared to a method where a buffer cell is used for the clock tree.
According to the method of the Japanese unexamined patent publication H11 (1999)-119853, after division of each gated circuit, clusters are formed for the clock net from the clock source to each gated cell after division, in the same way as in the above described conventional "clock tree system" layout method of the semiconductor integrated circuit device. In a conventional clustering method, the gated cells belonging to the clock net are all treated equally, irrespective of the kind of enabling signal, and priority is given to clusters in which the sum of the "input capacitance of the flip-flop" forming each cluster and the "wire capacitance between flip-flops estimated according to the routing algorithm" becomes uniform. Therefore, it is difficult to generate clusters comprising only gated cells having the same input. Such clusters are difficult to generate unless the placement range is designated for the flip-flops belonging to each gated circuit and unless gated cells with the same input are arranged close to one another after the division of the gated circuit.
However, when a placement region is designated for each gated circuit, the problem occurs that the area increases and timing limit violations of the logic circuit occur. In addition, when a cell is changed to a cell with a larger drive ability so as to satisfy the timing between flip-flops, the area and the power consumption increase further.
Conventional clustering has the purpose of making the delay time from the clock source to the flip-flop minimum and the skew uniform, and uses an evaluation function under which the capacitance of each cluster is uniform and the sum of the capacitances of all the clusters becomes the minimum. Therefore, the problem occurs that clusters do not necessarily comprise only the gated cells with the same input when the placement region of the flip-flop belonging to the gated circuit is designated.
Reviewing the above described problems, it is the purpose of the present invention to provide a layout method of a semiconductor integrated circuit device which can reduce the skew of a clock signal between flip-flops via a gated cell from a clock source and can control the power consumption of the clock signal part.
A layout method of a semiconductor integrated circuit device according to the present invention is a method for designing a layout of a semiconductor integrated circuit device including one gated circuit comprising a group of elements having gated cells connected to clock sources and being connected to clock sources via gated cells, as well as a gated clock circuit having another group of elements connected directly to clock sources. Therefore, the method includes the following processes.
In a net list modification process, since the other group of elements are treated as other gated circuits, the net list of the gated clock circuit is modified to a net structure to which cells for correcting the number of stages are added between the clock sources and the other group of elements.
In the process for generating gated circuit division information, the division number for each gated circuit is determined so that the delay value becomes uniform and the drive ability of the cell for circuit division is allocated for each gated circuit by selecting the drive ability of the cell for circuit division in accordance with the total load capacitance of each gated circuit based on the result of placement according to the net list after modification and/or rough routing.
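The selection of a division number and a drive ability for each gated circuit may be sketched as follows. The cell library, the maximum load values and the selection rule are all assumptions made for illustration: the sketch assumes that the delay is acceptably uniform whenever each cell drives no more than a hypothetical maximum load, takes the smallest division number achievable with the strongest cell, and then picks the weakest cell that can still drive one cluster.

```python
import math

# Hypothetical cell library: drive ability name -> maximum load it can drive.
DRIVE_LIBRARY = {"X1": 10.0, "X2": 20.0, "X4": 40.0}


def plan_division(total_load, library=DRIVE_LIBRARY):
    """Return (division_number, drive_name) for one gated circuit."""
    strongest = max(library.values())
    # Fewest clusters such that the strongest cell can drive each one.
    divisions = max(1, math.ceil(total_load / strongest))
    per_cluster = total_load / divisions
    # Weakest cell that can still drive one cluster.
    candidates = [name for name, cap in library.items() if cap >= per_cluster]
    drive = min(candidates, key=lambda name: library[name])
    return divisions, drive


# Gated circuits with different total load capacitances (hypothetical values).
plans = {name: plan_division(load) for name, load in
         {"GX1": 8.0, "GX2": 35.0, "GX3": 90.0}.items()}
```

By contrast, if every gated circuit had to use the same drive ability, the circuit with total load 90.0 in this example would need nine clusters with the weakest cell instead of three with the strongest.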
In the gated circuit division process, each gated circuit is divided to form a plurality of clusters by carrying out clustering based on the information generated at the process for generating gated circuit division information and the cell for a circuit division, which has the drive ability allocated in the process for generating gated circuit division information, is inserted, respectively, into the position where the load capacitance of each cluster becomes uniform.
In the gated cell division process, the same number of gated cells for circuit division as that of clusters are allocated for each gated circuit and the drive ability of the gated cells for circuit division is selected so that the delay value becomes uniform in accordance with an input capacitance of the cell for circuit division, and then each gated cell for circuit division is inserted in the vicinity of each cell for circuit division.
In the gated cell front stage CTS process, a hierarchical tree is generated between the clock source and each gated cell for circuit division in the clock tree system.
According to this method, the skew of the gated clock circuit can be reduced and the division number for the gated circuit can be made small in relation to a conventional layout method of a semiconductor integrated circuit device for the gated clock circuit by positively using cells with the same logic and different drive abilities, even when the number of elements connected to the gated cells, such as flip-flops, is not uniform. That is to say, since the number of stages in the clock tree and the number of buffers between the clock source and the gated cell can be made small, the delay value or the skew from the clock source to the element can be reduced. It is not particularly necessary to designate the placement region for every element connected to each gated cell; therefore, problems such as area increase or timing limit violations of the logic circuit rarely occur.
In the above described method of the invention, in the case that the inputs of the gated cells for circuit division belonging to the same cluster are all formed of the same clock net, a process for optimizing the gated cell position may be included after the gated cell front stage CTS process and/or at the time of implementation of the gated cell front stage CTS process. This process moves the gated cells for circuit division belonging to the same cluster to the front stage of the cell for circuit division positioned on the clock source side of those gated cells.
According to this method, cells or wires which can stop the operation at the time when a stop signal is inputted to the gated cell for circuit division increase by moving the gated cell for circuit division to the clock source side and, therefore, the power consumption of the gated clock circuit can be reduced.
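The gated cell position optimization may be sketched as follows. The data representation is hypothetical: a cluster is a dictionary holding the name of its driving cell and a list of `(gate_name, enabling_signal)` pairs, and the sketch assumes that gated cells sharing the same input can be represented by a single gate placed in front of the driving cell.

```python
def optimize_gate_position(cluster):
    """Return the stage order of a cluster as seen from the clock source side.

    cluster: {"buffer": name, "gates": [(gate_name, enabling_signal), ...]}
    """
    enables = {enable for _, enable in cluster["gates"]}
    if len(enables) == 1:
        # All gated cells share the same input: move the gate to the front stage,
        # so that the driving cell also stops when the clock is stopped.
        (enable,) = enables
        return {"front": ("gate", enable), "back": [("buffer", cluster["buffer"])]}
    # Mixed inputs: the gates stay behind the driving cell.
    return {
        "front": ("buffer", cluster["buffer"]),
        "back": [("gate", enable) for _, enable in cluster["gates"]],
    }
```

When the gate is moved to the front stage, the driving cell and its wiring no longer toggle while the enabling signal stops the clock, which is the effect described above.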
In the above described method of the invention or in a method of the invention further having a gated cell position optimization process, it is possible to allocate a cell corresponding to an inverter cell as the cell for circuit division at the time of the process for generating gated circuit division information, by replacing the gated cell of the gated clock circuit and the cell for correcting the number of stages inserted at the net list modification process, respectively, with an inverted cell for inverting logic.
According to this method, since the number of gate stages can be reduced, the delay value and the power consumption from the clock source to the element such as a flip-flop can be reduced.
In the above described method of the invention or in a method of the invention further having a process for optimizing the gated cell position, an upper limit may be set for the wire capacitance of each cluster as the division condition of each gated circuit at the time of the process for generating gated circuit division information.
According to this method, a skew by routing can be controlled by giving the upper limit to the wire capacitance of the cluster.
In the above described method of the invention or in a method of the invention further having a process for optimizing the gated cell position, an evaluation function may be used at the time of the gated circuit division process and/or clustering in the gated cell front stage CTS process, where the load capacitance difference between each cluster and the sum of the load capacitance of each cluster satisfy the preset capacitance value, the kinds of attributes of different cells at least forming each cluster are minimum in number and a priority is placed on a cluster of which the number of cells for each attribute forming each cluster is large.
According to this method, a cluster formed only by the cells with the same attributes can be effectively generated. For example, in the case of application to the gated clock circuit, since a cluster formed only by the gated cell for circuit division with the same input can be effectively generated, cells or wires which can stop the operation at the time when a stop signal is inputted to the gated cell for circuit division increase.
Therefore, the power consumption in the gated clock circuit can be reduced. In addition, it is not particularly necessary to designate the placement region for every element such as a flip-flop connected to each gated cell for circuit division and, therefore, problems such as area increase or timing limit violations of the logic circuit rarely occur.
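An evaluation function of this kind may be sketched as follows. The sketch is an illustration under assumptions: cells are hypothetical `(load_capacitance, attribute)` pairs, the preset capacitance values are passed as `max_diff` and `max_total`, and the two attribute criteria are compared lexicographically (fewer kinds of attributes first, then a larger number of cells of the dominant attribute).

```python
from collections import Counter


def evaluate(clusters, max_diff, max_total):
    """Return a score tuple (lower is better), or None if a capacitance limit is violated.

    clusters: list of non-empty lists of (load_capacitance, attribute) cells.
    """
    loads = [sum(cap for cap, _ in cluster) for cluster in clusters]
    if max(loads) - min(loads) > max_diff or sum(loads) > max_total:
        return None  # the preset capacitance values are not satisfied
    kinds = 0
    majority = 0
    for cluster in clusters:
        counts = Counter(attr for _, attr in cluster)
        kinds += len(counts)              # kinds of attributes in this cluster
        majority += max(counts.values())  # cells of the dominant attribute
    return (kinds, -majority)


# Hypothetical example: the attribute is the enabling signal of each gated cell.
pure = [[(1.0, "En1"), (1.0, "En1")], [(1.0, "En2"), (1.0, "En2")]]
mixed = [[(1.0, "En1"), (1.0, "En2")], [(1.0, "En1"), (1.0, "En2")]]
```

With equal capacitances, the clustering whose clusters each contain a single enabling signal scores better than the mixed one, so clusters formed only of gated cells with the same input are preferred.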
In the case where the above described evaluation function is applied to a circuit design method referred to as a useful skew, which makes a circuit design at high speed possible by allowing the element such as a flip-flop to have a plurality of delay times intentionally, a cluster formed only of elements such as flip-flops to which the same delay time is allocated can be generated effectively so as to make it possible to facilitate the adjustment of the clock skew.
It is possible to adjust the drive ability of the cell for circuit division inserted into each cluster in the case that an evaluation function is used where the load capacitance difference between each cluster and the sum of the load capacitance of clusters satisfy the preset capacitance value, the kinds of attributes of different cells at least forming a cluster are minimum in number and a priority is placed on a cluster of which the number of cells for each attribute forming a cluster at the time of clustering is large.
According to this method, the number of cells belonging to the cluster, the increase of the skew due to the placement spread of the cells, or the possibility where cells with different attributes may mix, can be reduced compared to the case where the drive ability of the cell for circuit division inserted into the cluster is not adjusted.
In the above described method of the invention or in a method of the invention further having a process for optimizing the gated cell position, the net structure may be modified, with respect to the gated cell for circuit division which forms the cell for forming the clock tree inserted to each cluster after clustering and each cluster, by using a set of cells which is formed of one or more first cells having the same logic circuit as that of gated cells for circuit division forming each cluster and a second cell for driving without modifying the logic and to which one of the input terminals of the first cells and the input terminal of the second cell are connected.
According to this method, particularly in the case that cells with different attributes are mixed in the cluster and the majority of the cells forming the cluster have one attribute, it becomes possible to move the gated cell for circuit division to the side of the clock source so that the power consumption of the gated clock circuit can be further reduced, even in the case that the gated cell for circuit division cannot be moved to the clock source side according to the previous method of the invention.
In the above described method of the invention or in a method of the invention further having a process for optimizing the gated cell position, the net structure may be modified, with respect to the cell for forming the clock tree inserted to each cluster after clustering and the gated cell for circuit division which comprises each cluster, by using a circuit which is formed of one or more first cells having the same logic circuit as the gated cells for circuit division forming each cluster and a second cell for driving without modifying the logic and to which one of the input terminals of the first cells and the input terminal of the second cell are connected.
According to this method, since the positional relationship between the one or more first cells having the same logic circuit as the gated cell for circuit division and the second cell for driving without modifying the logic is arbitrary, a gated cell or a buffer cell can be inserted at a position which makes the skew the minimum for the cells of each attribute and, therefore, the skew can be made still smaller compared to the case where they are collected.
In the above described method of the invention or in a method of the invention further having a process for optimizing the gated cell position, a process for optimizing gated cell placement coordinates may be included, which adjusts the placement coordinates of the gated cell for circuit division after the gated cell division process.
According to this method, the skew from the gated cell to the element can be reduced by optimizing the placement coordinates of the gated cell for circuit division. In addition, the spread of the placement of the gated cell for circuit division can be made smaller according to the optimization of the placement coordinates of the gated cell for circuit division, therefore, the skew from the clock source to the gated cell can be reduced.
In the above described method of the invention or in a method of the invention further having a process for optimizing the gated cell position, in the case of the structure where cells for circuit division which drive without modifying the logic from the gated cell for circuit division or cells for forming the clock tree are connected in series, and/or in the case of the circuit structure where cells for circuit division which drive without modifying the logic and/or two or more cells for forming the clock tree are connected in series, the cells for circuit division which drive without modifying the logic and/or the cells for forming the clock tree may be replaced with feed-through cells or may be reduced in number within the range where the skew from the clock source to each element of each element group falls within a desired skew value.
According to this method, it becomes possible to eliminate a redundant logic circuit forming the clock tree and, therefore, the power consumption of the gated clock circuit can be further reduced.
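The removal of redundant series-connected cells within a desired skew value may be sketched as follows. The model is a deliberate simplification and an assumption: each element is reached through a chain of cells with fixed hypothetical delays, removing a cell (or replacing it with a feed-through cell) simply removes its delay, and cells are removed greedily one at a time while the skew stays within `max_skew`.

```python
def prune_series_cells(chains, max_skew):
    """Remove series cells while the skew stays within max_skew.

    chains: element name -> list of delays of the cells connected in series.
    At least one cell is always kept in each chain.
    """
    chains = {name: list(delays) for name, delays in chains.items()}

    def skew(candidate):
        totals = [sum(delays) for delays in candidate.values()]
        return max(totals) - min(totals)

    changed = True
    while changed:
        changed = False
        # Try the slowest chains first.
        for name in sorted(chains, key=lambda n: -sum(chains[n])):
            if len(chains[name]) < 2:
                continue
            trial = dict(chains)
            trial[name] = chains[name][:-1]  # drop the last cell of the chain
            if skew(trial) <= max_skew:
                chains = trial
                changed = True
                break
    return chains
```

With a tight skew limit nothing is removed, because removing a cell from only one chain would unbalance the tree; with a looser limit the chains are shortened one after another.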
In the above described method of the invention or in a method of the invention further having a process for optimizing the gated cell position, it is possible to combine all or a part of the above described optional terms. In this case, the effects of each of the combined optional terms are achieved.