1. Field of the Invention
The present invention relates to logic synthesis and, more particularly, to a method and metric for low power logic synthesis using standard cells.
2. Description of the Related Art
The number of standard cells present on current-generation CMOS chips runs well into the millions. As CMOS design rules continue to shrink, the number of on-chip standard cells will continue to increase. At the present time, the largest CMOS chips contain over 10 million cells. With next-generation chips, it is a virtual certainty that the number of on-chip standard cells will exceed 100 million.
Even though CMOS device scaling and VCC scaling are decreasing the average power dissipation per standard cell, the sharp increase in the total number of on-chip standard cells is causing total on-chip power dissipation to drastically increase. This large power increase is rapidly becoming the limiting factor in determining the maximum amount of logic functionality that can be put onto a single CMOS chip. Thus, in order to build more complex CMOS chips, the total power dissipation, due to all of the on-chip standard cells, must be reduced to an absolute minimum.
Most of today's complex CMOS chips are designed by writing high level code in a hardware description language, such as Verilog™ or VHDL. This high level code is then parsed and broken down into logic gates (standard cells) by a logic synthesis tool, such as the Design Compiler™ available from Synopsys® Incorporated.
The standard cell netlists produced by all synthesis tools must meet three strict criteria: (1) the netlists must produce the desired logic behavior, (2) the netlists must meet all of the system timing requirements, and (3) the netlists must contain minimal standard cell area. The system timing requirements include set up and hold time for flipflops and latches, and chip input-to-output delays.
It is important to note that the above synthesis criteria attempt to minimize total standard cell area, not total standard cell power dissipation. In other words, most logic synthesis tools try to minimize total standard cell area while simultaneously providing adequate positive timing slack for all signal paths. Thus, when the area cost is low, logic synthesis tools have a built-in propensity to choose faster cells, increasing power dissipation.
For example, many standard cell libraries contain low drive strength cells (such as X1/X2 strength inverters) that occupy the same chip area. Thus, if two cells occupy the same chip area and one of them offers less delay, the logic synthesis tool will always choose the faster cell, even though the smaller delay may not be needed. This results in higher power dissipation because faster cells contain larger transistors that dissipate more power.
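The selection behavior described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions: the cell names, areas, delays, and capacitances are hypothetical values, and the two selection functions are simplified stand-ins, not the algorithm of any actual synthesis tool or of the present invention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    name: str
    area: float       # hypothetical area units
    delay_ns: float   # hypothetical propagation delay
    c_in_ff: float    # hypothetical input capacitance, in femtofarads


def pick_area_first(cells, max_delay_ns):
    """Toy area-driven choice: smallest area, ties broken by lower delay."""
    candidates = [c for c in cells if c.delay_ns <= max_delay_ns]
    return min(candidates, key=lambda c: (c.area, c.delay_ns))


def pick_capacitance_first(cells, max_delay_ns):
    """Toy power-aware choice: lowest input capacitance that meets timing."""
    candidates = [c for c in cells if c.delay_ns <= max_delay_ns]
    return min(candidates, key=lambda c: (c.c_in_ff, c.area))


# Two hypothetical inverters with equal area but different drive strength.
LIBRARY = [
    Cell("INV_X1", area=1.0, delay_ns=0.10, c_in_ff=1.0),
    Cell("INV_X2", area=1.0, delay_ns=0.06, c_in_ff=2.0),
]

# Timing budget that both cells satisfy, so the extra speed is not needed.
area_choice = pick_area_first(LIBRARY, max_delay_ns=0.20)
cap_choice = pick_capacitance_first(LIBRARY, max_delay_ns=0.20)
```

In this sketch the area-driven choice lands on the faster, higher-capacitance cell even though the slower cell also meets the timing budget, which mirrors the propensity described above.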
In light of the limitations described above, there is a definite need for a synthesis metric and a synthesis method that minimize total standard cell power dissipation, instead of total standard cell area. Furthermore, although this synthesis metric and synthesis method need not produce absolute minimum standard cell area, they must nevertheless produce reasonably small standard cell area.
Except for DC leakage current, standard CMOS logic gates do not dissipate any DC power. Hence the average power dissipation of standard CMOS logic gates is due to their AC switching activity only. Therefore, in order to minimize standard cell power dissipation, the average power dissipated by a standard CMOS logic gate must be quantified, as described in detail below.
FIGS. 1A and 1B show equivalent logic/circuit diagrams that illustrate a prior-art logic gate, inverter 100. As shown in FIGS. 1A and 1B, inverter 100 has a single logic gate input and a single logic gate output. The logic gate input has an associated logic gate input capacitance CIN, and the logic gate output has an associated logic gate output capacitance COUT.
Referring to FIGS. 1A and 1B, the logic gate input capacitance CIN is mostly due to the gate oxide capacitances of the transistors connected to the logic gate input(s). In other words, since logic gates have physically small dimensions, their internal parasitic capacitances are very small in comparison to their internal transistor gate oxide capacitances.
The logic gate output capacitance COUT only includes the internal parasitic capacitances present at the logic gate output node. In other words, the logic gate output capacitance COUT does not include the fanout capacitance that is being driven. (The fanout capacitance includes the capacitances associated with the logic gate inputs that are being driven, plus the capacitances of the interconnect wires that are being driven).
As shown in FIGS. 1A and 1B, for most logic gates, CIN is much larger than COUT. Hence, for power computation purposes, the logic gate output capacitance COUT can usually be ignored. Thus the AC power dissipation of inverter 100, which is an example of a simple CMOS logic gate, can be calculated from equation EQ. 1 as follows:

$$\text{Average AC power} = C_{IN} \cdot V_{CC}^{2} \cdot F_{AVG} \qquad \text{(EQ. 1)}$$

where CIN is the logic gate input capacitance of inverter 100, VCC is the power supply voltage, and FAVG is the average switching frequency.
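As a minimal sketch, EQ. 1 can be evaluated directly in Python. The function name and the numeric values below are assumptions chosen only for illustration; they are not taken from the figures.

```python
def average_ac_power(c_in_farads, vcc_volts, f_avg_hz):
    """Evaluate EQ. 1: average AC power = CIN * VCC^2 * FAVG (COUT ignored)."""
    return c_in_farads * vcc_volts ** 2 * f_avg_hz


# Hypothetical operating point: 2 fF input capacitance, 1.0 V supply,
# 100 MHz average switching frequency.
power_watts = average_ac_power(c_in_farads=2e-15, vcc_volts=1.0, f_avg_hz=100e6)

# Same gate and frequency at half the supply voltage.
power_half_vcc = average_ac_power(c_in_farads=2e-15, vcc_volts=0.5, f_avg_hz=100e6)
```

Because VCC enters EQ. 1 as a squared term, halving the supply voltage in this sketch reduces the computed power to one quarter, whereas halving CIN or FAVG would only halve it.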
Referring to EQ. 1, all logic synthesis tools are cognizant of the CIN value at each input of each standard cell. Furthermore, these synthesis tools are also cognizant of the global VCC value. Nevertheless, these synthesis tools are still unable to accurately calculate and minimize total standard cell power dissipation because: (1) the tools do not account for power dissipation due to the parasitic gate capacitances of internal transistors that are not directly connected to the logic gate inputs, and (2) the tools are unaware of the value of FAVG, which depends upon input vectors (waveforms) that are unspecified during logic synthesis. (As described below, FAVG actually has two components, one for clock paths and one for data paths.)
For example, all flipflops contain internal transistors whose gate capacitance values are ignored during logic synthesis. Furthermore, all flipflops contain two paths: a clock path and a data path. Although the clock frequency is known during logic synthesis, the total internal clock capacitance is unknown (ignored), and the total internal data path capacitance is also unknown (ignored). In addition, the average data path frequency is unknown (ignored) because it depends upon input vectors (waveforms) that are unspecified during logic synthesis.
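One way to picture the two-path situation is to apply the form of EQ. 1 separately to the clock path and the data path and add the results. The following Python sketch does this under explicit assumptions: the split into two lumped capacitances, the function name, and all numeric values are hypothetical and are not taken from the source.

```python
def flipflop_ac_power(c_clk_farads, f_clk_hz, c_data_farads, f_data_hz, vcc_volts):
    """Sum an EQ. 1 style term for the clock path and one for the data path.

    Assumption: each path is modeled as a single lumped internal capacitance
    switching at its own average frequency.
    """
    clock_power = c_clk_farads * vcc_volts ** 2 * f_clk_hz
    data_power = c_data_farads * vcc_volts ** 2 * f_data_hz
    return clock_power + data_power


# Hypothetical values: the clock frequency is known at synthesis time, while
# the data-path frequency would depend on the unspecified input vectors.
ff_power_watts = flipflop_ac_power(
    c_clk_farads=1e-15, f_clk_hz=1e9,
    c_data_farads=2e-15, f_data_hz=1e8,
    vcc_volts=1.0,
)
```

The sketch makes the difficulty concrete: every argument except the clock frequency and VCC is a quantity that, as stated above, is unknown or ignored during logic synthesis.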
Another complication stems from the fact that the power dissipation of a CMOS logic gate depends upon its input capacitance, which can be different for each input pin on a given logic gate.
FIGS. 2A and 2B show equivalent logic/circuit diagrams that illustrate a prior-art complex logic gate 200. Complex logic gate 200 is an and-or-invert (AOI) logic gate.
As shown in FIG. 2B, the input capacitance on pin A is greater than the input capacitance on pin D because pin A has an equivalent device width of seven (4+3), whereas pin D only has an equivalent device width of five (4+1). This difference in equivalent device width arises because differences in device stacking require differences in device sizing. Thus, even if the same voltage waveform were applied to both pins, the power dissipation on pin A would be greater than the power dissipation on pin D, because pin A has a higher capacitance than pin D.
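The pin-to-pin difference can be estimated with a short Python sketch. It assumes that input capacitance scales in direct proportion to equivalent device width and that both pins see the same VCC and the same average switching frequency, so that the power ratio in EQ. 1 reduces to the width ratio. The function name is hypothetical.

```python
# Equivalent device widths for pins A and D as described for FIG. 2B.
PIN_WIDTHS = {"A": 4 + 3, "D": 4 + 1}


def relative_pin_power(widths, reference_pin):
    """Return each pin's power relative to the reference pin.

    Assumption: capacitance is proportional to equivalent device width,
    and VCC and FAVG are identical on every pin.
    """
    reference_width = widths[reference_pin]
    return {pin: width / reference_width for pin, width in widths.items()}


ratios = relative_pin_power(PIN_WIDTHS, reference_pin="D")
```

Under these assumptions, pin A dissipates 7/5 of the power of pin D, that is, 40 percent more for an identical input waveform.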
In summary, present-day logic synthesis tools do not do a good job of minimizing total standard cell power because the tools are not cognizant of the key parameters that affect standard cell power, and the tools also lack a suitable metric that would allow them to minimize total standard cell power.
When it comes to minimizing total standard cell area, most logic synthesis tools are highly efficient. In other words, when total standard cell area is chosen as the metric to be minimized, these tools do an excellent job. However, with regard to minimizing total standard cell power, most synthesis tools only allow the user to indirectly minimize power. This is usually done by employing techniques such as: (A) clock gating, (B) halting switching activity in logic blocks when the blocks are not being used, and (C) using power down signals or “sleep” signals to decrease switching activity in different modes of operation.
Furthermore, CMOS chip designers and CMOS process engineers have also employed a variety of physical design techniques to minimize power, including: (A) scaling the process design rules, (B) reducing the power supply voltage, (C) using more efficient (lower capacitance) routing, (D) reducing the transistor count, and (E) reducing the transistor sizes. Present day power reduction techniques also include: (F) increasing the number of available drive strengths, (G) reducing the flow-thru current (crowbar current), (H) reducing the average timing slack, (I) creating customized cell instances, (J) using high VT cells to reduce DC leakage current, and (K) using multiple VCC voltage levels.
Although the above physical techniques can be highly effective, they still do not allow the user to directly minimize total standard cell power dissipation during logic synthesis. Thus, as stated above, there is a definite need for a logic synthesis metric and a logic synthesis method that minimize total standard cell power dissipation, while simultaneously producing reasonably small standard cell area.