1. Field of the Invention
The present invention relates to the field of electronic design automation (EDA). More specifically, the present invention relates to techniques for reducing power consumption within integrated circuits that can be designed using a computer controlled EDA system.
2. Related Art
Electronic design automation (EDA) systems are a form of computer aided design (CAD) systems and are used for designing integrated circuit (IC) devices. The EDA system typically receives one or more high level behavioral descriptions of an IC device (e.g., in a hardware description language such as VHDL or Verilog) and translates this high level design language description into netlists of various levels of abstraction. At a higher level of abstraction, a generic netlist is typically produced that can be translated into a lower level technology-specific netlist based on a technology-specific library. A netlist describes the IC design and is composed of nodes (elements) and edges (connections between nodes), and can be represented using a directed cyclic graph structure having nodes which are connected to each other with signal lines. A single node can have multiple fan-ins and multiple fan-outs. The netlist is typically stored in computer readable media within the EDA system and processed and verified using many well known techniques. One result is a physical device layout in mask form which can be used to directly implement structures in silicon to realize the physical IC device.
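The graph view of a netlist described above can be sketched as a small data structure. This is only an illustrative sketch, not the representation used by any particular EDA system; the class name `Netlist`, the method `connect`, and all node names are hypothetical.

```python
from collections import defaultdict


class Netlist:
    """Toy netlist: nodes are circuit elements, edges are signal lines.

    Hypothetical structure for illustration only.
    """

    def __init__(self):
        self.fanout = defaultdict(list)  # driver node -> nodes it drives
        self.fanin = defaultdict(list)   # load node -> nodes driving it

    def connect(self, driver, load):
        """Add a directed edge (signal line) from driver to load."""
        self.fanout[driver].append(load)
        self.fanin[load].append(driver)


netlist = Netlist()
netlist.connect("in_a", "and1")
netlist.connect("in_b", "and1")
netlist.connect("and1", "reg1")
netlist.connect("and1", "out_y")
# Feedback from the register back into the gate makes the graph cyclic.
netlist.connect("reg1", "and1")

print(netlist.fanin["and1"])   # nodes driving and1 (multiple fan-ins)
print(netlist.fanout["and1"])  # nodes driven by and1 (multiple fan-outs)
```

Keeping both fan-in and fan-out maps lets a tool walk the graph in either direction, which is convenient when asking where a node's output is consumed.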
Often, during the many optimizations and refinements of the netlist design, the power consumed by the netlist design becomes an important consideration for an IC designer. The IC designers desire to reduce the power consumed by various netlist designs in order to satisfy frequently specified low power consumption constraints for their circuits. Low power consumption constraints can be relevant for a number of different applications. For example, the resulting IC device might be used in a portable device having limited battery life, or, the IC device might be integrated within a system in which heat dissipation is a critical factor, etc. The supply of IC devices for portable (e.g., battery powered) components is a large and growing market segment including hand-held communication and computing devices as well as portable computer systems. For a number of commercially important reasons, not the least of which is routine energy conservation, designers want to reduce the power consumed and dissipated by their IC devices.
One technique for power consumption reduction is called operand isolation, an example of which is shown in circuit 10 of FIG. 1A. One implementation of this technique is described by A. Correale, Jr., in a paper entitled, "Overview of the Power Minimization Techniques Employed in the IBM PowerPC 4xx Embedded Controllers," published in 1995 by the International Symposium on Low Power Design (ISLPD) at Dana Point, Calif. The concept within operand isolation is to isolate the input operand signals of a functional unit during those clock cycles when the output of the functional unit is not propagated through the netlist (e.g., it is not used by the netlist and does not alter the primary outputs of the IC device).
Circuit 10 of FIG. 1A includes four functional units 12, 14, 16 and 18 implemented in circuitry. The input operand signals originate from an operand bus 30. These circuits 12, 14, 16 and 18 consume power when their inputs transition, whether or not their outputs are used. Without operand isolation, the circuits 12, 14, 16 and 18 execute concurrently in each clock cycle, and a single output is selected among them by multiplexer 20 and propagated. Power is needlessly wasted because only one functional unit's output is propagated by multiplexer 20 per clock cycle.
However, with operand isolation as shown in FIG. 1A, each operand signal must pass through an operand latch circuit 40a, 40b, 40c and 40d which only allows passage when its corresponding functional unit's output is selected by the multiplexer 20. Operand signals only pass through circuit 40a when signal t1 is active (c1 is #t1); operand signals only pass through circuit 40b when signal t2 is active (c2 is #t2); operand signals only pass through circuit 40c when signal t3 is active (c3 is #t3); and operand signals only pass through circuit 40d when signal t4 is active (c4 is #t4). Signals t1 through t4 originate from the select inputs of multiplexer 20 which selects only one of the outputs from circuits 12, 14, 16 and 18 for any given clock cycle. Signals t1 through t4 are used by circuits 40a-40d to isolate the operands of three of the functional unit circuits for each clock cycle and allow only one functional unit circuit to operate. By isolating the operand inputs as described above, the functional unit circuits that produce unneeded results are disabled and do not needlessly consume power.
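The effect of the isolation circuits can be illustrated with a minimal behavioral sketch. It assumes that input bit transitions are an adequate stand-in for power consumed, that every functional unit sees the same operand value from the bus, and that an isolated unit's input simply holds its previous value. The function names, the operand values, and the select sequence are hypothetical and are not taken from FIG. 1A.

```python
def count_bit_flips(old, new):
    """Number of bit positions that differ between two operand values."""
    return bin(old ^ new).count("1")


def simulate(operands, selects, num_units=4, isolate=True):
    """Return the total input bit flips seen by all functional units.

    operands: operand value placed on the bus in each clock cycle
    selects:  index of the unit whose output the multiplexer selects
    isolate:  if True, only the selected unit receives the new operand
    """
    held = [0] * num_units  # value currently present at each unit's input
    flips = 0
    for operand, sel in zip(operands, selects):
        for unit in range(num_units):
            if isolate and unit != sel:
                continue  # isolation circuit blocks the operand; input holds
            flips += count_bit_flips(held[unit], operand)
            held[unit] = operand
    return flips


operands = [0b1010, 0b0101, 0b1111, 0b0000]  # hypothetical bus values
selects = [0, 1, 2, 3]                       # hypothetical select sequence

without_isolation = simulate(operands, selects, isolate=False)
with_isolation = simulate(operands, selects, isolate=True)
print(without_isolation, with_isolation)
```

In this toy model the isolated circuit sees far fewer input transitions because three of the four units hold their inputs in every cycle. Real savings depend on the switching activity inside each unit and on the overhead of the isolation circuits themselves, which the sketch ignores.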
The problem with circuit 10 is that the signals t1-t4, which control the operand isolation circuits 40a-40d, originate from existing circuitry of the underlying circuit. In most cases, designers cannot rely on isolation signals originating from existing circuitry of the underlying circuit. For instance, these signals t1-t4 exist whether or not operand isolation is applied to the functional units 12, 14, 16, 18. In many cases, there may not be a suitable signal (to use for operand isolation) existing within the underlying circuit, or, the signals existing within the underlying circuit may not give the isolation coverage desired by an IC designer. In effect, the signals available to control isolation circuits may isolate the operands of a functional unit circuit only during a small subset of the instances where the function's output is ignored. In this case, only a fraction of the total possible power savings is achieved.
Another prior art method of operand isolation is described in a paper entitled, "Guarded Evaluation: Pushing Power Management to Logical Synthesis/Design," published in 1995 by the ISLPD at Dana Point, Calif. by V. Tiwari. Tiwari describes a circuit having transparent latches that make up guard logic to perform operand isolation. The latches control the passage of input operand signals to arithmetic functional units (e.g., shifters, adders, etc.). In a pass mode, the latch allows the operand signals to pass through, and in a non-pass mode the latch holds its previous value to prevent new operand signals from reaching the arithmetic functional unit. The guard logic is controlled by a signal, s, which is based on the observability of the output of the arithmetic functional unit. Like Correale, Tiwari uses an existing signal from the underlying circuit to obtain the signal s. Specifically, Tiwari uses ATPG (Automatic Test Pattern Generation) tools to find the existing signal to couple as signal s.
Because Tiwari controls the guard logic with an underlying signal that already exists within the netlist, Tiwari is limited in two ways. First, the duty cycle or duration of operand isolation coverage available for each node is limited and, second, Tiwari is limited in the number of nodes to which his operand isolation can be applied at all. For instance, FIG. 1B illustrates a set 64 of all conditions under which an arithmetic functional unit generates an output that is not needed (e.g., an observability don't care condition). By using only a signal that exists within the netlist to generate signal s, Tiwari is limited to only a subset 62 of set 64, where subset 62 represents power savings achieved and set 64 represents total possible power savings. In this manner, the operand isolation coverage, represented by subset 62, is limited. Further, using the concepts of ATPG and observability may not even result in an existing signal that can be used for isolation coverage with respect to particular nodes. In this case, under Tiwari, operand isolation would not even be applied to these particular nodes because there exists no signal to control the isolation logic. Therefore, no power savings are achieved for these nodes.
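The relationship between subset 62 and set 64 can be made concrete with a small sketch. It assumes each condition can be modeled as a clock cycle index and that coverage is measured as the fraction of unneeded-output cycles in which the operands are actually isolated. The function name and the cycle numbers are hypothetical and do not come from FIG. 1B.

```python
def isolation_coverage(unobserved_cycles, isolated_cycles):
    """Fraction of unneeded-output cycles in which operands are isolated.

    unobserved_cycles: cycles where the unit's output is not needed (set 64)
    isolated_cycles:   cycles where the control signal isolates (subset 62)
    """
    if not isolated_cycles <= unobserved_cycles:
        raise ValueError("isolating while the output is needed is unsafe")
    if not unobserved_cycles:
        return 0.0
    return len(isolated_cycles) / len(unobserved_cycles)


# Hypothetical example data.
set_64 = {1, 2, 4, 5, 7, 8}  # all cycles where the output is ignored
subset_62 = {2, 5}           # cycles covered by an existing netlist signal

print(isolation_coverage(set_64, subset_62))
```

A node for which no usable existing signal can be found corresponds to an empty subset 62, giving a coverage of zero in this model.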
As described above, power optimizations previously presented in the literature have mostly targeted smaller parts of a design, such as localized combinational logic or a set of sequential elements. Very few transformations have been applied to entire design entities such as finite state machines (FSMs) or pipelined data paths as a whole, contrary to the generally accepted belief that optimizations at higher levels of abstraction will yield the highest power savings.
In particular, pipelined designs have been considered unattractive candidates for clock gating and operand isolation techniques for power savings because the registers between pipeline stages are enabled in each clock cycle and therefore do not present clock gating/isolation opportunities. In the past, clock gating for power savings has been applied to pipelined designs only by enabling or disabling the entire pipelined design. For instance, in a transmitter/receiver circuit device, when the device is receiving, all of its transmitting data path circuits can be clock gated for power savings. Likewise, when the device is transmitting, all of its receiving data path circuits can be clock gated. This form of pipelined design clock gating totally shuts down the pipeline circuit in order to save power therein. Heretofore, clock gating has not been applied to an operating pipelined design. What is needed is a better power savings approach that is applicable to operating pipelined designs.
Accordingly, what is needed is a mechanism and method for applying a power savings technique to a pipelined design that does not require the pipelined design to be totally shut down during the power savings mode. In effect, what is needed is a power savings technique that can be applied to the stages of a pipelined design while simultaneously allowing the pipelined design to operate and process data path information. The present invention provides these advantageous functionalities. These and other advantages of the present invention not specifically mentioned above will become clear within discussions of the present invention presented herein.