1. Introduction
While microprocessor speeds have historically doubled with every new processor generation, the power consumption of circuit blocks in the microprocessors has gone up by a factor of six with each new generation. Even with the reductions in operating voltage and capacitance that come from new manufacturing processes that shrink transistor sizes, chip power consumption is still growing by a factor of three per processor generation. This growth is largely due to the increased use of on-chip hardware to improve parallelism and microprocessor performance. In addition, to gain extra performance on certain critical timing paths, device sizes are being increased to shorten delays at the circuit level. However, optimizing the size of every transistor in a given design is very time consuming, and upsizing transistors for a performance boost often comes at the expense of a much larger increase in circuit power consumption.
To achieve further performance increases in very critical arithmetic and control circuitry, designers are converting a growing share of the chip's lower-power static logic to more power-hungry dynamic blocks (a category that includes domino logic) in order to meet the very aggressive delay specifications dictated by the chip architecture. As a result, dynamic logic is becoming more prevalent and makes up an increasing part of microprocessor circuit designs. It has been demonstrated that dynamic or domino logic consumes three times more power than static complementary metal-oxide-semiconductor ("CMOS") designs. However, over some delay ranges, certain domino designs can be implemented in static logic at the same performance point, which makes power optimization possible in those cases.
Register transfer language ("RTL") to schematic partitioning has also made the power-delay optimization problem more difficult for designers. Without proper knowledge of the power-delay tradeoff points at the microarchitecture level, circuit designers are forced to upsize entire blocks to meet circuit performance targets. For some designs, however, timing can be reallocated between adjacent blocks: one block is downsized while its neighbor is upsized, yielding a lower power design at the same original delay specification. Unfortunately, while some aspects of recalculating power and delay after such a reallocation have been automated, existing systems still require designers to perform the reallocation manually, using alternate implementations of the blocks within the design. As the number of blocks and the number of possible implementations for each block both increase, so does the difficulty of manually redesigning the blocks and reallocating power and delay among them. For example, even a small circuit with only five blocks and three possible implementations for each block has 3^5 = 243 possible configurations. This is too many combinations for a designer to create manually and then evaluate efficiently and effectively.
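The size of this search space can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the block names, the delay and power figures, the simplification that the five blocks sit in series on one timing path (so that delays and powers simply add), and the exhaustive search itself, which is not presented as the method of any existing tool.

```python
from itertools import product

# Hypothetical data: five blocks, each with three candidate implementations
# given as (delay_ps, power_mw). All numbers are illustrative only.
BLOCKS = {
    "A": [(100, 9.0), (120, 6.0), (150, 4.0)],
    "B": [(80, 7.0), (95, 5.0), (110, 3.5)],
    "C": [(200, 12.0), (230, 9.0), (260, 7.0)],
    "D": [(60, 5.0), (70, 4.0), (90, 2.5)],
    "E": [(140, 8.0), (160, 6.5), (190, 5.0)],
}


def count_configurations(blocks):
    """Number of distinct circuits: the product of the per-block choices."""
    total = 1
    for implementations in blocks.values():
        total *= len(implementations)
    return total


def best_configuration(blocks, delay_spec):
    """Exhaustively find the lowest-power configuration meeting delay_spec.

    Assumes the blocks are in series, so total delay and total power are
    plain sums. Returns (choice, power, delay), or None if no configuration
    meets the specification.
    """
    names = list(blocks)
    best = None
    for choice in product(*(range(len(blocks[n])) for n in names)):
        delay = sum(blocks[n][i][0] for n, i in zip(names, choice))
        power = sum(blocks[n][i][1] for n, i in zip(names, choice))
        if delay <= delay_spec and (best is None or power < best[1]):
            best = (dict(zip(names, choice)), power, delay)
    return best


if __name__ == "__main__":
    print(count_configurations(BLOCKS))   # 3**5 = 243
    print(best_configuration(BLOCKS, 650))
```

Even in this toy model every one of the 243 combinations has to be evaluated against the delay specification, and the count grows exponentially with the number of blocks, which is why manual exploration does not scale.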
High chip power consumption continues to be a major limiting factor in bringing new microprocessor designs to market, and as the demand for faster processor operating frequencies continues to increase, chip power consumption problems have only become worse. As a result, currently used power saving techniques are being nullified by the overwhelming trend toward higher power.
Therefore, new Computer-Aided Design ("CAD") tools and methodologies are needed for the next generations of microprocessor designs to optimize for power-delay, area-delay, or both, and to enable higher designer productivity during the design cycle.