The present invention pertains to the field of computer systems. More particularly, this invention pertains to the field of improving performance and saving power in a collapsible pipeline using gated clocks.
The realities of achieving more and more performance from today's complex graphics devices include the high cost of dealing with thermal issues. Higher performance is generally obtained through more transistors in more complex devices as well as greater clock speeds. The greater clock speeds and the larger circuitry serve to increase performance but also contribute to greater power consumption and therefore greater thermal issues. The high cost of dealing with thermal issues is generally incurred through the use of thermally enhanced device packaging, heat sinks, and extra air flow via fans. This increased cost is of course undesirable. Graphics device manufacturers seek techniques to reduce power consumption while increasing performance in graphics devices. If a graphics device consumes less power, less cost will be incurred in dealing with thermal issues.
In a typical graphics device, perhaps about 20% of the device is involved with display refresh. Further, about 20% of the graphics memory bandwidth is involved with display refresh. This means that about 20% of the time, about 80% of the graphics device will be forced into an idle state. Graphics devices typically include a number of pipelines, such as 2D and 3D pipelines. During a display refresh period, these pipelines are stalled because the outputs of the pipelines are unable to access the graphics memory.
Prior graphics devices have sought to reduce power consumption by gating the clock signals to the various pipeline stages that make up the pipelines during display refresh periods. If a pipeline stage does not receive a clock, it cannot "clock in" or receive new data. Because data paths between pipeline stages may typically be 128 bits wide or more, a significant power reduction can be obtained by gating the clocks to the pipeline stages during display refresh periods.
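The effect of gating a stage clock during display refresh can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `GatedStage`, the function `run`, and the cycle counts are hypothetical, and the count of register bits that receive a clock edge is used as a rough proxy for clocking activity, not as a power measurement.

```python
# Illustrative model only: all names and numbers here are hypothetical.
class GatedStage:
    """A pipeline stage register whose clock can be gated off."""

    def __init__(self, width_bits=128):
        self.width_bits = width_bits
        self.data = None
        self.clocked_bits = 0  # register bits that received a clock edge

    def tick(self, incoming, clock_enabled):
        if not clock_enabled:
            return  # gated: no clock edge, so the stage holds its data
        self.clocked_bits += self.width_bits
        self.data = incoming


def run(cycles, refresh_cycles, gate_during_refresh):
    """Clock one stage for `cycles` cycles; the first `refresh_cycles`
    cycles are treated as a display refresh period."""
    stage = GatedStage()
    for cycle in range(cycles):
        in_refresh = cycle < refresh_cycles
        enabled = not (gate_during_refresh and in_refresh)
        stage.tick(incoming=cycle, clock_enabled=enabled)
    return stage.clocked_bits


ungated = run(cycles=100, refresh_cycles=20, gate_during_refresh=False)
gated = run(cycles=100, refresh_cycles=20, gate_during_refresh=True)
print(ungated, gated)  # 12800 10240
```

With the assumed 20% refresh share and a 128-bit stage, the gated stage receives 20% fewer clocked register bits in this model.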
When a typical graphics device pipeline is stalled due to display refresh, one or more pipeline stages may contain invalid data, thereby creating "bubbles" in the pipeline. Prior graphics devices have sought to increase performance by collapsing or eliminating these bubbles during display refresh periods. The technique used involves circuitry that allows each pipeline stage to know whether any of the pipeline stages further downstream contains invalid data. If any downstream stage contains invalid data, the current pipeline stage can know to clock in data from its upstream neighbor because its downstream neighbor is sure to clock in the data presently stored in the current pipeline stage. This technique can therefore collapse a single pipeline bubble in a single clock period.
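The prior collapsing technique described above can be sketched in software as follows. This is a minimal model, not the circuitry itself: the function name `collapse_step` is hypothetical, a bubble is represented by `None`, index 0 is taken as the most upstream stage, and the pipeline output is assumed to be stalled so the last stage cannot drain. The sketch also assumes that a stage which itself holds a bubble may clock in new data.

```python
# Illustrative model only: names and data representation are hypothetical.
def collapse_step(stages, upstream_input=None):
    """Advance a stalled pipeline by one clock period.

    stages[0] is the most upstream stage, stages[-1] the most downstream.
    None marks a stage holding invalid data (a bubble).
    """
    n = len(stages)
    # A stage clocks in new data when it, or any stage downstream of it,
    # holds a bubble. Every stage needs the status of all downstream stages.
    load = [any(s is None for s in stages[i:]) for i in range(n)]
    nxt = list(stages)
    for i in range(n):
        if load[i]:
            nxt[i] = stages[i - 1] if i > 0 else upstream_input
    return nxt


# One bubble between valid data "C" and "A" is removed in a single clock.
print(collapse_step(["D", "C", None, "A"], upstream_input="E"))
# ['E', 'D', 'C', 'A']
```

In this model a pipeline with no bubbles does not move while stalled, and a pipeline with several bubbles loses one bubble per clock period.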
However, the prior technique for collapsing bubbles in a pipeline becomes impracticable or impossible as pipeline depths increase. Because the circuitry involved must allow each pipeline stage to know the data valid status of each downstream stage, the amount of circuitry involved increases geometrically as pipeline depths increase. Further, the prior technique fails to make efficient use of the display refresh periods because it attempts to collapse the pipeline bubbles in a single clock period.
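One rough way to see the growth in circuitry is to count the stage-to-stage status connections, assuming one connection from each stage to every stage downstream of it. This counting rule and the function name `status_links` are assumptions made for illustration. The count does not account for the logic needed to combine the signals.

```python
# Illustrative count only: the one-link-per-downstream-stage rule is an assumption.
def status_links(depth):
    """Number of valid-status connections when each stage observes
    every stage downstream of it."""
    # Stage i (0-based) has depth - 1 - i downstream stages.
    return sum(depth - 1 - i for i in range(depth))


for depth in (4, 8, 16, 32):
    print(depth, status_links(depth))
# 4 6
# 8 28
# 16 120
# 32 496
```

Under this assumption, doubling the pipeline depth roughly quadruples the number of status connections.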