1. Field of the Invention
The present invention generally relates to VLSI microprocessors, and more particularly to timing logic within a microprocessor. Still more particularly, the present invention relates to clock signal distribution within a microprocessor.
2. Description of the Prior Art
In current high-speed microprocessor systems, the on-chip clock design has become a far more significant design problem than in any past VLSI microprocessor chip design, because the clock frequency target is 1.1 GHz, which means less than 1.0 nanosecond total chip cycle time. Uncertainties or delay skew in the clock signals arriving at the latches and registers detract directly from the cycle time.
The target for clock skew is to be less than 125 picoseconds locally and less than 250 picoseconds globally. 250 picoseconds is 0.25 nanoseconds, or 25% of the total cycle. This then deserves a major design effort to reduce the skew to allow more time in the cycle for the necessary logical operations.
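By way of illustration only, the skew budget described above can be checked with a short calculation. The following sketch uses the frequency and skew targets quoted in the text; the function and constant names are hypothetical and chosen solely for this example.

```python
# Skew budget as a fraction of the clock cycle, using figures quoted in the text.
CLOCK_HZ = 1.1e9        # target clock frequency
LOCAL_SKEW_PS = 125.0   # local skew target, in picoseconds
GLOBAL_SKEW_PS = 250.0  # global skew target, in picoseconds


def cycle_time_ps(freq_hz: float) -> float:
    """Return the clock period in picoseconds."""
    return 1e12 / freq_hz


def skew_fraction(skew_ps: float, freq_hz: float) -> float:
    """Return the skew as a fraction of one clock period."""
    return skew_ps / cycle_time_ps(freq_hz)


if __name__ == "__main__":
    print(f"cycle time at 1.1 GHz: {cycle_time_ps(CLOCK_HZ):.0f} ps")
    for name, skew in (("local", LOCAL_SKEW_PS), ("global", GLOBAL_SKEW_PS)):
        share = skew_fraction(skew, CLOCK_HZ)
        print(f"{name} skew: {skew:.0f} ps = {share:.1%} of the cycle")
```

Against the nominal 1.0 nanosecond cycle the global target is 25% of the cycle; against the exact 1.1 GHz period of about 909 picoseconds it is slightly larger, about 27.5%.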
Making the task more difficult is the larger chip size required to hold all of the processor circuitry and arrays. The clock must be distributed to every latch on the chip.
Typical clock distribution issues relate to problems in systems with a central clock buffer and systems with repowered clock distribution.
Central Clock Buffers
There are many problems related to a central clock buffer such as that found in the DEC ALPHA chip. Among these problems is increased skew or delay uncertainty in the distribution from the central buffer to the load point as the chip size is increased. The total load capacitance has increased dramatically as chip size and the number of gates have increased.
In these conventional systems, the chip clock utilizes a centrally located buffer for the clock signal. While this has design advantages, the increased power necessary to drive the clock over the entire chip, as chip size has increased, has become problematic. The centrally located buffer switches all the loads from a central physical location on the chip, thus causing a very large concentrated “delta-I” problem in that area that can collapse the power source at that point, causing nearby circuits to fail. It is also difficult to construct clock routes away from the buffer with an adequately low impedance.
Central clock distributions feature a single, relatively large, central clock buffer driving a low-resistance clock grid. The grid, in turn, drives the end-user circuits directly. Central distributions are attractive for several reasons, including the fact that all of the circuit-delay-based variation is contained in the central buffer and is mostly common to all chip circuits which receive the clock signal, and that large numbers of parallel paths in the clock grid tend to average out wire-delay-based variation to individual circuits.
Because of the inherent delay in the transmission of the clock signal, clock time is ‘curved’ across the chip, i.e., circuits get the clock later as the distance from the central buffer increases. This aspect of signal distribution is carefully evaluated and exploited in the machine design.
There are, however, significant drawbacks to the use of a central clock buffer. These include, for example, the fact that the large, centralized buffer stresses the central power distribution and requires a considerable area of decoupling capacitance nearby. Further, transmission line resistance and inductance limit how effectively a load can be driven by a remote buffer, no matter how large the central buffer is. A large grid cross-section is required to keep grid sheet resistance well below driver resistance. The return path must be just as substantial from both a resistance and inductance point of view.
Repowered Clock Distribution
Repowered clock distributions use a much smaller central buffer, local clock regenerators, and possibly sector buffers in between. Repowered distributions are found in many processors manufactured by, for example, the IBM Corporation. Repowered distributions have several advantages over a central clock buffer in that the central buffer is modest and presents no hazard to the local power distribution. Moreover, less wiring cross section is required since most loads are a shorter distance from their drivers. Further, unlike the clock-time curve found in central-buffer distribution systems, circuits all over the chip get the clock at nearly the same time.
Repowered distribution also has some significant problems, however, including the circuit-delay-based variation necessarily found in the distributed drivers, which is not an issue in central-buffer distribution systems, since the circuit delay in a central buffer is common to the entire circuit. Further, since several stages of repowering or redrive are used, some very long wires are required, with a significant wire delay caused by the RC time constant of the wires themselves.
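As a rough illustration of why long wires carry a significant RC delay, the following sketch applies the standard first-order (Elmore) estimate for a distributed RC line, in which delay grows with the square of wire length. The per-millimeter resistance and capacitance values and all names are hypothetical placeholders, not figures taken from this description.

```python
# Illustrative only: first-order delay of a long, unbuffered on-chip wire.
# The per-millimeter values below are hypothetical placeholders.
R_OHM_PER_MM = 50.0        # assumed wire resistance per millimeter
C_FARAD_PER_MM = 0.2e-12   # assumed wire capacitance per millimeter


def wire_delay_s(length_mm: float,
                 r_per_mm: float = R_OHM_PER_MM,
                 c_per_mm: float = C_FARAD_PER_MM) -> float:
    """Elmore delay of a distributed RC line: 0.5 * R_total * C_total."""
    r_total = r_per_mm * length_mm   # total resistance scales with length
    c_total = c_per_mm * length_mm   # total capacitance scales with length
    return 0.5 * r_total * c_total


if __name__ == "__main__":
    for length in (2.5, 5.0, 10.0):
        print(f"{length:5.1f} mm -> {wire_delay_s(length) * 1e12:7.1f} ps")
```

Because both total resistance and total capacitance scale with length, doubling the wire length quadruples the estimated delay under this model.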
Other considerations of both approaches include delta-I problems and thermal problems. As the clock signal is distributed, a self-induced switching noise is caused by inherent transmission-line inductances. This noise is generally known as Delta-I noise. Delta-I noise is a function of conductor length, conductor spacing, conductor diameter and assigned voltage levels. See, for example, “Delta-I Noise Specification for a High-Performance Computing Machine” by George A. Katopis, in Proceedings of the IEEE, Vol. 73, No. 9, Sep. 1985, which is hereby incorporated by reference.
Heat, of course, is a consideration in any processor, and both central-buffer systems and repowered distribution systems must accommodate any thermal generation.
Clock distributions have several physical properties that must also be considered. For example, clock distributions are comprised of amplifier chains and distribution wiring. From a PLL-like clock source to final circuit clock load they must develop a gain of approximately 100,000. For minimum clock latency (delay through the clock distribution) optimal gain per simple inverter stage is approximately 3, so approximately 10 inverters are required to develop this gain.
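The stage count stated above follows from a logarithm. The sketch below, offered only as an illustration, computes the number of inverter stages needed to develop a given total gain at a fixed gain per stage; the function name and defaults are assumptions made for this example.

```python
import math


def stages_for_gain(total_gain: float, gain_per_stage: float = 3.0) -> float:
    """Return the (fractional) number of stages needed to reach total_gain."""
    return math.log(total_gain) / math.log(gain_per_stage)


if __name__ == "__main__":
    exact = stages_for_gain(100_000)
    print(f"stages for a gain of 100,000 at 3 per stage: {exact:.2f}")
    print(f"gain developed by 10 stages: {3 ** 10}")
    # Tripling the required gain adds exactly one stage at a gain of 3 per stage.
    extra = stages_for_gain(300_000) - exact
    print(f"extra stages for 3x more gain: {extra:.2f}")
```

The result is between 10 and 11 stages, consistent with the approximate figure of 10 inverters. The same calculation shows that tripling the required gain adds one stage to the chain.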
Consequently, jitter is insensitive to clock load. Fairly large changes in clock load (3×) change the latency of the amplifier chain by only one stage, or ~10%. This means clock latency and its associated jitter change by at most 10% for a 3× change in clock load. Further, most clock power is in the last stage. Gain chains have a strongly skewed distribution of power consumption, starting with a whisper and ending in a bang. Each successive stage dissipates 3× the power of the previous stage. In a 10-stage chain, 66% of the power is dissipated switching the final load, 22% is dissipated switching the input capacitance of the 10th inverter, and 7% is used by the 9th inverter. Only 1% is used by the 7th inverter, and earlier stages are negligible.
Because clock power in early stages is negligible, considerable capacitance and power inefficiency can be tolerated in the 1st through 7th stages with little increase in clock power. Design of these early stages can use extra capacitance and power to decrease sensitivity to design and process.
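The power percentages given above for a 10-stage chain can be reproduced with a simple geometric model, sketched below for illustration only. It assumes that the power switched at each node grows by exactly 3× per stage; the function name and indexing convention are hypothetical.

```python
from typing import List


def power_fractions(num_stages: int = 10, ratio: float = 3.0) -> List[float]:
    """Fraction of total switching power at each node of a gain chain.

    Index k (0-based) is the input capacitance of inverter k + 1;
    index num_stages is the final clock load.
    """
    weights = [ratio ** k for k in range(num_stages + 1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    fractions = power_fractions()
    print(f"final load:        {fractions[10]:.1%}")
    print(f"10th inverter:     {fractions[9]:.1%}")
    print(f"9th inverter:      {fractions[8]:.1%}")
    print(f"7th inverter:      {fractions[6]:.1%}")
    print(f"stages 1 to 7 sum: {sum(fractions[:7]):.1%}")
```

Under this model the 1st through 7th stages together account for under 2% of the total, which is why extra capacitance in those stages costs little clock power.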
It is therefore one object of the present invention to provide an improved VLSI microprocessor.
It is another object of the present invention to provide improved timing logic within a microprocessor.
It is yet another object of the present invention to provide improved clock signal distribution within a microprocessor.
The application presents an on-chip clock distribution system which utilizes local clock buffers to provide an improved clock signal distribution while avoiding the disadvantages of conventional central buffer and repowered distribution systems. The disclosed system also reduces distribution routing problems, distributes the “delta-I” and thermal problems across the chip, and supports better local delay tuning.
The above as well as additional objectives, features, and advantages of the present invention will become apparent in the following detailed written description.