In the design of very high performance integrated circuits, designers must distribute clock signals at multi-GHz frequencies over large chip areas while keeping clock skew as low as possible. A chip area is considered large if, for a given technology, a signal takes several clock cycles to propagate from the center of the chip to the farthest edge, assuming optimal buffering and the use of good wire resources. Today's large chips are on the order of tens of millimeters in width and/or height. Most design approaches follow a multi-stage style of clock distribution. The first step is to divide the chip into smaller areas. The second step is to distribute the GHz clock signal from the PLL (phase locked loop) to these smaller areas. One design approach is to use an H-tree to distribute the clock signal. However, this approach creates too much uncommon logic between clock paths, that is, buffers and wires that the paths to two different destinations do not share. To minimize that, the outputs of buffers at different stages of the tree are shorted together. However, due to the nature of the H-tree, shorting is only fully effective at the beginning of the tree. At later stages, because the buffers are physically far apart, shorting is only partial, within a branch of the H-tree. In another design approach, the size of the H-tree is reduced by enlarging the areas into which the chip is divided. This approach requires another stage of global distribution, for example a clock grid. It reduces the amount of uncommon clock logic between clock paths within the H-tree, at the expense of uncommon logic between the areas driven by the second stage of the tree. It is still possible to reduce uncommon logic by connecting the clock distribution networks (CDNs) at the boundaries of the divided areas. That reduces skew at the boundaries but does not affect the skew inside each area.
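The "large area" criterion above can be illustrated with a short calculation. The following is a minimal sketch, not part of the original disclosure: the function names, the per-millimeter buffered wire delay, the 4 GHz clock and the two-cycle threshold used to interpret "several clock cycles" are all assumptions chosen for illustration.

```python
def cycles_to_cross(distance_mm, delay_ps_per_mm, clock_ghz):
    """Clock cycles needed for a buffered signal to travel distance_mm."""
    period_ps = 1000.0 / clock_ghz              # one clock period in picoseconds
    flight_ps = distance_mm * delay_ps_per_mm   # center-to-edge propagation time
    return flight_ps / period_ps


def is_large_area(distance_mm, delay_ps_per_mm, clock_ghz, min_cycles=2.0):
    """True if center-to-edge propagation takes at least min_cycles.

    min_cycles=2.0 is an assumed reading of "several clock cycles".
    """
    return cycles_to_cross(distance_mm, delay_ps_per_mm, clock_ghz) >= min_cycles


# Hypothetical numbers: 10 mm from center to edge,
# 100 ps/mm of buffered wire delay, 4 GHz clock.
cycles = cycles_to_cross(10.0, 100.0, 4.0)
print(f"{cycles:.1f} cycles, large area: {is_large_area(10.0, 100.0, 4.0)}")
```

With these assumed values the signal needs four 250 ps cycles to reach the edge, so the area would count as large. The same wire delay over a 1 mm distance takes less than one cycle and would not.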
Clock skew is defined as the difference between two delay values measured at well-defined locations in the clock distribution network, usually the inputs of gates at the same level of a distribution tree. The skew is relevant only if the locations are driven by a common source, because it measures the difference in the time the clock takes to reach each location from the common launch point. Clock skew is an important design parameter because, if not properly managed, it can force a lower clock frequency or cause circuit malfunctions. As such, clock skew must be controlled to avoid adverse effects. One very common design technique for controlling these adverse effects is to design clock distribution networks that target very low clock skew (in the single-digit picosecond range).
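The definition above can be expressed in a few lines. This is a minimal sketch, assuming that the clock arrival time at each location (measured from the common launch point) is already known from simulation. The sink names and arrival values are hypothetical.

```python
from itertools import combinations


def skew_ps(arrival_a_ps, arrival_b_ps):
    """Skew between two locations driven by the same source, in picoseconds."""
    return abs(arrival_a_ps - arrival_b_ps)


def worst_skew(arrivals_ps):
    """Return (skew, sink_a, sink_b) for the pair of sinks with the largest skew.

    arrivals_ps maps a sink name to its clock arrival time from the common
    launch point. At least two sinks are required.
    """
    return max(
        (skew_ps(arrivals_ps[a], arrivals_ps[b]), a, b)
        for a, b in combinations(sorted(arrivals_ps), 2)
    )


# Hypothetical arrival times at three gate inputs on the same tree level.
arrivals = {"sink_nw": 412.0, "sink_ne": 413.5, "sink_sw": 412.5}
print(worst_skew(arrivals))  # (1.5, 'sink_ne', 'sink_nw')
```

Only the differences between arrival times matter. A common offset added to every sink, such as extra delay in the shared part of the path, leaves the skew unchanged.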
The total clock skew between any two locations driven by the same source can be classified into two types. The first is static clock skew, the amount of skew obtained from simulations of the designed clock distribution network. Parameters affecting this value include the types of buffers used in the clock distribution network, the style (or styles) of clock distribution, the types of wires used to distribute the signal and how they are laid out in the design, the accuracy of the simulation models for devices and wires, and the accuracy of the parasitic extraction of resistance, capacitance and inductance. Static clock skew is one of the design parameters a designer uses to guide the implementation of large area clock distribution networks. For example, one clock distribution network designed to operate at 1.5 GHz over an area of 21×21 mm was simulated under the above conditions and gave a design skew no higher than 1.1 ps of late mode clock skew (and 1.9 ps of early mode clock skew) between any two of 240 pre-defined grid locations across the chip area.
The second type is dynamic clock skew, so called because its value varies with the operating conditions of the chip as well as with fabrication uncertainties. In deep sub-micron technologies, geometric dimensions are not absolute values but are defined by a nominal value plus or minus a variation. Within a batch of wafers fabricated at the same time, the wire implementing the same net can have different dimensions (such as width, cross-section and length) from chip to chip, whether the chips come from the same wafer or from different wafers. Likewise, wires designed with the same dimensions at different locations within a chip may differ in their dimensions after fabrication. These geometric variations also affect devices, and because clock buffers are usually much bigger than other devices they are particularly susceptible to these variations. Since large area clock distribution networks may contain hundreds to thousands of buffers and tens of thousands of wires, it is not possible to create simulation scenarios that cover all the geometric variations that may occur during fabrication. One way to account for these geometric variations is to create a cross-section simulation model, run worst case simulation scenarios, and treat the results as additional clock skew that the circuits driven by the global CDN must account for. This value becomes a budget used to set the boundary conditions for timing analysis.
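The worst case approach described above can be sketched for a single wire. This is an illustrative simplification, not the simulation model of the disclosure. It assumes a lumped resistance and capacitance computed from the wire cross-section, the Elmore delay of a distributed RC line (0.5·R·C), a ±10% tolerance on width and thickness, and hypothetical material constants. The spread between the slowest and fastest corner is taken here as the additional skew to budget for.

```python
from itertools import product


def wire_delay_ps(length_um, width_um, thickness_um,
                  rho_ohm_um=0.02,            # assumed resistivity, ohm*um
                  c_area_ff_per_um2=0.05,     # assumed area capacitance
                  c_fringe_ff_per_um=0.1):    # assumed fringe capacitance
    """Elmore delay (0.5*R*C) of a distributed RC wire, in picoseconds."""
    r_ohm = rho_ohm_um * length_um / (width_um * thickness_um)
    c_ff = (c_area_ff_per_um2 * width_um + c_fringe_ff_per_um) * length_um
    return 0.5 * r_ohm * c_ff / 1000.0        # ohm*fF = fs, so divide by 1000


def corner_delays(length_um, width_um, thickness_um, tolerance=0.10):
    """Delay at every combination of -tol / nominal / +tol width and thickness.

    Keys are (width_step, thickness_step) with steps in {-1, 0, +1}.
    """
    delays = {}
    for dw, dt in product((-1, 0, 1), repeat=2):
        w = width_um * (1 + dw * tolerance)
        t = thickness_um * (1 + dt * tolerance)
        delays[(dw, dt)] = wire_delay_ps(length_um, w, t)
    return delays


def variation_budget_ps(delays):
    """Spread between the slowest and fastest corner."""
    return max(delays.values()) - min(delays.values())


# Hypothetical wire: 1000 um long, 1.0 um wide, 0.5 um thick.
corners = corner_delays(1000.0, 1.0, 0.5)
print("slowest corner:", max(corners, key=corners.get))
print("fastest corner:", min(corners, key=corners.get))
print(f"budget from variation: {variation_budget_ps(corners):.2f} ps")
```

In this model the narrowest, thinnest wire is the slowest corner and the widest, thickest wire is the fastest. A real CDN would require the same kind of sweep over buffers as well as wires, which is why the text falls back on a budget rather than exhaustive simulation.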
Dynamic clock skew may also be due to chip operating conditions. Device operation is sensitive to temperature variation. Operating temperature is a function of the environment as well as of the types of operations performed within the chip. A large chip contains in excess of 1 billion transistors. Whenever any percentage of these transistors switch at the same time, they dissipate power, which changes the temperature and therefore the operating conditions of the devices within the area of switching activity. Global CDNs, because they cover the entire chip area, are susceptible to temperature variations caused by almost any switching activity in the design. Furthermore, the global CDNs themselves cause temperature variations, because they contain thousands of large buffers constantly switching at GHz frequencies. Again, the range of such conditions is too large to be covered by any realistic set of simulation scenarios. As with the dynamic skew due to geometric variations, the uncertainty due to temperature changes is factored into the budget mentioned above. For the design example running at 1.5 GHz, two clock skew budgets were set: one for early mode and one for late mode timing analysis. Each budget accounts for both the static and all forms of dynamic clock skew. To leave a reasonable share of the total budget for dynamic skew, it was defined early in the project that the skew of the CDN in simulations should not exceed half of the total budget (for both the early and late mode budgets). This requirement is the main driver for the new design techniques presented in the present invention.
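The budgeting rule above amounts to a simple check per timing mode. The sketch below is illustrative only. The class name is hypothetical, and the total budget values are placeholders because the text does not state them. Only the one-half limit and the 1.1 ps and 1.9 ps simulated skews come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkewBudget:
    """Total clock skew budget for one timing mode (early or late)."""
    mode: str
    total_ps: float
    static_fraction: float = 0.5   # simulated (static) skew may use at most half

    @property
    def static_limit_ps(self):
        """Largest simulated skew the CDN design may show."""
        return self.total_ps * self.static_fraction

    @property
    def dynamic_allowance_ps(self):
        """Part of the budget left for fabrication and temperature effects."""
        return self.total_ps - self.static_limit_ps

    def static_ok(self, simulated_skew_ps):
        """True if the simulated skew respects the static limit."""
        return simulated_skew_ps <= self.static_limit_ps


# Total budgets are hypothetical placeholders, not values from the text.
late = SkewBudget("late", total_ps=4.0)
early = SkewBudget("early", total_ps=4.0)
print(late.static_ok(1.1), early.static_ok(1.9))  # True True
```

Keeping the early and late mode budgets as separate objects mirrors the text, where each mode has its own budget and its own simulated skew to check.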
Other factors taken into consideration during the design of large area CDNs to reduce the effects of static and dynamic skew are the total chip area over which multi-GHz clock signals must be distributed, the style of distribution of the signal, and the number of high frequency clock signals. The present invention addresses the cases where multi-GHz clock signals are widely used throughout the chip. If the design uses other frequencies, a common requirement in high performance microprocessors, these frequencies can be derived from the main frequency and are usually used in small, targeted areas of the chip.
The present invention takes into account Regular CDNs, where the clock signal can be freely distributed to cover the whole chip area. However, the design technique disclosed herein can also be applied to Irregular CDNs, where the distribution of the clock signal is constrained to certain areas of the chip before it reaches its final location. This scenario is characteristic of chips using multiple clock frequencies or chips where the clock signals cannot cross over large areas.