Large, high-performance very large scale integration (VLSI) chips have an internal clock signal that is a function of an external clock signal. This internal clock signal must be distributed to a large number of clock pins, which are specific locations or metal shapes on the chip, each of which has a known or estimated effective pin capacitance. The frequency of the clock signal determines the frequency and cycle time of the chip. Shorter cycle times result in higher chip frequency and improved chip performance. Clock skew can limit the achievable cycle time, reducing chip performance. Clock skew within a chip is the difference in the times at which the internal clock signal reaches various parts of the chip. Specifically, the phrase clock skew, as used herein, is the maximum difference in clock arrival times between any pair of the clock pins. Clock skew can also be defined for a subset of clock pins, in which case it is the maximum difference in arrival times between pins in that subset. Clock skew can further be separated into two components: 1) nominal clock skew is the expected difference in clock signal arrival times obtained from modeling and simulation; and 2) clock uncertainty refers to the unknown and random differences in clock signal arrival times. Since clock uncertainty is random, statistical methods are used to predict total clock skew from the nominal clock skew and the clock uncertainty. The phrase local clock skew refers to the clock skew between any small subset of nearby clock pins within a small area, where the area is a small fraction of the total chip size.
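By way of illustration only, the definitions above can be sketched in Python. The pin names, arrival times, and coordinates below are hypothetical, and approximating "nearby clock pins within a small area" by a pairwise distance threshold (`radius`) is an assumption made for this sketch rather than a definition taken from the text.

```python
from itertools import combinations
from math import hypot


def clock_skew(arrivals):
    """Maximum difference in arrival time between any pair of clock pins."""
    times = list(arrivals.values())
    return max(times) - min(times)


def local_clock_skew(arrivals, positions, radius):
    """Largest arrival-time difference between pins at most `radius` apart.

    Assumption: "nearby" is modeled as a simple Euclidean distance threshold.
    """
    worst = 0.0
    for a, b in combinations(arrivals, 2):
        (xa, ya), (xb, yb) = positions[a], positions[b]
        if hypot(xa - xb, ya - yb) <= radius:
            worst = max(worst, abs(arrivals[a] - arrivals[b]))
    return worst


# Hypothetical clock pins: arrival times in picoseconds, positions in millimeters.
arrivals_ps = {"p0": 100.0, "p1": 104.0, "p2": 130.0, "p3": 127.0}
positions_mm = {
    "p0": (0.0, 0.0),
    "p1": (1.0, 0.0),
    "p2": (10.0, 10.0),
    "p3": (10.0, 11.0),
}

print(clock_skew(arrivals_ps))                           # 30.0
print(local_clock_skew(arrivals_ps, positions_mm, 2.0))  # 4.0
```

In this hypothetical data set the total clock skew is 30 ps, while the local clock skew within a 2 mm neighborhood is only 4 ps, which shows that the two quantities can differ widely on the same chip.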
The nominal clock skew, if known early enough in the chip design process, can be taken into account in the chip circuit design, and does not necessarily increase cycle time. However, if this nominal skew changes significantly during the design process, it will usually cause an increase in cycle time. In addition, large nominal skew usually results in larger clock uncertainty, and clock uncertainty virtually always results in increased cycle time. Significant local clock skew is considered particularly detrimental, especially if it is largely due to random clock uncertainty. This is because large local clock skew is more likely to cause functional errors in the chip outputs even at low clock frequencies, rendering the chip worthless.
The main contributors to nominal clock skew are: localized capacitance loading or capacitance load density variations across the chip; wire length differences causing differences in wire signal transmission times; and known wire environmental variations due to wires near the clock wires causing differences in clock wire capacitance or inductance.
The main contributors to clock uncertainty skew are: errors or uncertainties in wire transmission-line simulation models; uncontrolled variations in transistor parameters; uncontrolled process variations in wire properties such as width, space to adjacent wires, wire thickness, and inter-level dielectric constants; unknown or incorrectly modeled wire environmental variations; power supply and temperature fluctuations; and capacitive or inductive noise coupling from other wires switching at similar times.
By way of overview, present techniques are based on tree-based networks, grid-based networks, or specific combinations of trees and grids. Tree-based networks control clock skew by relying on balanced routing methods that attempt to match wiring delays by equalizing wire lengths and loads, and sometimes by tuning wire widths to reduce nominal skew and/or uncertainty skew. This balanced routing can be difficult due to competition for chip wires in certain regions, especially when variable-width wires are used. This balanced routing must also change every time any clock pins move or pin capacitances change significantly due to design changes. This makes it difficult to complete other chip wiring jobs until the final clock distribution wiring is complete. Tree-based methods that match wire lengths by lengthening wires, or that match effective pin capacitances by adding capacitance to small pin capacitances, are relatively wasteful of wiring and power, since chip power increases with capacitance. Tree-based networks are susceptible to large local clock uncertainty, because a variation in any single wire property can cause significant local skew. It is also difficult to control wire transmission-line effects such as inductance and capacitance for complex tree networks, because it is difficult to route shields or transmission-line return paths adjacent to every clock wire in complicated balanced trees.
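The cost of balancing by wire lengthening can be illustrated with a minimal sketch. It uses the first-order Elmore delay of a uniform RC wire, R_wire * (C_wire / 2 + C_load), which is a common textbook approximation introduced here for illustration and not a model taken from the text. The per-millimeter resistance and capacitance, the branch lengths, and the pin load are all hypothetical values.

```python
# Hypothetical wire parameters (assumptions for illustration only).
R_OHM_PER_MM = 50.0   # wire resistance per millimeter
C_FF_PER_MM = 200.0   # wire capacitance per millimeter, in femtofarads


def elmore_wire_delay_ps(length_mm, load_ff):
    """First-order Elmore delay, in ps, of a uniform RC wire driving a pin load."""
    r_wire = R_OHM_PER_MM * length_mm
    c_wire = C_FF_PER_MM * length_mm
    # ohm * fF = 1e-15 s, so dividing by 1000 converts to picoseconds.
    return r_wire * (c_wire / 2.0 + load_ff) / 1000.0


# Two tree branches with equal pin loads but unequal lengths.
short_ps = elmore_wire_delay_ps(2.0, 100.0)
long_ps = elmore_wire_delay_ps(3.0, 100.0)
skew_before_ps = long_ps - short_ps

# Balance the tree by lengthening the short branch to match the long one.
padded_ps = elmore_wire_delay_ps(3.0, 100.0)
skew_after_ps = long_ps - padded_ps
extra_cap_ff = C_FF_PER_MM * (3.0 - 2.0)

print(f"skew before balancing: {skew_before_ps:.1f} ps")   # 30.0 ps
print(f"skew after balancing:  {skew_after_ps:.1f} ps")    # 0.0 ps
print(f"added wire capacitance: {extra_cap_ff:.0f} fF")    # 200 fF
```

Under these assumed values, the nominal skew between the two branches is removed, but only by adding 200 fF of wire capacitance that must be charged every cycle, which is the wiring and power penalty described above.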
Grid-based networks control clock skew by wiring an X-Y grid of clock wires. This grid is then driven at a small number of places by large buffers, often at the edges and/or centerline of the mesh. This results in very small local skew because the mesh is well connected within any local region. Unfortunately, this method uses a large number of wiring tracks and a large amount of wiring capacitance to achieve low skew. In addition, the nominal skew across the chip can be large, because the regions of the chip near the buffers receive the clock signal before the regions of the mesh far from the buffers. Also, simulation and tuning of grid-based networks are difficult because of the large number of interconnected wires and loads. A single accurate simulation requires a large amount of time and computer resources, making tuning slower and more difficult.
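The dependence of arrival time on distance from the buffers can be sketched with a simplified model. The example below treats a single grid wire as a uniform RC ladder driven at one end and computes the Elmore delay at each node. The one-dimensional ladder, the Elmore approximation, and the segment resistance and node capacitance values are all assumptions made for illustration; an actual mesh is two-dimensional and requires a far more detailed simulation.

```python
def ladder_elmore_delays_ps(n_segments, r_seg_ohm, c_node_ff):
    """Elmore delay, in ps, at each node of a uniform RC ladder driven at node 0.

    Node i is i segments away from the driving buffer. Each segment has
    resistance r_seg_ohm, and each node beyond the driver has capacitance c_node_ff.
    """
    delays = [0.0]
    for i in range(1, n_segments + 1):
        # Segment i charges every node from i through n_segments.
        downstream_ff = (n_segments - i + 1) * c_node_ff
        delays.append(delays[-1] + r_seg_ohm * downstream_ff)
    # ohm * fF = 1e-15 s, so dividing by 1000 converts to picoseconds.
    return [d / 1000.0 for d in delays]


# Hypothetical grid wire: 4 segments of 10 ohms each, 50 fF of load per node.
delays_ps = ladder_elmore_delays_ps(4, 10.0, 50.0)
print(delays_ps)                      # [0.0, 2.0, 3.5, 4.5, 5.0]
print(delays_ps[-1] - delays_ps[0])   # 5.0
```

In this sketch the delay grows monotonically with distance from the driven end, so the node farthest from the buffer sets the nominal skew along the wire, while adjacent nodes differ by much smaller amounts.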
FIG. 1A depicts an example of a network where a central re-powering buffer [112] receives the clock signal and distributes it using a treelike network [113] to lines of buffers at the four edges of the figure. These lines of buffers [110] drive all four edges of an X-Y wiring grid network [111]. This network reduces local skew significantly, but has significant skew because the center of the grid receives the clock signal later than the edges of the grid. There are also signal integrity problems, because the signal waveform at the center of the chip differs significantly from that at the edges, resulting in additional effective skew in the circuits receiving these differing waveforms. The large number of wide grid wires needed to reduce skew results in more wire capacitance and higher chip power.
FIG. 1B depicts an example of an X-Y grid tree with multiple levels of H-trees [140] and buffers to drive the grid [135] at more points to reduce the skew. However, to implement this network it is necessary to place the final buffers [130] at a large number of locations on the chip internal to the grid, which is inconvenient and often impractical due to large design blocks such as memory arrays and data-flow blocks that cannot easily accommodate internal placement of clock buffers. See, for example, U.S. Pat. No. 5,656,963, issued Aug. 12, 1997, entitled "Clock Distribution Network for Reducing Clock Skew," by Masleid et al., ["Masleid"], which uses symmetric H-trees driving a large number of buffers, which then drive an X-Y grid directly at a large number of locations. This requires a large number of buffer locations to minimize the skew that occurs on the grid between buffers, and it is often impossible to find places on the chip for these buffers due to blockages from large circuits such as memory blocks or data-flow circuits. Conversely, if the number of buffers driving the grid directly is reduced, the transmission time through the grid wires causes late clock arrival times at the points farthest from the buffers.
Accordingly, there is a need for a system and method for reducing the clock skew in a clock distribution network, including nominal skew and clock uncertainty skew (and including both local clock skew and total clock skew), without using large amounts of wire or large numbers of clock buffers. There is a need for the system and method to allow flexible buffer and wire placement around blockages, and simple wire routing with easily designed shields to provide capacitive shielding and transmission-line return paths. The placement of most wires and buffers needs to be determined early in the design process, but the network should be adaptable to provide low skew even when clock buffer locations, clock pin locations, and pin capacitances change significantly during the design process. The present invention addresses these needs.