In high speed supercomputers of the type produced by Cray Research, Inc., the Assignee of the present invention, a system clock signal is distributed throughout so as to control timing of events within the computer. Clock signals are typically distributed from a single distribution point to various destination points within the computer, which may be located some distance apart. For reasons which will be discussed later, the signals do not arrive at all destination points at exactly the same time. The difference in time between these arrivals is called skew. On slower computer systems, skew is usually a very small portion of the system clock period and is thus likely to be insignificant. However, on faster computers with faster system clocks, the same amount of skew may be a substantial portion of the clock period and may actually limit the speed at which the computer can operate. Additionally, in a physically larger computer system, the distances between destination points and the distribution point can vary dramatically, resulting in increased skew.
A typical path for a clock signal will include fanout gates, circuit board foil paths, integrated circuit (IC) interconnect metal, and wires. Each of these provides an opportunity for introducing undesired clock skew. The amount of time it takes a signal to travel along a wire or foil path is called its electrical length, and it depends upon its physical length and its capacitance. All else being equal, a signal will take longer to travel a long path than a short one. If the physical lengths of all the clock signal paths are not equal, skew is introduced.
Clock signal paths will often include several levels of fanout gates and buffering. Skew results if there are unequal numbers of gates in the signal paths or if there are variations in how long it takes a signal to pass through various gates. How long it takes a signal to pass through a gate depends upon several factors, including the propagation delay characteristics of the particular type of gate, the number of loads the gate is driving, and the temperature of the gate. Any variation of these factors between two signal paths will cause skew in the signals. Even if these factors are identical, there may be variations between individual gates of the same type.
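As a rough illustration of the preceding two paragraphs, the arrival time at a destination may be modeled as the sum of the wire or foil delays and the gate delays along its path, and skew as the spread between the earliest and latest arrivals. The following Python sketch was written for this illustration only. The path names and picosecond values are hypothetical and are not taken from any actual machine.

```python
# Hypothetical per-segment delays in picoseconds (illustrative values only).
paths = {
    "module_a": {"wire_ps": [400, 250], "gate_ps": [300, 310]},
    # module_b has a longer foil run and one extra fanout gate.
    "module_b": {"wire_ps": [400, 600], "gate_ps": [300, 295, 305]},
}

def arrival_ps(path):
    """Total delay from the distribution point to one destination."""
    return sum(path["wire_ps"]) + sum(path["gate_ps"])

def skew_ps(all_paths):
    """Skew is the difference between the latest and earliest arrival."""
    arrivals = [arrival_ps(p) for p in all_paths.values()]
    return max(arrivals) - min(arrivals)

print(arrival_ps(paths["module_a"]))  # 1260
print(arrival_ps(paths["module_b"]))  # 1900
print(skew_ps(paths))                 # 640
```

In this simplified model, both the unequal wire lengths and the unequal gate counts contribute to the 640 ps difference. Gate-to-gate variation appears as the slightly different values assigned to gates of the same nominal type.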
Crosstalk from adjacent signals can be another cause of clock skew. For example, if during a transition from one logic state to another, a signal's voltage level is altered by crosstalk, then the point in time when the signal is determined to have switched will be altered, thus introducing skew.
Another source of skew arises when logic levels are determined by reference to a power supply voltage. For example, if the logic levels are defined as voltages relative to ground, any noise on a logic gate's ground reference will affect when the gate determines an input signal to have switched.
There are several reasons for attempting to eliminate as much skew as possible. First, skew limits the speed at which a system can run. Within a computer, tasks are often performed serially, with data being passed from one stage of the computer to the next on successive clock cycles. The clock period must be long enough to account for the time it takes a stage to process the data and propagate it to the next stage. In addition, the clock period must allow for any skew between the clock signals at the various stages. For example, if one stage is clocked late due to clock skew but the next stage is clocked on time, the data from the first stage may not yet be present when the second stage is clocked. The clock period thus must be stretched to accommodate not only the time needed for the first stage to process and propagate the data, but also the amount of skew between the clock signals present at the two stages. The clock period can therefore be shortened by the amount of any skew that can be eliminated.
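The relationship described above can be sketched numerically. The following Python fragment is a simplified model written for this illustration: it treats the minimum clock period as the stage's process-and-propagate time plus the skew between the two stages. The function names and the picosecond figures are hypothetical.

```python
def min_clock_period_ps(stage_delay_ps, skew_ps):
    """Shortest usable period: time to process and propagate, plus skew."""
    return stage_delay_ps + skew_ps

def max_frequency_mhz(period_ps):
    """Convert a clock period in picoseconds to a frequency in megahertz."""
    return 1_000_000 / period_ps

# Hypothetical stage needing 3500 ps, with 500 ps of skew between stages.
with_skew = min_clock_period_ps(3500, 500)
without_skew = min_clock_period_ps(3500, 0)

print(with_skew)                     # 4000
print(without_skew)                  # 3500
print(max_frequency_mhz(with_skew))  # 250.0
```

With these assumed figures, eliminating the 500 ps of skew shortens the minimum period from 4000 ps to 3500 ps, which is the full amount of the skew removed.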
Similarly, clock skew may prevent a system from being slowed down. It is often desirable to slow down a system clock for diagnostic purposes, but if slowed down too much, the system may no longer function. For example, if one stage is clocked on time but the next stage is clocked late due to clock skew, the data from the first stage may no longer be present when the second stage is clocked.
Supercomputers are designed modularly, with circuitry placed on various removable circuit boards or modules. The presence of clock skew in the system may reduce the ability of a board or module to be interchanged from one machine to another. Since the amount of time it takes a signal to propagate through a particular type of logic gate varies from gate to gate, the amount of clock skew seen on a particular module may be different from that seen on other modules. A system designed to accommodate skew present on typical modules may not work with all modules, especially those where the amount of skew differs substantially from that of the typical module. The result is that some modules may not function in all machines.
There are several techniques used to attempt to reduce clock skew. The designer can attempt to equalize the wire and foil trace lengths between the clock source and all destinations. This is often accomplished by distributing the clock signals radially from a distribution point physically located near the center of the machine. The designer can also equalize the number of gates and types of gates in all clock signal paths. Clock skew can also be reduced by equalizing the amount of load that gates and various signal paths must drive. Since these techniques affect the fundamental architecture of the system, they can only be performed during the design of the system.
There are also delay introduction techniques that can be performed during the manufacture or installation of the system. What is important is the difference in delay between the various signal paths, not the actual amount of delay in any given path. Thus, skew between signals can be compensated for by introducing a specific amount of delay in the faster signal paths so as to match the electrical length of the slowest signal path. One example of such a technique is delay line tuning, as disclosed in U.S. Pat. No. 4,165,490. Delay line tuning involves connecting a clock signal path through a delay line which provides multiple outputs, each corresponding to a different delay amount. The output corresponding to the needed delay is selected at the time of installation by reference to other clock signal paths.
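The selection step in delay line tuning can be sketched as follows. This Python fragment is only an illustration of the general idea and is not the procedure of the cited patent: it computes, for each path, the delay needed to match the slowest path, and then picks the delay line output closest to that amount. The path names, path delays, and tap values are all hypothetical.

```python
def compensation_ps(path_delays_ps):
    """Delay to add to each path so that it matches the slowest path."""
    slowest = max(path_delays_ps.values())
    return {name: slowest - d for name, d in path_delays_ps.items()}

def select_tap(needed_ps, tap_delays_ps):
    """Index of the delay line output closest to the needed delay."""
    return min(range(len(tap_delays_ps)),
               key=lambda i: abs(tap_delays_ps[i] - needed_ps))

# Hypothetical measured path delays and delay line outputs, in picoseconds.
path_delays = {"a": 1260, "b": 1900, "c": 1500}
taps = [0, 250, 500, 750, 1000]

needed = compensation_ps(path_delays)
chosen = {name: select_tap(n, taps) for name, n in needed.items()}

print(needed)  # {'a': 640, 'b': 0, 'c': 400}
print(chosen)  # {'a': 3, 'b': 0, 'c': 2}
```

Only the differences between paths enter the calculation, so the slowest path receives no added delay. The selected outputs (750 ps for path a, 500 ps for path c) do not match the needed delays exactly; the remaining error depends on the spacing of the available outputs.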
Another example of a delay introduction technique is foil path select tuning. With this technique, several alternate foil paths are provided on a circuit board, each tuned to a different electrical length and thus a different delay. The foil path corresponding to the delay needed to skew compensate the signal is selected by applying either a solder bridge or wire so as to connect the foil into the signal path.
The effectiveness and practicality of these methods vary. Equalizing trace lengths, numbers of gates, and loading must be done during the initial design phase, and thus cannot account for later design changes or component variations. Equalizing the number of gates in the path and the gate loading may not be possible in all circumstances due to other design constraints of the circuit. Additionally, a previously equalized circuit may require the addition of new circuitry not conceived of during the initial design phase, and paths that were equalized may no longer be equalized once that circuitry is added.
Delay line tuning is adjustable, and so it can skew compensate a circuit even after the addition of new circuitry. However, delay lines typically have resolution in the nanosecond range. For high speed supercomputers, much more precision than this is necessary.
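The effect of resolution can be illustrated with a short calculation. The Python sketch below assumes, purely for illustration, a delay element whose available delays are whole multiples of a fixed step, and reports how much skew remains after the nearest available delay is chosen. The step sizes and the needed delay are hypothetical.

```python
def residual_ps(needed_ps, step_ps):
    """Skew left over after rounding the needed delay to the nearest step."""
    nearest = ((needed_ps + step_ps // 2) // step_ps) * step_ps
    return abs(needed_ps - nearest)

# Hypothetical path needing 640 ps of compensation.
print(residual_ps(640, 1000))  # 360 (nanosecond-range step)
print(residual_ps(640, 100))   # 40  (sub-nanosecond step)
```

Under this assumption the residual error can be as large as half the step, so a step in the nanosecond range can leave an error of hundreds of picoseconds.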
Foil path select tuning provides the sub-nanosecond resolution required; however, the tuning procedure involves soldering, tab welding, or some other method of making the required electrical and mechanical connection. These techniques are not easily automated, and repeated adjustment risks damaging the circuit board, either through the handling required or through the lifting of foils during the operation. In addition, foil path select tuning cannot be implemented in a single IC package due to the need to make a mechanical connection.
It is clear that there has existed a long and unfilled need in the prior art for a clock de-skewing technique capable of reducing skew to the sub-nanosecond range. The present invention solves these and other shortcomings of the techniques known in the prior art.