1. Field of the Invention
The invention involves clock signal distribution in digital systems, and more particularly, a method and apparatus for accurately measuring clock skew such that proper tuning apparatus may then be inserted.
2. Description of Related Art
Typical data processing apparatus comprises a plurality of latch points connected together by data paths, control signal paths, and combinatorial logic. As used herein, the terms "data path" and "control signal path" refer to those conductors and other apparatus which couple the data output of each source latch point to the data input of a destination latch point. Data paths and control signal paths may pass through combinatorial logic which may modify the signal on the data path or control signal path, often in response to signals on a different data path which also passes through the same combinatorial logic, but by definition cannot pass through another latch point. Such a latch point defines the end of one data or control signal path and the beginning of the next. Also, because of the equivalence between control and data signals, the two terms are used interchangeably herein.
A clock signal generated at some common point in a system is typically distributed to the latch points throughout the system, causing each latch point to perform some operation, for example, latching input data to outputs. Desirably, the operation occurs at all latch points at the same time. In large-scale systems operating at extremely high clock frequencies, special care must be taken to ensure that the clock signal arrives at all of the latch points at exactly the same time, or at least within some very tight tolerance. Otherwise, two different problems will limit the maximum clock frequency at which the system can be safely operated.
The first problem, known as the long path problem, arises on the longest data paths in the system. Assume that in a particular data path from a source latch point to a destination latch point, the data path delay is equal to t_d and the clock skew between the source and destination latch points has a value of t_s. If the data path is very long, and if the clock frequency f is too high, then it may be that t_d > 1/f + t_s. In other words, if the source latch clocks in new data on a first clock pulse, that new data (possibly as modified by combinatorial circuitry in the data path) may not reach the destination latch point before the next clock pulse reaches the destination latch point. Thus the destination latch may clock in old data. To avoid this, therefore, the clock frequency must be slow enough to accommodate the longest data paths and largest clock skews in the system.
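By way of illustration only, the long path condition can be sketched in Python as follows. The function names and the numerical values are hypothetical and form no part of the apparatus described; the sketch also assumes that all quantities are expressed in seconds and hertz and that the skew carries the same sign as in the inequality above.

```python
def long_path_violation(t_d, t_s, f):
    """True if the data path delay exceeds one clock period plus the skew."""
    return t_d > 1.0 / f + t_s


def max_safe_frequency(t_d, t_s):
    """Highest clock frequency for which t_d <= 1/f + t_s still holds."""
    if t_d <= t_s:
        # The skew alone already covers the path delay, so this condition
        # places no upper bound on the clock frequency.
        return float("inf")
    return 1.0 / (t_d - t_s)


# Hypothetical values: 12 ns path delay, 1 ns skew, 100 MHz clock.
violated = long_path_violation(12e-9, 1e-9, 100e6)
f_max = max_safe_frequency(12e-9, 1e-9)
```

In such a sketch, the limit for a whole system would be the smallest value of max_safe_frequency taken over every source/destination pair.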
The second problem, known as the short path problem, arises when the data path delay from a source latch point to a destination latch point is shorter than the clock skew t_s between the two latch points. Assume that the latch points consist of master-slave flip flops, in which data at the input is locked internally in the latch on the leading edge of the clock pulse and transmitted to the output of the latch on the trailing edge of the clock pulse. Assume further that prior to a particular clock pulse, the source latch has a first logic value on its output and a second logic value on its input. Finally assume that the first logic value has been present on the output of the source latch for a long enough period of time such that it (itself or as modified by combinatorial logic) is also present at the input of a destination latch point.
When the leading edge of a clock pulse arrives at the source latch, the second logic value is locked internally in the latch. The second logic value is then transferred to the output of the source latch on the trailing edge of the clock pulse. Ideally, where clock skew between the two latches is minimal, the destination latch will lock in the first value when the leading edge of the clock pulse arrives there. If the clock skew between the source and destination latch points is large enough, however, it is possible that the leading edge of the clock pulse will not arrive at the destination latch until after the second logic level arrives from the source latch. In this case the destination latch will erroneously lock in the second logic level instead of the first.
The short path problem is even more acute if the latch points are of the type which open for data-flow therethrough during the period between the leading and trailing edges of a clock pulse and lock the output data only on the trailing edge of the clock pulse. In this situation, the destination latch point will lock in the wrong data whenever the new data, which appears on the output of the source latch point in response to the leading edge of a clock pulse, arrives at the input of the destination latch point at any time prior to the time when the trailing edge of the same clock pulse arrives at the destination latch point. The problem in this situation is alleviated by minimizing the width of each clock pulse, but it can never be improved beyond the still-problematic situation of master/slave flip flops described above. Again, therefore, the largest clock skew between source and destination latch points places a limit on the maximum safe clock frequency of the system.
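The two short path cases may be illustrated by the following Python sketch. It is a simplified model resting on stated assumptions: the master-slave case is reduced to the comparison of path delay against skew given above, the transparent-latch case widens the vulnerable window by the clock pulse width, and the function name, parameter names and values are hypothetical.

```python
def short_path_violation(t_d, t_s, pulse_width=0.0, transparent=False):
    """True if new data can reach the destination before it locks old data.

    t_d          data path delay from source to destination latch point
    t_s          clock skew (destination clock arrives this much later)
    pulse_width  clock pulse width, used only for transparent latches
    transparent  True for latches that pass data while the clock is high
    """
    # A transparent latch stays open until the trailing edge, so the
    # vulnerable window grows by the pulse width.
    window = t_s + (pulse_width if transparent else 0.0)
    return t_d < window


# Hypothetical values: 3 ns skew, 2 ns clock pulse width.
ms_short = short_path_violation(2e-9, 3e-9)
ms_ok = short_path_violation(4e-9, 3e-9)
latch_short = short_path_violation(4e-9, 3e-9, pulse_width=2e-9, transparent=True)
```

Under this model, narrowing the pulse width shrinks the window for transparent latches only down to the skew itself, which corresponds to the master-slave limit noted above.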
Clock skew is typically minimized through the two-step process of first measuring it and then inserting appropriate delays in the shorter clock distribution paths. These steps are typically performed at the board level, from a clock source point on the board to the clock input of each integrated circuit on the board, and also on a system level, from some system clock source point to the clock source points of each board in the system. Clock tuning is usually not needed between different latches fabricated on the same integrated circuit chip. In large-scale, high-performance computer systems, clock tuning must typically be performed for each individual system before it leaves the factory. The present invention relates to the measurement step in this process.
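The second step of this process, inserting delays in the shorter clock distribution paths, can be sketched as follows once the path delays have been measured. This is a minimal Python illustration under stated assumptions: the chip names, the delay values, and the use of integer picoseconds are hypothetical, and the sketch simply pads every path to match the longest one.

```python
def tuning_delays(path_delays_ps):
    """Delay to insert in each clock path so that all paths match the longest.

    path_delays_ps maps a destination name to its measured clock path
    delay in picoseconds (integers keep the arithmetic exact).
    """
    longest = max(path_delays_ps.values())
    return {name: longest - delay for name, delay in path_delays_ps.items()}


# Hypothetical measured delays from a clock source point to three chips.
measured = {"chip_a": 1200, "chip_b": 1500, "chip_c": 1350}
inserted = tuning_delays(measured)
```

The longest path receives no added delay, and after tuning every path exhibits the same total delay, so the residual skew is limited by the accuracy of the measurement step.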
In the past, clock skew measurement has been performed using, illustratively, the apparatus shown in FIG. 1. Shown in FIG. 1 is a printed circuit board 10 which forms part of a larger data processing system. It includes thereon a clock source point 20 connected to the input of a distribution tree 22 having multiple outputs. The clock source point 20 is the point on the board 10 to which the master system clock will eventually be connected. Each of the outputs of distribution tree 22 is connected to a clock input of a different integrated circuit chip on the board. Two of the chips are shown, namely chips 24 and 26. Chip 24 has fabricated thereon its own distribution tree 28 having its input connected to one of the outputs of distribution tree 22, and having a plurality of outputs connected to the clock inputs of various latches on the chip 24. For example, one of the outputs of distribution tree 28 is connected to the clock input of a latch 40. Similarly, chip 26 has fabricated thereon a distribution tree 42 having its input connected to one of the outputs of distribution tree 22 and further having a plurality of outputs. One of the outputs of distribution tree 42 is illustratively connected to the clock input of a latch 44 on the chip 26. Further, in order to permit clock tuning, a probe point is provided for each chip on the board to which the clock signal is transmitted. These probe points are each connected via buffering circuitry to an otherwise unused output of the clock distribution tree on the corresponding chip. In FIG. 1, probe point 46 is connected to an output of distribution tree 28 via buffering circuitry 47, and probe point 48 is connected to an output of distribution tree 42 via buffering circuitry 49.
When it is desired to measure the clock skew between the two chips 24 and 26 (assumed to be the same as the clock skew between the latches 40 and 44), a variable frequency oscillator 30 is first coupled to the clock source point 20. A frequency counter 32 is also coupled to the output of the oscillator 30. A two-channel oscilloscope 34, having delayed triggering based on the first channel, then has its first channel input coupled to probe point 46 on chip 24 and its second channel input coupled to probe point 48 on chip 26. One of the chips, say chip 24, is designated a reference chip and the other, chip 26, is designated the subject chip. For the purposes of this description, the reference chip 24 is assumed to be closer to the clock source point 20 than is the subject chip 26.
The traces that will be seen on the oscilloscope will be substantially like those shown in FIG. 2a. The solid line in FIG. 2a is the trace due to the signal at the probe point 46, and the dotted line is the trace due to the signal at probe point 48. The time difference between them is equal to the sum of (a) the difference between the clock signal path delay from the clock source point 20 to the reference probe point 46 and the path delay from the clock source point 20 to the subject probe point 48, and (b) the known and fixed difference between the time for the signal at reference probe point 46 to reach the oscilloscope and the time for the signal at subject probe point 48 to reach the oscilloscope.
In order to measure the time delay between the two traces, the frequency of the oscillator 30 is adjusted until the two traces are exactly 180 degrees out of phase, as shown in FIG. 2b. At this clock frequency, the delay between the traces is exactly equal to half the period of the clock signal. Since that period is exactly one divided by the frequency shown on the frequency counter 32, the difference between the (source-point/reference-probe-point/oscilloscope) path delay and the (source-point/subject-probe-point/oscilloscope) path delay can be calculated. From this the difference between the (source-point/reference-latch-probe-point) path delay and the (source-point/subject-latch-probe-point) path delay can be calculated as well. The latter difference is taken as the clock skew between all of the latch points on reference chip 24 and all of the latch points on subject chip 26.
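The arithmetic of this prior method can be illustrated by the following Python sketch. It assumes that the traces are separated by exactly one half period (rather than an odd multiple of it), and that the known, fixed differences in the probe and oscilloscope paths are lumped into a single offset carrying the same sign as the measured delay. The function name, the parameter names and the numerical values are hypothetical.

```python
def skew_from_antiphase(f_hz, probe_offset_s=0.0):
    """Clock skew implied by the frequency at which the traces are in antiphase.

    f_hz            reading of the frequency counter, in hertz
    probe_offset_s  known fixed delay difference of the probe paths, in seconds
    """
    half_period = 1.0 / (2.0 * f_hz)     # delay between the two traces
    return half_period - probe_offset_s  # remove the fixed probe-path term


# Hypothetical reading: traces in antiphase at 250 MHz, 0.5 ns probe offset.
skew = skew_from_antiphase(250e6, probe_offset_s=0.5e-9)
```

Because the half period depends on the oscillator frequency, the frequency at which the reading is taken is dictated by the skew itself.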
The above method of clock tuning works well, but does have limitations. First, the frequency at which the measurement is made depends on the actual value of clock skew; there is no easy way to make the measurements at the frequency at which the clock signal will actually be transmitted through the system. Since signal path delays are often frequency-dependent, the measurement obtained by the above method will be off by some amount because it was not taken at the ultimate operating frequency.
Second, it is very time-consuming to manually tune the clock for each subject chip. The process may be automated, but the time required to adjust the frequency of the oscillator 30 and allow the frequency counter 32 to settle would still be comparatively long.
A third problem with the above method arises because of the need to probe signals on the board. For one thing, direct probing of such signals could potentially change them and cause erroneous results, and for another, the high density of modern boards makes it difficult to find space for sufficient numbers of probe points. A known solution is to provide the board with high input impedance selection logic to route all the clock signals to a small number of probe points in response to an external selection signal, but such selection logic itself introduces errors. Moreover, as circuit density increases and more and more points need to be tuned, the selection logic becomes more complex and worsens these errors.
It is therefore an object of the present invention to overcome some or all of the above problems.
It is another object of the present invention to provide a method and apparatus for measuring clock skew between two or more latch points in a system.
It is another object of the present invention to provide a method and apparatus for measuring clock skew between two or more latch points at different levels in a clock distribution tree in a system.
It is another object of the present invention to provide a method and apparatus for measuring clock skew between two or more latch points in a system on separate early, normal and late clock distribution trees.