The invention relates to clock distribution schemes for supercomputers and future-generation microprocessors. Presently, supercomputers predominantly employ an asynchronous communication scheme, which does not require a distributed clock even when scaled to hundreds of nodes.
There are two conventional clock distribution schemes in the prior art used to distribute a clock from a central clock driver to distant nodes. The central clock signal is distributed to the nodes using (1) a stubbed network as shown in FIG. 1a, or (2) a star network as shown in FIG. 1b. In the stubbed network, a clock driver 11 feeds a clock signal to the farthest node 13 in the network, and the intermediate nodes 15 are connected in between by taps. The star network has one or more central clock drivers 21, which feed a clock signal to each node 23 in the system over a separate net. A combination of the two schemes is also possible.
A stubbed network is easy to implement, but the quality of the clock signal distributed to the nodes is poor, resulting in clock jitter and thereby reducing performance. Since a single clock driver feeds the clock signal to a large number of nodes, the clock driver must also be very large. A star network, on the other hand, may not degrade the clock signal as much, but it is difficult to lay out. In this scheme as well, the clock drivers become large, which limits the number of nodes that can be connected in this fashion.
Both prior art networks exhibit limited scalability. As the number of nodes increases, the clock signal must be cascaded with frequent buffering, which shrinks the clock pulse. If the signal is buffered too often, the clock pulse disappears altogether. For a given clock frequency, there is a limit on the number of buffering stages that can be used before the clock pulse disappears. On the other hand, if the clock signal is not buffered often enough and the number of nodes is large, the clock signal degrades, which causes jitter and limits performance.
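The buffering limit described above can be illustrated with a rough numerical sketch. The model is an assumption made only for illustration: it supposes a 50% duty cycle and a fixed amount of pulse shrinkage per buffer stage, neither of which is stated in the description, and the function names and the figure of 50 ps per stage are hypothetical.

```python
# Illustrative model only: assumes each buffer stage narrows the clock
# pulse by a constant amount. Real buffers need not behave linearly.
# All times are integer picoseconds to avoid floating-point rounding.

def half_period_ps(freq_mhz: int) -> int:
    """Initial pulse width for an assumed 50% duty-cycle clock."""
    if freq_mhz <= 0:
        raise ValueError("frequency must be positive")
    # Period in ps is 1_000_000 / freq_mhz; the pulse is half of that.
    return 500_000 // freq_mhz

def pulse_width_after(stages: int, initial_ps: int, shrink_ps: int) -> int:
    """Pulse width remaining after the given number of buffer stages."""
    return max(initial_ps - stages * shrink_ps, 0)

def max_buffer_stages(initial_ps: int, shrink_ps: int, min_width_ps: int = 0) -> int:
    """Largest stage count that keeps the pulse wider than min_width_ps."""
    if shrink_ps <= 0:
        raise ValueError("shrink per stage must be positive")
    if initial_ps <= min_width_ps:
        return 0
    return (initial_ps - min_width_ps - 1) // shrink_ps

if __name__ == "__main__":
    SHRINK_PS = 50  # hypothetical shrinkage per stage
    for freq in (100, 200):
        width = half_period_ps(freq)
        print(freq, "MHz:", max_buffer_stages(width, SHRINK_PS), "stages")
```

Under these assumed numbers, doubling the clock frequency roughly halves the number of buffer stages available before the pulse vanishes, which is the frequency dependence noted above.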
For these reasons, prior art clock distribution schemes impose a practical limit on the number of nodes in a network. As it becomes desirable to build supercomputers with thousands of nodes, the prior art techniques for clock distribution are inadequate.
To obtain improved performance, the next generation of supercomputers needs to use a synchronous communication scheme. Therefore, a practical and reliable clock distribution scheme is essential. The next generation of supercomputers will also scale to thousands of nodes, and the prior art clock distribution schemes are not adequate for this purpose.