1. Field of the Invention
The present invention relates to a low skew clocking system for VLSI integrated circuits, and more particularly, to a clocking network in which one chip has a local clock generator circuit which generates the synchronization signals for other chips on a common PC board.
2. Description of the Prior Art
In the modern competitive environment for data processing systems, the processing speed and hence performance doubles every year or so. Hence, engineers are constantly searching for new ways to improve the processing speeds of their systems in order to remain competitive. A typical way to improve processing speed is to increase the clocking frequencies of the systems and sub-systems. Increasing the clock frequency can improve performance nearly linearly for typical data processing systems by reducing the cycle time. However, data processing systems can only function as rapidly as their hardware permits, and as a result, there are limits as to how much the clocking frequencies may be increased. In addition, as cycle time is shortened for a given hardware configuration, less skew in the clocking signals may be tolerated. In other words, IC fabrication variances cause propagation delay variations and rise time/fall time variations, and the resulting time delay or offset between interacting signals synchronized to the system clock must be reduced to achieve higher performance.
Integrated circuit chips synchronized to a system clock of the data processing system generally have different propagation delays due to inherent variations in chip fabrication which may cause increased clocking signal skew during the operation of the system. As a result, the maximum clocking signal frequencies for a given data processing system are limited by the differences in chip speeds for chips driven by the clocking signal. Clocking signal skew is particularly troublesome when a very fast chip uses the same clocking signal as a very slow chip. Such skew is made worse by the typical processor clock variations of the clocking signal generator. The maximum frequency of the clocking signals is further limited in that the worst case tolerances for setup and hold times of the integrated circuits responsive to the clock signal must be respected.
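The way in which skew and setup and hold tolerances bound the clock frequency may be illustrated by the following minimal sketch. It assumes the conventional register-to-register timing model, which is not spelled out in this description; the function names, parameter names and delay values are hypothetical and chosen only for illustration.

```python
# Sketch only: conventional setup/hold timing model, assumed for illustration.
# All delays are in nanoseconds; names and values are hypothetical.

def min_cycle_time_ns(clk_to_q_max, logic_max, setup, skew):
    """Smallest clock period that still satisfies the setup constraint.

    The slowest data path (clock-to-output plus logic delay) must settle
    one setup time before the capturing clock edge, which may arrive
    early by as much as the clock skew.
    """
    return clk_to_q_max + logic_max + setup + skew


def hold_ok(clk_to_q_min, logic_min, hold, skew):
    """True if the fastest data path cannot violate the hold constraint.

    The fastest path must not change the captured input until one hold
    time after the capturing edge, which may arrive late by the skew.
    """
    return clk_to_q_min + logic_min >= hold + skew


def max_frequency_mhz(period_ns):
    """Convert a clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / period_ns


if __name__ == "__main__":
    period = min_cycle_time_ns(clk_to_q_max=2.0, logic_max=5.0, setup=1.0, skew=2.0)
    print(period, max_frequency_mhz(period))
    print(hold_ok(clk_to_q_min=1.0, logic_min=0.5, hold=0.5, skew=0.5))
```

Under this model every nanosecond of skew is added directly to the minimum cycle time, which is why reducing skew raises the attainable clock frequency for a given hardware configuration.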
In typical data processing systems, the different integrated circuits are synchronized to the system clock. Generally, the system clock is a single synchronizing signal that is driven to all chips in the data processing system simultaneously. However, such a single synchronizing signal has a large clock skew since any chip synchronized to the synchronizing signal can be the fastest or slowest chip in the system. For example, a central processing unit synchronized with a cache controller could be faster than the cache controller or vice versa. In such systems, the clock skew may be approximated by 2*(max-min), where max and min are the maximum and minimum clock generator delays on the chips. Those skilled in the art have attempted to minimize this clock skew by minimizing the difference between the maximum and the minimum clock generator delays on the chips. However, for a large data processing system in which many different chips are driven from the same synchronizing signal, such an approach is impractical. Moreover, such systems must be carefully designed to avoid race conditions, which occur when the single synchronizing signal is delayed in one of its paths such that a signal driven from one chip to another in the critical path arrives after its synchronization signal, although the circuit was designed to receive that signal before the synchronization signal. Such an occurrence cannot be tolerated by the system, and accordingly, the cycle time of the system clocking signal is generally made great enough to prevent such race conditions from occurring. Of course, such an extension of the cycle time adversely affects processing speed and performance.
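The approximation 2*(max-min) given above may be evaluated as in the following short sketch. The function name and the sample delay values are hypothetical and serve only to show the arithmetic.

```python
# Sketch only: evaluates the 2*(max-min) skew approximation stated above.
# The function name and the sample delays are hypothetical.

def approx_clock_skew(generator_delays):
    """Approximate clock skew for chips sharing one synchronizing signal.

    generator_delays holds the on-chip clock generator delay of each
    chip, all in the same time unit (for example, nanoseconds).
    """
    if not generator_delays:
        raise ValueError("at least one clock generator delay is required")
    return 2 * (max(generator_delays) - min(generator_delays))


if __name__ == "__main__":
    # Three chips whose clock generator delays are 3.0, 4.5 and 5.0 ns.
    print(approx_clock_skew([3.0, 4.5, 5.0]))
```

Because only the extreme delays enter the expression, adding one unusually fast or unusually slow chip to the system increases the skew for every chip driven by the same synchronizing signal.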
Other clocking techniques have been proposed to prevent the aforementioned race conditions without extending the clocking cycle. For example, quadrature clocking signals, which are dual edged clocking signals delayed by 90°, 180°, 270° and 360° with respect to each other, have been used. When quadrature clocking signals are used, a rising edge for synchronization purposes is received every quarter cycle. By so minimizing the non-overlap time between the clocking pulses, races are prevented. However, such clocking signals have heretofore been useful only for race management and have not provided for measurable performance enhancements, for even when quadrature clocking signals are used, the aforementioned processing speed limitations are still present.
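The quarter-cycle spacing of rising edges under quadrature clocking may be seen from the following minimal sketch. It assumes four ideal clock signals of equal period whose rising edges are delayed by 90, 180, 270 and 360 degrees; the function name, the 8 ns period and the idealized waveforms are assumptions made only for illustration.

```python
# Sketch only: idealized quadrature clocks, assumed for illustration.
# Each phase is a clock of the same period whose rising edge is delayed
# by the given number of degrees; names and values are hypothetical.

PHASE_DELAYS_DEG = (90, 180, 270, 360)


def quadrature_rising_edges(period, n_cycles):
    """Return the sorted rising-edge times of all four phases."""
    edges = []
    for cycle in range(n_cycles):
        for degrees in PHASE_DELAYS_DEG:
            # A delay of `degrees` corresponds to degrees/360 of a period.
            edges.append(cycle * period + period * degrees / 360)
    return sorted(edges)


if __name__ == "__main__":
    edges = quadrature_rising_edges(period=8.0, n_cycles=2)
    spacings = [later - earlier for earlier, later in zip(edges, edges[1:])]
    print(edges)
    print(spacings)  # every spacing equals one quarter of the period
```

Taken together, the four phases supply a rising edge every quarter period, although the period itself, and hence the processing speed, is unchanged.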
For the above reasons, clocking circuits employed on high-performance VLSI processors and similar applications have had tightly controlled delay specifications in order to allow high-frequency synchronous communication between different integrated circuits in the system without violating setup and hold constraints. This control has been accomplished by using phase locked loop techniques to align the clock signal edges to a reference edge, or by designing an absolute delay which is small enough that the variation within the system is within acceptable limits. The former technique is very sensitive to the effects of noise and processing variations on the essentially analog control circuitry, while the latter technique is much more straightforward but produces lower performance systems. It has been recognized that by minimizing the delays in the system, as by placing components adjacent to one another so as to minimize PC board trace length, less clocking skew results. However, such systems are still subject to limiting paths in which the propagation delay is substantial. Moreover, by running the integrated circuits of such systems from a single system clock, circuit propagation delay variations and the like still adversely affect clock skew performance limits with the resulting adverse effect on system performance.
Accordingly, prior art data processing systems are subject to hardware limitations as to the maximum frequency at which the system may be driven and hence the maximum performance attainable from the system. However, substantial improvements in processing speed may still be made by taking into account these hardware limitations of the system. It is thus desirable to develop a clocking system which optimizes performance for a given hardware configuration by minimizing skew. The present invention has been designed to meet this need.