Computer systems are often advertised according to various characteristics of the processor, particularly the internal clock frequency of the processor. Typically, the processor clock has a frequency that is an integer multiple of the bus clock frequency. Although a processor is usually capable of performing internal operations at the advertised fast clock speeds, in many cases the processor clock is too fast for the bus and peripheral devices. Therefore, the processor communicates with the peripheral devices only at the slower speed of the interface bus clock. Even in a system-on-chip (SOC) device, the processor is limited to the slower clock frequency during data transfers.
From a timing perspective, the difference between frequencies can cause problems if the processor is not informed of the timing characteristics of the slower bus clock. Assume, for example, that a processor clock is running at four times the frequency of a bus clock. In this case, the processor would be capable of transferring data at any one of the four active edges of the processor clock that occur during one cycle of the bus clock. However, the bus would expect to begin communications when the processor clock and bus clock are synchronized, or, in other words, when their active or leading edges are aligned. To utilize the entire bus clock cycle, the processor should begin data transmissions at the start of the bus clock cycle. If this relationship between clock cycles is not taken into account, then timing issues for other peripheral devices communicating on the bus can arise, thereby slowing down the operation of the computer system. Therefore, it is desirable for the processor to transfer information in synchronization with the leading edge of the bus clock.
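The edge relationship in the four-to-one example can be illustrated with a minimal sketch. It assumes, for illustration only, that both clocks are aligned at processor edge 0 and that the processor frequency is an exact integer multiple of the bus frequency. The function name `aligned_edges` and its parameters are hypothetical and do not appear in the system described here.

```python
def aligned_edges(ratio, num_processor_edges):
    """Return the processor-clock edge indices that coincide with a bus-clock leading edge.

    Assumptions (illustrative only): both clocks are aligned at edge 0 and
    the processor frequency is `ratio` times the bus frequency.
    """
    if ratio < 1:
        raise ValueError("ratio must be a positive integer")
    # Every `ratio`-th processor edge lines up with a bus-clock leading edge.
    return [edge for edge in range(num_processor_edges) if edge % ratio == 0]


# With a ratio of 4, only one processor edge in four starts a bus cycle.
print(aligned_edges(4, 12))  # [0, 4, 8]
```

Of the twelve processor edges in this example, only three are suitable starting points for a bus transfer, which is why the processor needs to know where the leading edge of the bus clock falls.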
To handle the timing of the processor clock with respect to the bus clock, the processor must know the location of the leading edge of the bus clock in order to synchronize to this edge. One conventional solution has been to determine the ratio between the bus clock and the processor clock during power up. Then, this ratio is maintained during operation from that point forward. A problem with this methodology is that the computer system is confined to this single clock ratio. The clocks cannot be adjusted dynamically as necessary in order to reduce power or enhance performance.
Another solution for synchronizing a processor clock to a bus clock has been to provide a phase-locked loop (PLL) device in the processor to constantly resynchronize the processor clock to the bus clock. The PLL device receives a low frequency signal, which is used for the bus clock. From the low frequency signal, the PLL device generates a higher frequency signal, which is used for the processor clock. The downside of this approach is that a PLL circuit is difficult to design effectively in this configuration. Also, PLL devices are expensive and take up a relatively large area on the silicon chip.
A third solution to locating the edge of the bus clock has been to provide a centralized clock control circuit. FIG. 1 is a block diagram of a clocking system 10, such as one that may be configured on a system-on-chip (SOC) device. The clocking system 10 includes a centralized clock control circuit 12 for generating a processor clock intended for a processor 14 and a bus clock intended for peripheral devices 16. The bus clock is supplied along path 18 and the processor clock is supplied along path 20. The centralized clock control circuit 12 also provides a control signal along path 22. The control signal is configured to indicate which clock edge of the processor clock is associated with the next rising edge of the bus clock.
In reality, the bus clock and processor clock are distributed to thousands of destinations. Since it would be impractical to design a single driver to drive this large number of elements, an industry standard clock tree insertion tool is used to create a clock tree 24. The clock tree 24 includes several branches, and smaller branches branching from the larger branches, and so on, branching out to thousands of flip-flops (not shown) or other sequential elements having clock inputs driven by the clock signals. Each branch includes one or more buffers for properly driving the clock signals to the flip-flops, or “leaves” of the clock tree.
The buffers, however, inherently cause a delay from the centralized clock control circuit 12 to the flip-flops. This delay is referred to as the “insertion time”. Therefore, the clock tree 24 is also designed to balance the delays of the bus clock and processor clock from the centralized clock control circuit 12 to the destination devices. The bus clock supplied along path 18 reaches the leaf level of the clock tree 24 along path 26, which carries the insertion-delayed bus clock signal. Likewise, the insertion-delayed processor clock signal is carried along path 28.
The problem with this technique, however, is that the clock tree insertion tool for inserting the branches and buffers typically cannot create a similar delay structure for the control signal along path 22. If the phase of the control signal is skewed with respect to the processor clock and bus clock at the leaves of the clock tree, then the control signal will not properly indicate the start of the bus clock cycle as intended. Typically, the control signal experiences fewer delays than the clocks. Thus, after running the insertion tool, a chip designer must manually insert delay elements in the layout to match the control signal with the clocks. This modification to the layout can be difficult and time-consuming. Also, manual adjustments are subject to human error, which is typically greater than the error of automated insertion tools.
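The effect of such a skew can be illustrated with a simplified sketch. It assumes an idealized model that is not taken from the system described here: the control pulse is exactly one processor clock period wide at the root of the clock tree, setup and hold times are ignored, and all delays are given as integer picoseconds. The function `sampled_edge` and its parameters are hypothetical names chosen for illustration.

```python
def sampled_edge(intended_edge, period_ps, clock_delay_ps, control_delay_ps):
    """Return the leaf-level processor clock edge at which the control pulse is seen high.

    Idealized assumptions: the pulse starts at root edge `intended_edge`,
    lasts one processor period, and setup and hold times are ignored.
    """
    # Time at which the delayed control pulse starts at the leaf.
    pulse_start = intended_edge * period_ps + control_delay_ps
    # Leaf edge n occurs at n * period_ps + clock_delay_ps; find the first
    # edge at or after the start of the pulse (integer ceiling division).
    offset = pulse_start - clock_delay_ps
    return -(-offset // period_ps)


# Matched delays: the control signal flags the intended edge.
print(sampled_edge(3, 1000, 2500, 2500))
# Control path 1500 ps faster than the clocks: an earlier edge is flagged.
print(sampled_edge(3, 1000, 2500, 1000))
```

In this model, a control path that is faster than the clock paths by more than one processor period causes the leaf flip-flops to see the pulse on the wrong edge. A real design would also need to account for setup and hold margins, which this sketch omits.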
FIG. 2 is a timing diagram showing the timing of the signals in FIG. 1. The first three signals are the bus clock, processor clock, and control signal, each generated by the centralized clock control circuit 12 at the root of the clock tree 24. In this example, the frequency of the processor clock is four times that of the bus clock. The centralized clock control circuit 12 may count down the cycles of the processor clock with respect to one cycle of the bus clock, e.g., from 3 to 0 in this example. The bus clock pulse is high on the 3 and 2 counts and is low on the 1 and 0 counts. On cycle 0 of the processor clock, the control signal is generated to indicate the start of the new bus clock cycle. The control signal then goes low fairly quickly after the start of the new bus clock.
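The countdown described for FIG. 2 can be modeled with a short sketch that lists, for each processor clock cycle, the count, the bus clock level, and the control signal level at the root of the clock tree. This is a behavioral approximation only: it treats the control signal as high for the whole of count 0, whereas the pulse in the figure goes low shortly after the new bus clock cycle starts. The function `root_signals` is a hypothetical name, and the restriction to even ratios is an assumption made so that the bus clock has a 50% duty cycle.

```python
def root_signals(ratio, num_cycles):
    """Return (count, bus_clock, control) for each processor clock cycle.

    Assumption (illustrative only): `ratio` is even, so the bus clock is
    high for the upper half of the counts and low for the lower half.
    """
    if ratio < 2 or ratio % 2:
        raise ValueError("this sketch assumes an even ratio of at least 2")
    rows = []
    for cycle in range(num_cycles):
        # Count down from ratio - 1 to 0, then wrap around.
        count = (ratio - 1) - (cycle % ratio)
        # Bus clock is high on the upper counts (3 and 2 when ratio is 4).
        bus_clock = 1 if count >= ratio // 2 else 0
        # Control is asserted on count 0, just before the next bus cycle.
        control = 1 if count == 0 else 0
        rows.append((count, bus_clock, control))
    return rows


for count, bus_clock, control in root_signals(4, 8):
    print(count, bus_clock, control)
```

For a ratio of 4, each group of four rows reproduces the pattern in the text: the bus clock is high on counts 3 and 2, low on counts 1 and 0, and the control signal is asserted only on count 0.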
FIG. 2 also includes fourth and fifth timing signals showing the bus clock and processor clock delayed by the insertion time. These clocks are seen by the flip-flops at the leaves of the clock tree 24. A downside of this prior art technique using the centralized clock control circuit 12 is that the control signal can be skewed from the bus clock and processor clock at the leaves and fail to properly synchronize the clocks. As mentioned above, manual adjustments must be made to deskew the control signal.
Thus, a need exists in the industry to address the aforementioned deficiencies and inadequacies of the prior art. More specifically, a need exists to provide a circuit that requires less design effort to deskew the control signal, eliminates the element of human error, and operates more effectively for synchronizing the processor clock with the bus clock.