1. Field of the Invention
This invention is related generally to the field of microprocessor design and more particularly to transferring an early response signal and corresponding data between clock domains in a microprocessor where the clock domains have fixable clock ratios.
2. Description of the Related Art
In simple computer systems, a single clock signal may be used to run all of the devices which are integrated into the chip. As shown in FIG. 1, a system PLL (phase locked loop) 11 may provide a clock signal to a microprocessor 12, a memory 13 and a peripheral device 14 via clock line 16. The signal is used to clock data transfers between the devices on bus 15.
While implementation of the system illustrated in FIG. 1 is simple and relatively straightforward, its simplicity results in some performance limitations. One of these limitations relates to the variations in the clock signals which are seen by the various devices on the chip. The use of a network of conductive traces to deliver the clock signal to each of the devices causes reflections, noise and other variations in the signals. These factors cause differences in the signals arriving at the different devices, which may in turn limit the devices' ability to communicate data. For example, if there is a skew between the clock signals arriving at two devices, a value being communicated between the devices may have to be asserted by the transmitting device for a longer time than would otherwise be necessary in order to ensure that the value can be sampled by the receiving device.
In the simple system illustrated in FIG. 1, a data transfer involves two devices in the same clock domain. (“Clock domain” refers to a portion of a system in which the operation of the devices is based on a particular clock signal.) Therefore, the operation of each of the devices is based upon the same clock signal. In the absence of any clock skew, data being transferred from one of these devices to the other must be asserted for a particular amount of time before the data is sampled (the setup time) and a particular amount of time after the data is sampled (the hold time). If there is any skew between the clock signals at each of the devices, the data must be asserted long enough to account for this difference. While this additional time may not be significant in relation to slower clock speeds, high-performance, high-speed microprocessors have shorter clock periods, and data transfers may not be able to keep up with the speed of the processor.
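The effect of skew on the required assertion time can be illustrated with a short calculation. The following is a minimal sketch under a simplified model, in which the data must remain stable for the setup time plus the hold time plus the magnitude of the skew. The function names and the timing values are hypothetical and are chosen only for illustration.

```python
def min_assert_time_ps(setup_ps, hold_ps, skew_ps):
    """Shortest time (in picoseconds) the data must be held stable.

    Simplified model (an assumption, not taken from the text): the
    sampling edge may move by up to |skew|, so the stable window must
    grow by that amount on top of setup + hold.
    """
    return setup_ps + hold_ps + abs(skew_ps)


def fits_in_period(setup_ps, hold_ps, skew_ps, clock_mhz):
    """Return True if the stable window fits within one clock period."""
    # One microsecond is 1,000,000 ps; integer division rounds the period down.
    period_ps = 1_000_000 // clock_mhz
    return min_assert_time_ps(setup_ps, hold_ps, skew_ps) <= period_ps


# Hypothetical timing values: 300 ps setup, 200 ps hold, 600 ps skew.
slow_ok = fits_in_period(300, 200, 600, clock_mhz=100)    # 10,000 ps period
fast_ok = fits_in_period(300, 200, 600, clock_mhz=1000)   # 1,000 ps period
fast_no_skew_ok = fits_in_period(300, 200, 0, clock_mhz=1000)
```

With these assumed values, the skew is harmless at the slower clock rate but by itself makes the transfer too long for one period at the faster clock rate.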
Clock forwarding is one technique which can be used to minimize the impact of clock skew and allow improved performance in data transfers. In a clock forwarding scheme, the data bus and system clock described above are replaced by point-to-point data and clock signals. When data is to be transferred from one device to another, the data is transferred along with a corresponding clock signal. This is illustrated in FIG. 2. The data is typically clocked into a series of storage locations (e.g., flip-flops) by the transmitting device according to the forwarded clock signal. The data is then clocked out of the storage locations by the receiving device according to a local clock signal. Both of the clock signals must have the same rate, but a substantial skew in the signals will not prevent reliable transfer of the data.
While clock forwarding provides a means to transfer data between devices operating at the same clock rate, it is often desirable in modern computer systems to use different clock frequencies for different devices. For example, it may be useful to operate the core logic (i.e., the microprocessor logic) and the system logic at different frequencies. The difference in frequencies allows for advances in the performance of one type of logic without requiring equal advances in the other type of logic. Thus, for example, the processor speed can be increased without having to also speed up the system logic.
In these systems, system logic is closely tied to the system bus. As a result, the system logic usually operates at a frequency which is an integer (or half-integer) multiple of the system bus frequency. Because the system logic operates at a frequency which is a multiple of the system bus frequency, clock signals for the system logic and clock signals for the system bus can both be generated from the same reference clock. If the core logic also runs at a frequency which is an integer or half-integer multiple of the system bus frequency, it can also be easily generated. For example, if the system bus is running at 66 MHz, the system logic and core logic can be operated at 200 MHz (three times the system bus frequency). Then, if desired, the frequency of the core logic can be scaled up to 266 MHz (four times the system bus frequency), while the system logic remains at 200 MHz.
As the operating frequency of the system bus increases, however, it becomes more and more difficult to scale up the speed of the core logic because this would require a larger increase in the frequency. For example, if the system bus is running at 400 MHz and both the core logic and the system logic are running at 800 MHz, the core logic cannot be easily scaled up to 900 MHz. That is, 900 MHz is not an integer or half-integer multiple of the system bus frequency. It may therefore be useful to have multiple clocks instead of a single one.
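The integer or half-integer constraint described above can be checked mechanically. The following is a minimal sketch; the function name is hypothetical, and exact rational arithmetic is used so that the result does not depend on floating-point rounding.

```python
from fractions import Fraction


def is_integer_or_half_integer_multiple(target_mhz, bus_mhz):
    """Return True if target_mhz is an integer or half-integer multiple of bus_mhz."""
    ratio = Fraction(target_mhz, bus_mhz)
    # A ratio is an integer or half-integer exactly when twice the ratio is an integer.
    return (2 * ratio).denominator == 1


# Frequencies from the 400 MHz system bus example in the text.
ok_800 = is_integer_or_half_integer_multiple(800, 400)    # ratio 2
ok_900 = is_integer_or_half_integer_multiple(900, 400)    # ratio 9/4
ok_1000 = is_integer_or_half_integer_multiple(1000, 400)  # ratio 5/2
```

With a 400 MHz bus, the nearest permissible steps above 800 MHz are 1000 MHz and beyond, which is why a modest increase to 900 MHz cannot be derived in this manner.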
The use of multiple clock domains in a computer system may create a number of problems which must be addressed in the system. One problem is that it is difficult to communicate between two clock domains in which the clocks are not integer or half-integer multiples of each other. Another problem is that it is sometimes desirable to transmit a warning signal (or early response signal) to allow a receiving device to quickly respond to data from a transmitting device. When these devices are in different clock domains having different clock rates, it is difficult to accurately anticipate the transmission of the data based on the early response signal.
One or more of the problems described above may be solved by the various embodiments of the invention. Broadly speaking, the invention comprises a system and method for using an early response signal to transmit data with a fixed latency where the data is being transferred from a device in a first clock domain having a first clock rate to a device in a second clock domain having a second clock rate.
In one embodiment, clock-skipping techniques are used to enable communication between devices in two different clock domains having different clock rates. If the clock rates of the two clock domains were equal, one bit of data could be transferred on each clock pulse. Since, however, the clock rates of the two clock domains are not equal, transmitting and receiving one bit of data on each clock pulse would cause the number of bits transmitted to be different than the number of bits received. In other words, either the transmitted data would overrun the received data, or the received data would overrun the transmitted data. Clock pulses are therefore periodically skipped in the faster clock domain so that the number of bits transmitted is equal to the number of bits received.
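The clock-skipping idea can be illustrated by generating a skip pattern for the faster clock domain. The following is a minimal sketch that assumes an accumulator-based scheme for spreading the skipped pulses evenly; the text does not specify how the pattern is produced, so the function name and the method are assumptions made for illustration.

```python
def skip_pattern(slow_rate, fast_rate, num_pulses):
    """Mark each pulse of the faster clock as valid (True) or skipped (False).

    Assumed scheme: add slow_rate to an accumulator on every fast pulse,
    and treat the pulse as valid whenever the accumulator reaches fast_rate.
    Over any fast_rate consecutive pulses, exactly slow_rate are valid.
    """
    if not 0 < slow_rate <= fast_rate:
        raise ValueError("expected 0 < slow_rate <= fast_rate")
    pattern = []
    accumulator = 0
    for _ in range(num_pulses):
        accumulator += slow_rate
        if accumulator >= fast_rate:
            accumulator -= fast_rate
            pattern.append(True)   # valid pulse: one bit is transferred
        else:
            pattern.append(False)  # skipped pulse: no bit is transferred
    return pattern


# An 800 MHz domain paired with a 900 MHz domain reduces to a ratio of 8 to 9.
pattern = skip_pattern(8, 9, 9)
valid_count = sum(pattern)
```

In this example, eight of every nine pulses in the faster domain are valid, so the number of bits handled in the faster domain matches the number handled in the slower domain over the same interval.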
In one embodiment, a first device transmits an early response signal prior to transmitting data. The early response signal is intended to provide a warning to the receiving device that data will be transmitted after a certain number of cycles. In other words, the data is transmitted with a fixed latency after transmission of the early response signal. While this latency is a fixed number of clock cycles in the transmitting device, the latency measured in clock cycles of the receiving device may not be the same, because the clock cycles of the receiving device may include both valid clock pulses and skipped pulses. In order to provide a fixed-latency early response signal from one clock domain to the other, it is necessary to account for the skipped pulses.
One embodiment of the present system comprises a first device and a second device between which data is transferred using clock skipping techniques. The first device operates at a slower clock rate than the second device. The first device transfers an early response signal to the second device and, after a fixed number (k) of pulses, transfers data to the second device. The second device comprises a component that generates a skip pattern which defines each clock pulse as either a valid pulse or a skipped pulse. This component also generates a value (n) indicating the number of skipped pulses that will occur between receipt of the early response signal and the occurrence of the kth valid pulse. Upon receiving an early response signal, the second device delays for a time equivalent to the n skipped pulses before counting the k-pulse latency and then receiving the transmitted data.
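The relationship between k, n and the skip pattern can be illustrated as follows. This is a minimal sketch, assuming that the skip pattern is periodic, that it is represented as a list of Boolean values (True for a valid pulse), and that counting begins with the pulse on which the early response signal is received. The function names and the example pattern are hypothetical.

```python
def skipped_before_kth_valid(pattern, start, k):
    """Count skipped pulses (n) from index start up to the kth valid pulse.

    pattern is one period of the skip pattern (True = valid pulse) and is
    assumed to repeat. start is the index of the pulse on which the early
    response signal arrives; that pulse is included in the count.
    """
    if not any(pattern):
        raise ValueError("pattern must contain at least one valid pulse")
    valid_seen = 0
    skipped = 0
    index = start
    while valid_seen < k:
        if pattern[index % len(pattern)]:
            valid_seen += 1
        else:
            skipped += 1
        index += 1
    return skipped


def receive_latency(pattern, start, k):
    """Total pulses in the receiving domain: n skipped pulses plus k valid pulses."""
    return skipped_before_kth_valid(pattern, start, k) + k


# Hypothetical period-3 pattern: one skipped pulse followed by two valid pulses.
pattern = [False, True, True]
n_values = [skipped_before_kth_valid(pattern, start, 4) for start in range(3)]
latencies = [receive_latency(pattern, start, 4) for start in range(3)]
```

In this example, k is four in every case, but n is either one or two depending on where in the skip pattern the early response signal arrives. This is why n is generated together with the skip pattern rather than being treated as a constant.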