FIG. 1A is a simplified illustration of a prior art synchronous bus system. The system is described in U.S. Pat. No. 5,432,823, which is assigned to the assignee of the present invention, and is hereby incorporated by reference.
The system 20 of FIG. 1A includes a clock generator 22, a master device 24, and a set of slave devices 26-A through 26-N. A transmission channel is comprised of three components: a clock-to-master path 28, a turn-around path 29, and a clock-from-master path 30. The transmission channel ends at a termination block 31, which may be implemented with a resistor. Each clock pulse from the clock generator 22 travels along the clock-to-master path 28, through the turn-around path 29, along the clock-from-master path 30, and into the termination block 31. The turn-around path 29 can be implemented as a single package pin to which the clock-to-master path 28 and the clock-from-master path 30 are connected, provided that the resulting stub is relatively short.
The clock generator 22 is any standard clock source. The master device 24 is a device that can communicate with other master devices and with slave devices, and is located near the turn-around path 29. By way of example, the master device 24 may be a microprocessor, a memory controller, or a peripheral controller.
The slave devices 26 can only communicate with master devices and may be located anywhere along the transmission channel. The slave devices 26 may be implemented with high speed memories, bus transceivers, peripheral devices, or input/output ports.
In the system of FIG. 1A, a data/control bus 36 (sometimes referred to simply as a data bus 36) is used to transport data and control signals between the master device 24 and the slave devices 26-A through 26-N. This operation is timed by the clock signals on the transmission channel (28, 29, 30). More particularly, the master device 24 initiates an exchange of data by broadcasting an access request packet on the data bus 36. Each slave device 26 decodes the access request packet and determines whether it is the selected slave device and the type of access requested. The selected device then responds appropriately, either reading or writing a packet of data in a pipelined fashion.
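By way of illustration only, the broadcast-and-decode exchange described above can be sketched in Python. The packet fields (`device_id`, `is_write`, `address`), the `Slave` class, and the `broadcast` helper are hypothetical names chosen for this sketch; the actual packet format is not specified here, and pipelining and clock timing are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical access request packet; field names are assumptions."""
    device_id: int   # which slave device is selected
    is_write: bool   # type of access requested
    address: int     # location within the selected slave


class Slave:
    """Hypothetical slave device that decodes every broadcast packet."""

    def __init__(self, device_id: int) -> None:
        self.device_id = device_id
        self.memory = {}

    def decode(self, request: AccessRequest,
               write_data: Optional[int] = None) -> Optional[int]:
        # Every slave sees the packet; only the selected one responds.
        if request.device_id != self.device_id:
            return None
        if request.is_write:
            self.memory[request.address] = write_data
            return write_data
        # Unwritten locations read back as 0 in this sketch.
        return self.memory.get(request.address, 0)


def broadcast(slaves: List[Slave], request: AccessRequest,
              write_data: Optional[int] = None) -> List[int]:
    """Master broadcasts one packet; collect responses from selected slaves."""
    responses = [s.decode(request, write_data) for s in slaves]
    return [r for r in responses if r is not None]
```

In this sketch, a write followed by a read addressed to the same hypothetical device returns the written value, while the non-selected devices ignore both packets.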
In the system of FIG. 1A, the master device 24 transmits data on the bus 36 contemporaneously with clock signals on the clock-from-master path 30. In other words, the transmission of data from the master device 24 to the slave devices 26 is timed by the clock signals on the clock-from-master path 30. Conversely, each slave device transmits data contemporaneously with the clock signal on the clock-to-master path 28. That is, the transmission of data from the slave devices 26 to the master device 24 is timed by the clock signals on the clock-to-master path 28. The scheme of having clock and data signals travel in the same direction is used to reduce clock-to-data skew.
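The benefit of same-direction travel can be illustrated with a simplified flight-time calculation. The following Python sketch assumes an ideal channel with a single propagation velocity shared by clock and data; the function name and parameters are hypothetical and are not taken from the referenced system.

```python
def data_offset_at_master(slave_distance_m: float,
                          velocity_m_per_s: float,
                          clock_toward_master: bool) -> float:
    """Offset (seconds) between a clock edge and the slave's data, both
    observed at the master, for a slave that launches data on that edge.

    Assumes an ideal line where clock and data propagate at the same
    velocity (an assumption of this sketch).
    """
    flight = slave_distance_m / velocity_m_per_s
    if clock_toward_master:
        # The edge and the data leave the slave together and travel the
        # same distance in the same direction, so they arrive together.
        return 0.0
    # If the slave were instead timed by an edge travelling away from the
    # master, the edge would need one flight time to reach the slave and
    # the data another flight time to return.
    return 2.0 * flight
```

Under these assumptions the same-direction case yields an offset that is independent of the slave's position, whereas the opposite-direction case yields an offset that grows with distance from the master.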
FIG. 1B illustrates timing circuitry used to coordinate the transmission and receipt of signals within a prior art slave device 26. As shown, complementary clock signals CTM and {overscore (CTM)}, respectively on lines 28A and 28B, are received at a differential input buffer 32, the output of which is applied to the reference and phase offset inputs of a Delay-Locked Loop (DLL) 33 to generate an internal transmit clock signal on line 34. Similarly, complementary clock signals CFM and {overscore (CFM)}, respectively on lines 30A and 30B, are received at a differential input buffer 35, the output of which is applied to the reference and phase inputs of a DLL 36 to generate an internal receive clock signal on line 37. This differential buffering scheme is different from the non-differential (single-ended) buffering scheme used for data reception. Thus, prior art slave devices using this configuration are susceptible to timing skew errors between the clock signal and the data signal.
Returning to FIG. 1A, a problem with this prior art system is that there are impedance discontinuities on the clock loop (28, 29, 30). These impedance discontinuities create standing waves on the clock loop. The standing waves cause a timing shift, which effectively changes the delay from one slave device to another, despite uniform spacing. This problem is more fully appreciated with reference to FIG. 2.
FIG. 2 illustrates the clock signal delay as a function of a slave device's position from the master device. Line 40 illustrates the nominal delay increasing at a uniform rate as the distance from the master device grows. Line 50 illustrates the effective delay created by standing waves on the clock loop. Line 50 demonstrates that different slave devices receive a non-uniform delay, despite uniform spacing. This meandering delay causes timing problems in the reading and writing of data from and to the data bus 36.
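The relationship between the nominal and effective delays can be approximated with a simple model. The following Python sketch assumes a sinusoidal clock, a lossless line, and a single reflection at the far end of the channel with a hypothetical reflection coefficient `gamma` of magnitude less than one; the referenced system may have discontinuities at other locations, and all names and values here are illustrative assumptions.

```python
import cmath
import math


def effective_delay(x_m: float, length_m: float, velocity_m_per_s: float,
                    freq_hz: float, gamma: float) -> float:
    """Apparent clock delay (seconds) at distance x_m along a line.

    Single-reflection model (an assumption of this sketch): the forward
    wave exp(-j*beta*x) is summed with a wave reflected at length_m with
    real coefficient gamma, where |gamma| < 1. The phase of the sum,
    divided by the angular frequency, gives the apparent delay.
    """
    omega = 2.0 * math.pi * freq_hz
    beta = omega / velocity_m_per_s
    nominal = x_m / velocity_m_per_s
    # Total wave = exp(-j*beta*x) * (1 + gamma*exp(-j*2*beta*(L - x))).
    # With |gamma| < 1 the bracketed term has a positive real part, so
    # its phase stays within (-pi/2, pi/2) and needs no unwrapping.
    perturbation = 1.0 + gamma * cmath.exp(-2j * beta * (length_m - x_m))
    return nominal - cmath.phase(perturbation) / omega
```

With `gamma` set to zero the function reduces to the nominal delay `x_m / velocity_m_per_s`. With a non-zero `gamma`, uniformly spaced positions yield non-uniform delay increments, which is the qualitative behavior attributed to line 50.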
One approach to solving this problem is to incorporate calibration circuitry on each slave device 26. The problem with this solution is that it complicates the configuration of each slave device 26: it makes each slave device 26 more expensive and it results in increased power consumption at each slave device.
Another disadvantage of prior art systems is that attenuation of the clock signal amplitude is difficult to remedy. Adding buffers or making multiple copies of the clock to reduce attenuation has the undesirable effect of distorting the clock signal phase. A related disadvantage of prior art systems is that the lengths of the clock signal traces coupled to a given slave device must usually match the length of the datapath between the master and the slave device, complicating system layout.
Yet another disadvantage of prior art systems is that the distribution of the clock signal in a differential format, as shown in FIG. 1B, leads to differences between the clock and data reception circuitry within the slave device. These differences tend to produce undesirable timing skew between the clock and data. Accordingly, prior art systems either suffer the timing skew caused by differential clock distribution or constrain the clock distribution circuitry to be the same as the data distribution circuitry, which usually necessitates a non-differential (single-ended) architecture.
Accordingly, it would be highly desirable to provide an alternate mechanism for improving the timing performance in a master-slave system.