Several of today's computer system architectures employ a source strobed bus and method to transfer data between devices. In a typical source strobe architecture, the transmitting device transmits to the receiving device a clock signal/strobe and data. The strobe alerts the receiving device that valid data has been transmitted over the bus. This is typically referred to as a source strobe or “clock forwarding” event. Computer bus architectures such as AGP (accelerated graphics port), DDR SDRAM (double data rate synchronous dynamic random access memory), and RDRAM (Rambus random access memory) utilize source strobes in this manner.
Source strobe techniques allow data to be transmitted at higher speeds because the flight time and distribution delays of the clock signal and the data are matched. Often, data is transferred on both rising and falling edges of the strobe. Source strobe techniques, however, require extraordinary care in matching the delays of the data and source clock signals, as well as in minimizing the asymmetry of the source strobe itself (i.e., the differences in delays between the rising and falling edges of the strobe). Because both edges of the strobe are typically used to clock data, any difference between the rising and falling edge delays matters. Such differences are caused by the intrinsic (delay through a component) and extrinsic (delay caused by loading on the component output) delays of the system.
The intrinsic delay can typically be minimized, but the extrinsic delay is a function of how many loads are being driven and the wire lengths to those loads. The extrinsic delay essentially follows a non-linear RC (resistance times capacitance) curve, making it a “wild card” in attempting to balance the delays. The on-die wire lengths must be managed and the number of loads must be equalized to minimize the asymmetry of the strobes. This can be illustrated with the following example. Let a strobe pulse have a period of 5 nanoseconds (nsec). In a perfect system, the 5 nsec period would yield a pulse that is high for 2.5 nsec and low for 2.5 nsec. Unfortunately, the intrinsic delays when driving from high to low differ from those when driving from low to high. The extrinsic delays differ as well. Consequently, the ideal 5 nsec pulse may actually be 3 nsec high and 2 nsec low. The time lost due to this asymmetry cuts into the extremely tight timing specifications of the source strobed bus and thus must be minimized.
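The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration: the function name `duty_cycle_loss` and the choice to measure the loss as the shortfall of the shorter phase relative to the ideal half period are assumptions made for the sketch, not terms defined by any bus specification.

```python
def duty_cycle_loss(period_ns, high_ns):
    """Return the timing lost to strobe asymmetry (hypothetical helper).

    Assumption: the loss is measured as how much the shorter phase
    falls short of the ideal half period.
    """
    low_ns = period_ns - high_ns          # remaining low time
    ideal_half_ns = period_ns / 2         # symmetric strobe: equal high and low
    shortest_ns = min(high_ns, low_ns)    # the phase that limits data capture
    return {
        "low_ns": low_ns,
        "ideal_half_ns": ideal_half_ns,
        "shortest_phase_ns": shortest_ns,
        "lost_ns": ideal_half_ns - shortest_ns,
    }


# The 5 nsec strobe from the text, stretched to 3 nsec high and 2 nsec low.
result = duty_cycle_loss(period_ns=5.0, high_ns=3.0)
print(result)
```

For the values in the text, the shorter phase is 2 nsec against an ideal 2.5 nsec, so 0.5 nsec of the capture window is lost on that edge.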
Typically, the core logic of the receiving device does not interface directly with the source strobed bus. Often, the logic necessary to capture data from the bus is carefully placed in what is commonly referred to as an I/O (input/output) or data macro. The I/O macro is replicated many times along the edge of the die of the receiving device's integrated circuit (IC). Special care is taken to distribute the source strobe to each of the I/O macros in a manner that substantially guarantees minimum skew and asymmetry of the source clock strobe so that the strobe may be aligned within a specific data eye of the transmitted data. Typically, once the data has been captured in the I/O macros, the data is transferred into another clock domain by moving the data to the core logic of the receiving device. The core logic clock domain has substantially less stringent timing requirements than the source strobe clock domain because the core logic clock typically operates at a slower rate than the source strobe clock.
Some of today's source strobed bus architectures, such as DDR and RDRAM, use a bus protocol in which each device connected to the bus agrees on when a strobe event occurs and how many events will occur. The information concerning the timing and number of events is passed between the devices through signals separate from the tightly controlled source strobed data path. In other architectures, such as AGP, some source strobe events are isochronous in nature (i.e., the event may occur at unknown times). These architectures must rely on one or more flip-flops that toggle with each strobe event. The flip-flops are sampled within the less stringent clock domain to see if a strobe event occurred. Both types of architectures and protocols, however, experience the following problems that adversely impact the skew and asymmetry of the source strobes.
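The toggle flip-flop scheme described above can be modeled behaviorally as follows. This is a minimal software sketch, not a circuit description: the class names `ToggleFlop` and `SlowDomainDetector` are hypothetical, the model assumes at most one strobe event between consecutive samples, and it ignores the synchronization a real clock domain crossing would need.

```python
class ToggleFlop:
    """Hypothetical model of a flip-flop clocked by the source strobe."""

    def __init__(self):
        self.q = 0

    def strobe_event(self):
        # Each strobe event inverts the stored bit.
        self.q ^= 1


class SlowDomainDetector:
    """Hypothetical model of logic in the less stringent clock domain."""

    def __init__(self, flop):
        self.flop = flop
        self.last_sample = flop.q

    def sample(self):
        # A change since the previous sample means a strobe event occurred.
        current = self.flop.q
        detected = current != self.last_sample
        self.last_sample = current
        return detected


flop = ToggleFlop()
detector = SlowDomainDetector(flop)

# Number of strobe events arriving before each slow-clock sample (assumed data).
events_per_slow_cycle = [0, 1, 0, 1, 1]
seen = []
for count in events_per_slow_cycle:
    for _ in range(count):
        flop.strobe_event()
    seen.append(detector.sample())

print(seen)  # [False, True, False, True, True]
```

In this model, two strobe events between samples return the flip-flop to its earlier state and go undetected, so a single toggle bit is only sufficient when the sampling clock sees every event individually.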
When distributed internal to an IC, the strobe delays must closely match the data delays. The strobe is distributed to capture data in flip-flops within the I/O macros. When the strobes are used outside the data path to toggle other non-data related flip-flops, the IC must be designed to either: (1) maintain the uniformity of the I/O macros by including toggle flip-flops in each macro; or (2) place toggle flip-flops outside the tightly controlled I/O macros. The first choice adds substantially more load to every strobe and thus adversely impacts the strobe delay and asymmetry. The second choice forces the IC designer to use the strobe clock outside the well controlled I/O macros in order to toggle a single flip-flop. This induces large uncontrolled wire delays on the strobe distribution, which cut into the budget allotted for skew and asymmetry.
Thus, there is a desire and need for a technique to detect a source strobe event in the less stringent clock domain in a manner that will not adversely impact the skew and asymmetry of the internally distributed source strobe.