In digital data systems in general and in computer systems in particular, there is an ever-increasing drive for larger bandwidth and higher performance. These systems are comprised of discrete integrated circuit chips that are interconnected by a bus. Data moves through a chip and between chips in response to clock pulses, which are intended to maintain synchronization of the data in parallel paths. At the extremely high data rates in today's systems, variations in the propagation of data over a bus along one path as compared to another path on the bus (i.e., skew) can exceed one clock cycle. U.S. Pat. No. 6,334,163, which is assigned to the assignee of this application and is incorporated herein by reference, discloses a so-called Elastic Interface (EI) that can compensate for bus skew greater than one clock cycle without a performance penalty. Nevertheless, packaging technology has not been able to scale up to match the performance and bandwidth of the chip and interface technologies. In order to reduce the number of I/O terminals on a chip and the number of conductive paths in a bus between chips, the prior art transfers data using a so-called Double Data Rate (DDR) interface, in which data is launched onto the bus at both the rising and falling edges of the clock. This allows the same amount of data to be transferred (i.e., the same bandwidth) with only half the number of bus conductors and half the number of I/O ports, as compared with a system where data is launched on only one clock edge, either rising or falling.
FIG. 1 illustrates a prior art system in which data is transferred from a chip to a bus via a double data rate interface. Here the clock synchronous data is comprised of one word from Bitstack 0 and one word from Bitstack 1. The output of each bitstack is coupled as an input to a bitstack register 10 comprised of master-slave latches 11. The output of one word of the register 10 is coupled as an input to a multiplexer 12 whose output is coupled to a double data rate bus 14 via an I/O port on the chip. The output of the other word of the register 10 is coupled as an input to a master latch register 16 whose output is coupled as an input to the multiplexer 12. A select input to the multiplexer, operating at the local clock frequency, couples one half of the register 10 outputs to the I/O port on one clock edge and the other half of the register 10 outputs, through the master latch register 16, to the I/O port on the next clock edge. That is, for example, one edge of the clock signal selects Bitstack 0 data to launch onto the bus, and the other edge selects the corresponding data from Bitstack 1.
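The edge-by-edge selection described for FIG. 1 can be illustrated with a minimal behavioral sketch. It is not a circuit model: the function name `ddr_launch` and the word labels are hypothetical, chosen only for illustration, and the sketch assumes that one word from each bitstack is captured per clock cycle and that the select input alternates on every clock edge.

```python
def ddr_launch(bitstack0_words, bitstack1_words):
    """Interleave two word streams the way a 2:1 multiplexer would
    if its select input toggled on every clock edge (illustrative only)."""
    bus = []
    for word0, word1 in zip(bitstack0_words, bitstack1_words):
        # One clock edge: the select input couples the Bitstack 0 word to the bus.
        bus.append(word0)
        # Next clock edge: the select input couples the Bitstack 1 word,
        # which has been held in the meantime (the role of master latch register 16).
        bus.append(word1)
    return bus


# Hypothetical word labels, following the naming used in the timing diagram.
print(ddr_launch(["A0", "B0", "C0"], ["A1", "B1", "C1"]))
```

Two words cross the same bus conductors in each clock cycle, which is why half as many conductors carry the same bandwidth.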
FIG. 2 is a timing diagram illustrating the timing used in the prior art implementation of this double data rate interface. In the prior art system design, time must be allowed for the data to set up in the register prior to selection of any data by the double data rate select input to the multiplexer; that is, time must be allowed for the elements that comprise the register to assume a stable state after the input changes. Here it should be noted that the prior art requires both Bitstack 0 and Bitstack 1 data to set up in the register 10 prior to launching either Bitstack 0 or Bitstack 1 data from the register 10. Following this set-up time interval for the last-to-arrive data, the multiplexer, in response to the select signal timed to one edge of the local clock signal, couples half the contents of the register (e.g., the data from Bitstack 0) to the bus via an I/O port, and the next edge transfers the other half (e.g., the data from Bitstack 1). There is a possibility that the transmission delay in the path from one bitstack will be longer than the delay in the path from the other bitstack. In FIG. 1 this is illustrated by making the path from Bitstack 1 longer than the path from Bitstack 0. FIG. 2 shows data from both bitstacks (A0, B0, A1, B1 . . . Near End) launched at the same time on a rising edge of the C2 clock. Due to the relatively longer path of Bitstack 1, the simultaneously launched data A1, B1, C1 . . . is delayed with respect to A0, B0, C0 . . . at the register 10 (Far End). The solid black line represents a time when the Bitstack 0 data is set up in the register 10 and could be launched by a select following the falling edge of the C2 clock. However, the Bitstack 1 data is not yet set up in the register 10, and therefore the Bitstack 0 data is not launched until a half cycle later, represented by the dotted line, and the Bitstack 1 data is launched another half cycle after that. The final waveform shows the launch times of the A0, A1, B0, B1, C0, C1 . . . data onto the double data rate bus. Notice that the A0 data is not launched until after the time denoted by the dotted line.
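The launch penalty described for FIG. 2 can be made concrete with a small timing sketch. The sketch assumes that select edges occur at integer multiples of a half clock period measured from a common time origin, and that each bitstack's data becomes launchable once it is set up in the register 10. The function names and the picosecond values are hypothetical and are not taken from the figure.

```python
def next_select_edge(ready_time, half_period):
    """First select edge at or after ready_time, assuming edges fall on
    integer multiples of half_period (integer time units, e.g. picoseconds)."""
    return ((ready_time + half_period - 1) // half_period) * half_period


def prior_art_launch_times(ready0, ready1, half_period):
    """Launch times (Bitstack 0, Bitstack 1) when neither word may be
    launched until the last-to-arrive data is set up in the register."""
    last_ready = max(ready0, ready1)
    launch0 = next_select_edge(last_ready, half_period)
    launch1 = launch0 + half_period  # the other half follows on the next edge
    return launch0, launch1


# Hypothetical numbers: half period of 500 ps, Bitstack 0 data set up at
# 450 ps, Bitstack 1 data set up at 700 ps because of its longer path.
half_period = 500
launch0, launch1 = prior_art_launch_times(450, 700, half_period)
earliest0 = next_select_edge(450, half_period)

print(launch0, launch1)     # prior-art launch times of the two words
print(launch0 - earliest0)  # delay imposed on Bitstack 0 by waiting for Bitstack 1
```

With these assumed values the Bitstack 0 data could have been launched at 500 ps but is held until 1000 ps, a penalty of one half cycle, which corresponds to the interval between the solid line and the dotted line in FIG. 2.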