As computer and other digital systems become more complex and more capable, methods and hardware to enhance the transfer of data between system components or elements continually evolve. Data to be transferred includes signals representing information, commands, or any other signals. System components or elements can include different functional hardware blocks on a single integrated circuit (IC), or on different ICs. The different ICs may or may not be on the same printed circuit board (PCB). System components typically include an input/output (I/O) interface specifically designed to receive data from other system components and to transmit data to other system components. Generally speaking, existing I/O interfaces can be categorized as either serial “links” or parallel “links”. Regardless of the type of I/O interface, data transfers must be synchronized between system components for proper operation. Synchronization includes accounting for or compensating for several phenomena that potentially cause errors, including signal jitter and signal skew. The phenomena include differences between component clocks, and physical attributes of the data paths that create noise and affect the integrity of the transferred signal.
In modern high-speed interfaces such as double data rate (DDR) and graphics DDR (GDDR) interfaces, some known problematic phenomena are even more pronounced than in slower interfaces. For example, modern interfaces that transmit many bits at a time can experience degradation in data integrity due to excessive rates of current change. Data transmission typically involves multiple-bit bytes. A byte can be 8 bits or 12 bits, for example. If all of the bits toggle at the same time, the data bus draws a relatively large amount of current. Toggling is also referred to as changing state or changing logic level. A large current draw on the bus affects the rate of change of the bus current, or di/dt. As the number of bits changing state on the bus at one time increases, di/dt increases accordingly. When di/dt increases, signal integrity decreases. At higher frequencies this phenomenon is more pronounced.
One way to resolve this issue is to reduce di/dt. One way to decrease di/dt is to let one portion of the bus toggle while the other portion remains at the same logic level. This is referred to as a Dynamic Bus Inversion (DBI) scheme. For example, if there are 8 bits on the bus, and it is determined that 5 of the 8 bits are to be toggled, the other 3 bits are toggled instead, and the 5 bits that were to be toggled are left unchanged. That is, if the majority of bits are scheduled to be toggled, the bus is inverted and the minority is toggled instead. A signal is sent to the receiver indicating that the bus is being inverted. The receiver can then decode the 8 bits correctly when they are received.
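The toggle-minimizing inversion described above can be illustrated with a minimal sketch. The function names `dbi_ac_encode` and `dbi_decode`, the default 8-bit width, and the representation of the bus as an integer are assumptions made for illustration only. The sketch also ignores any toggling of the inversion signal itself, which a real interface would have to account for.

```python
from typing import Tuple


def dbi_ac_encode(prev_bus: int, data: int, width: int = 8) -> Tuple[int, int]:
    """Return (bus_value, dbi_flag) for the next transfer.

    prev_bus is the value currently driven on the bus; data is the value
    to be conveyed. If more than half of the bits would toggle, the
    inverted data is driven instead and dbi_flag is set to 1.
    """
    mask = (1 << width) - 1
    toggles = bin((prev_bus ^ data) & mask).count("1")
    if toggles > width // 2:
        return (~data) & mask, 1  # majority would toggle: invert the bus
    return data & mask, 0


def dbi_decode(bus: int, dbi_flag: int, width: int = 8) -> int:
    """Recover the original data at the receiver from the bus and the flag."""
    mask = (1 << width) - 1
    return (~bus) & mask if dbi_flag else bus & mask


# Bus currently at 00000000; the data 00011111 would toggle 5 of 8 bits.
bus, flag = dbi_ac_encode(0b00000000, 0b00011111)
# The inverted value 11100000 toggles only the other 3 bits.
assert (bus, flag) == (0b11100000, 1)
assert dbi_decode(bus, flag) == 0b00011111
```

In this sketch a tie (exactly half of the bits toggling) leaves the bus uninverted, since the text only calls for inversion when a majority of bits would toggle.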
Minimizing power consumption is also critical for modern systems. For a terminated bus, power usage is affected by the logic levels of the bits. For example, if the bus is terminated high to the power supply, when the data stays low, there will be a direct current path from power to ground, and thus power is consumed. When the data stays high, there is no current, and thus no power is consumed. Therefore, to reduce power, it is better for more data bits to stay high than low for a terminated-high bus. Similarly, for a terminated-low bus, it is better for more data bits to stay low than high from a power consumption point of view. Another usage of DBI is to take advantage of this fact. For example, if there are 8 bits on a terminated-high bus, and it is determined that 5 of the 8 bits are to stay low, the bus is inverted so that only the other 3 bits are driven low and the 5 bits are driven high instead. For a terminated-low bus, of course, the high and low bits are reversed. Again, a signal is sent to the receiver indicating that the bus is being inverted. The receiver can then decode the 8 bits correctly when they are received. Conventionally, an extra bit and an extra pin, called the DBI bit and the DBI pin, are used to indicate to the receiver whether or not the bus is inverted. Conventionally, one DBI pin is required for each data byte. In a 64-bit interface with 8-bit bytes, 8 DBI pins are required just for the DBI function.
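The power-oriented use of DBI and the conventional pin cost can be sketched as follows. The function names `dbi_dc_encode` and `dbi_pin_count`, the `terminated_high` parameter, and the integer representation of a byte are hypothetical and chosen only for illustration. The sketch assumes that inversion is applied whenever more than half of the bits would sit at the power-consuming level.

```python
from typing import Tuple


def dbi_dc_encode(data: int, width: int = 8,
                  terminated_high: bool = True) -> Tuple[int, int]:
    """Return (bus_value, dbi_flag) reducing bits at the costly level.

    On a terminated-high bus a low bit draws current; on a
    terminated-low bus a high bit draws current.
    """
    mask = (1 << width) - 1
    data &= mask
    ones = bin(data).count("1")
    costly = (width - ones) if terminated_high else ones
    if costly > width // 2:
        return (~data) & mask, 1  # majority at the costly level: invert
    return data, 0


def dbi_pin_count(bus_width: int, byte_width: int = 8) -> int:
    """Conventional scheme: one DBI pin per data byte."""
    return bus_width // byte_width


# Terminated-high bus: 5 of 8 bits would be low, so the bus is inverted
# and only 3 bits end up low.
assert dbi_dc_encode(0b11100000, terminated_high=True) == (0b00011111, 1)
# A 64-bit interface with 8-bit bytes needs 8 DBI pins.
assert dbi_pin_count(64) == 8
```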
In addition to the signal integrity degradation caused by excessive rates of current change, there are signal integrity problems associated with signal timing issues. A typical serial link embeds clock information within the data stream and extracts the clock information at the receiver using a clock recovery scheme. Such schemes are also known as per-line closed-loop timing. Clock recovery relies on the data stream containing sufficiently frequent transitions. Guaranteeing this transition density requires encoding the data, typically using 8B/10B codes. A disadvantage of this approach is that it adds bandwidth overhead and increases complexity, which hurts performance and increases cost.
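The transition-density concern and the bandwidth cost of encoding can be made concrete with a short sketch. The helper names `longest_run` and `coding_efficiency` and the use of strings of '0' and '1' characters are assumptions for illustration. The sketch does not implement an 8B/10B encoder; it only measures how long a stream goes without a transition and how much of the line rate carries payload when 10 line bits are sent for every 8 data bits.

```python
def longest_run(bits: str) -> int:
    """Length of the longest run of identical consecutive bits.

    A long run means a long interval with no transition from which a
    clock recovery circuit could extract timing.
    """
    longest = current = 0
    prev = None
    for b in bits:
        current = current + 1 if b == prev else 1
        longest = max(longest, current)
        prev = b
    return longest


def coding_efficiency(payload_bits: int, line_bits: int) -> float:
    """Fraction of the line rate that carries payload data."""
    return payload_bits / line_bits


# Two unencoded bytes 0x00 0xFF give only one transition in 16 bit times.
assert longest_run("0000000011111111") == 8
# 8 payload bits per 10 line bits: 80% of the line rate is payload.
assert coding_efficiency(8, 10) == 0.8
```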
A typical parallel link sends a clock signal, or strobe, with a group of N data signals (for example, N may be 8 in a double data rate dynamic random access memory (DDR DRAM)). Depending on the data rate and the level of sophistication required, one of the following “source-synchronous timing” methods is used: the receiver simply samples the data with the strobe directly if the strobe has already been shifted by half a bit time relative to the data sent by the transmitter; or if the strobe is aligned with the edge of the data sent by the transmitter, the receiver delays the strobe by the same fixed amount across the group of data to sample the data at the nominal center of the data eye, where the data eye can be thought of as a time period during which the data signal is most stable.
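The two source-synchronous timing methods above reduce to simple arithmetic on the bit time. The following sketch assumes an idealized link with no skew or jitter; the function names, the per-pin data rate expressed in Mb/s, and the 1600 Mb/s figure are illustrative assumptions and are not taken from any particular interface specification.

```python
def bit_time_ps(data_rate_mbps: float) -> float:
    """Bit time in picoseconds for a per-pin data rate in Mb/s."""
    return 1e6 / data_rate_mbps


def strobe_delay_ps(data_rate_mbps: float, strobe_center_aligned: bool) -> float:
    """Delay the receiver applies to the strobe before sampling.

    If the transmitter already shifted the strobe by half a bit time,
    the receiver samples directly. If the strobe is edge-aligned with
    the data, the receiver delays it by half a bit time so that it
    lands at the nominal center of the data eye.
    """
    if strobe_center_aligned:
        return 0.0
    return bit_time_ps(data_rate_mbps) / 2


# At an assumed 1600 Mb/s per pin the bit time is 625 ps.
assert bit_time_ps(1600) == 625.0
assert strobe_delay_ps(1600, strobe_center_aligned=True) == 0.0
assert strobe_delay_ps(1600, strobe_center_aligned=False) == 312.5
```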
Each of the two foregoing parallel link approaches requires very tight matching of the trace impedance and trace length across the group of data lines and the strobe line to achieve high data rates. To alleviate this, each bit receiver can delay the strobe by a different amount to place its own clock at the center of its own data eye. This is sometimes called per-bit de-skew. A disadvantage of this parallel scheme is that the strobe (which is usually sent across a circuit board and distributed to the group of data) is noisy, thus reducing the system timing budget. In addition, the receiver simply uses or delays the strobe, which adds jitter rather than filtering jitter. In some implementations, a strobe is transmitted for each data bit rather than for a group of bits, which increases pin counts and cost.
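The benefit of per-bit de-skew over a common strobe delay can be sketched numerically. The model below is a deliberate simplification: it assumes an edge-aligned strobe, an ideal data eye that is one full bit time wide, and a known skew for each bit (positive when the data edge arrives later than the strobe edge). The function names and the example skew values are hypothetical.

```python
from typing import List


def margin_with_common_delay(bit_time_ps: float,
                             skews_ps: List[float]) -> List[float]:
    """Timing margin per bit when every bit uses the same half-bit delay.

    Each bit's eye is shifted by its skew, so the common sample point
    sits off-center and the margin shrinks by the magnitude of the skew.
    """
    half = bit_time_ps / 2
    return [half - abs(skew) for skew in skews_ps]


def per_bit_strobe_delays(bit_time_ps: float,
                          skews_ps: List[float]) -> List[float]:
    """Strobe delay per bit that centers the sample point in that bit's eye."""
    half = bit_time_ps / 2
    return [half + skew for skew in skews_ps]


skews = [0.0, 20.0, -15.0]  # assumed skews in picoseconds
# A common 312.5 ps delay loses margin on the skewed bits.
assert margin_with_common_delay(625.0, skews) == [312.5, 292.5, 297.5]
# Per-bit delays restore each sample point to the center of its own eye.
assert per_bit_strobe_delays(625.0, skews) == [312.5, 332.5, 297.5]
```

This sketch addresses only static skew. It does not model the strobe noise or jitter that the text identifies as remaining disadvantages of the scheme.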