Packaged systems incorporating multiple die are receiving growing interest. Multi-die packages use die-to-die links to enable communication between die. A die-to-die link must typically support very large aggregate data bandwidth and favors a parallel bus architecture with a forwarded clock for simpler data retiming at the receiver.
One conventional technique to provide die-to-die clock distribution is to employ a balanced clock tree on both die. A balanced clock tree is designed such that the routed distance from an external clock-input contact to a bit is the same from bit to bit within each die. In other words, the clock insertion delay is the same (or nearly the same) for each bit of the die.
An example of a balanced tree is an “H” tree, in which the metal routes from the external contact to each of the bits form paths that look like a recursive hierarchy of the letter “H,” and the route length from a given bit to the external contact is substantially uniform over all of the bits. In an example conventional system, each bit corresponds to a flip-flop. The clock tree feeds the group of flip-flops, where each flip-flop either transmits a bit of data or captures a bit of data at a clock edge. When the tree is balanced, the clock insertion delays to each of the bits are uniform, and the flip-flops receive the clock edge at the same time. Such a feature may be useful in a die that is intended for use in a multi-die package. In one example, the die has multiple bits that transmit data, and the balanced tree causes the bits to be transmitted at the same time. Such synchronized transmission allows the die to be paired with another die that expects to receive the bits at the same time. Thus, the die can be manufactured with no knowledge at that time as to which die it will be packaged with, assuming those die expect synchronized transmission, or reception, of bits.
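The equal-length property of an “H” tree can be illustrated with a minimal sketch. The function name `h_tree_leaves`, the arm length, and the number of levels are assumptions chosen for illustration and do not come from any particular design. The sketch builds the tree recursively, alternating horizontal and vertical arms, and records the routed length from the root (the external contact) to each leaf (a bit).

```python
def h_tree_leaves(x, y, arm, levels, horizontal=True, path_len=0.0):
    """Return (x, y, path_len) for every leaf of a recursive H-tree.

    x, y      : position of the current branch point
    arm       : length of each of the two arms leaving this point (assumed units)
    levels    : number of branching levels remaining
    path_len  : routed length accumulated from the root to this point
    """
    if levels == 0:
        return [(x, y, path_len)]
    leaves = []
    for sign in (-1, 1):
        nx = x + sign * arm if horizontal else x
        ny = y if horizontal else y + sign * arm
        # The arm length halves after each horizontal + vertical pair,
        # which is what produces the nested "H" shapes.
        next_arm = arm if horizontal else arm / 2
        leaves += h_tree_leaves(nx, ny, next_arm, levels - 1,
                                not horizontal, path_len + arm)
    return leaves


# Hypothetical example: 4 branching levels feed 16 bits.
leaves = h_tree_leaves(0.0, 0.0, arm=4.0, levels=4)
distinct_lengths = {length for _, _, length in leaves}
print(len(leaves), distinct_lengths)
```

Every leaf reports the same routed length, so under the simplifying assumption that delay tracks wire length, every flip-flop would see the clock edge at the same time.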
However, balancing a clock tree may include using longer metal traces for some bits, thereby increasing the total amount of metal, parasitic capacitance to nearby metal, and dynamic power consumption in the clock tree as a whole. There is thus a need in the art for more power-efficient clock distribution in die-to-die clock interfaces.
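The power cost can be made concrete with a rough sketch. It assumes the bits sit in a single row at a fixed pitch, that wire capacitance scales linearly with wire length, and that a clock net dissipates approximately C·V²·f because it charges and discharges once per cycle. The single-trunk route used as a comparison point, the function names, and all numeric values are hypothetical and are not taken from the text above.

```python
def balanced_tree_wire(n_bits, pitch):
    """Total wire length of a balanced binary tree over a row of n_bits.

    Assumes n_bits is a power of two. At branching level j there are 2**j
    arms, each n_bits * pitch / 2**(j + 1) long.
    """
    levels = n_bits.bit_length() - 1
    total = 0.0
    for level in range(1, levels + 1):
        arms = 2 ** level
        arm_len = n_bits * pitch / 2 ** (level + 1)
        total += arms * arm_len
    return total


def trunk_wire(n_bits, pitch):
    """Total wire length of one unbalanced trunk spanning the same row."""
    return (n_bits - 1) * pitch


def clock_dynamic_power(wire_len_um, cap_per_um, vdd, freq_hz):
    """Approximate dynamic power of a clock net: P = C * V**2 * f."""
    return wire_len_um * cap_per_um * vdd ** 2 * freq_hz


# Assumed, illustrative values only.
N_BITS, PITCH_UM = 16, 1.0
CAP_PER_UM, VDD, FREQ_HZ = 0.2e-15, 0.8, 2e9

balanced_len = balanced_tree_wire(N_BITS, PITCH_UM)
trunk_len = trunk_wire(N_BITS, PITCH_UM)
p_balanced = clock_dynamic_power(balanced_len, CAP_PER_UM, VDD, FREQ_HZ)
p_trunk = clock_dynamic_power(trunk_len, CAP_PER_UM, VDD, FREQ_HZ)
print(balanced_len, trunk_len, p_balanced / p_trunk)
```

In this simplified model the balanced tree uses roughly twice the wire of the single trunk, and the wire-related clock power grows in the same proportion. Buffers, coupling to neighboring metal, and the flip-flop clock-pin loads are ignored.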