1. Field of the Invention
The present invention generally relates to on-chip interconnect and more specifically to redundancy to satisfy on-chip interconnect timing.
2. Description of the Related Art
A source-synchronous, complementary metal-oxide-semiconductor (CMOS)-repeater-based interconnect provides a simple, high-performance topology for global on-chip communication fabrics. However, as silicon die sizes increase, the on-chip interconnect may span 10 mm or more, and the communication channels are subject to many sources of timing error, including crosstalk, power-supply-induced jitter (PSIJ), and wire delay variation due to transistor and wire metallization mismatch.
For a 10-mm lower-level metal wire with 130 um width and space, 50% utilization on adjacent layers, and repeater size and spacing optimized for the minimum power-delay product, the 1-σ delay variation due to transistor variation is about 8 ps per transition polarity per wire (at slow process, 0.75 V, and 125 degrees Celsius). If a “lone 1” is transmitted across such a wire, the leading and trailing signal transitions may each exhibit independent timing offsets normally distributed about a mean delay with σ=8 ps. This is equivalent to 1-σ values of 2.3% duty-cycle distortion (DCD) for a 4-Gb/s toggle (or 2-GHz double-data-rate clock) and 5.7 ps skew (i.e., net delay offset of the central point between the two edges). An example on-chip network is composed of one hundred 10-mm channels, each 10 bytes wide and operating at 4 Gb/s per wire (i.e., delivering a total of 4 TB/s over 10 mm). Assuming crosstalk, PSIJ, and random jitter (extrapolated to the bit error rate of interest) amount to 0.44 UI (110 ps), and flip-flop tolerances and clock buffer skews amount to 0.2 UI (50 ps), a statistical timing budget predicts a yield of 0% for the assembly of links comprising the on-chip network due to wire delay mismatch. In other words, with a yield of 0%, no chip including such an on-chip network would function properly at full speed.
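The figures quoted above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: it assumes the two edge offsets are independent and Gaussian, as the text states, and it subtracts the fixed budget terms linearly from one unit interval, which is a simplifying assumption and not necessarily how the statistical timing budget referred to above combines them. All variable names are hypothetical.

```python
import math

# Values taken from the text.
SIGMA_EDGE_PS = 8.0      # 1-sigma delay variation per transition
BIT_RATE_GBPS = 4.0      # per-wire data rate
CHANNELS = 100           # number of 10-mm channels
BYTES_PER_CHANNEL = 10   # channel width in bytes

ui_ps = 1000.0 / BIT_RATE_GBPS      # unit interval: 250 ps at 4 Gb/s
clock_period_ps = 2.0 * ui_ps       # period of the 2-GHz DDR clock: 500 ps

# Pulse-width error is the difference of two independent edge offsets,
# so its standard deviation is sqrt(2) times the per-edge sigma.
sigma_width_ps = math.sqrt(2.0) * SIGMA_EDGE_PS
dcd_percent = 100.0 * sigma_width_ps / clock_period_ps

# Skew is the offset of the midpoint between the two edges, i.e. the
# average of the two offsets, so its sigma is the per-edge sigma / sqrt(2).
sigma_skew_ps = SIGMA_EDGE_PS / math.sqrt(2.0)

# Aggregate bandwidth of the example network.
wires = CHANNELS * BYTES_PER_CHANNEL * 8
total_tbytes_per_s = wires * BIT_RATE_GBPS / 8.0 / 1000.0

# Assumption: a simple linear budget of the fixed terms against 1 UI.
jitter_ui = 0.44         # crosstalk + PSIJ + random jitter
ff_clock_ui = 0.20       # flip-flop tolerances + clock buffer skews
remaining_ps = (1.0 - jitter_ui - ff_clock_ui) * ui_ps

print(f"DCD sigma:  {dcd_percent:.1f} %")
print(f"skew sigma: {sigma_skew_ps:.1f} ps")
print(f"bandwidth:  {total_tbytes_per_s:.1f} TB/s over {wires} wires")
print(f"margin left for wire mismatch (linear sum): {remaining_ps:.0f} ps")
```

Under these assumptions the sketch reproduces the 2.3% DCD, 5.7 ps skew, and 4 TB/s figures, and shows how little of the 250 ps unit interval remains for wire delay mismatch once the other terms are accounted for.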
Crosstalk mitigation methods developed for source-synchronous, CMOS-repeater-based interconnect topologies can limit the resulting timing jitter to about 200 milli-unit intervals (mUI) at aggressive bandwidth densities (e.g., on the order of 30 Tb/s per mm of bus width at the 28-nm process node). Power supply noise on the order of ±7% can significantly modulate the data rate (through modulation of signal propagation velocity), further reducing the effective timing margin by as much as 400 mUI. In such harsh environments, wire delay mismatch can cause chips to fail to operate properly, as explained above regarding the transmission of the “lone 1”, resulting in severe yield loss. The combination of wire delay mismatch, timing jitter, and power supply noise may reduce the effective timing margin to the point that the clock frequency must be lowered for the chip to meet its timing margin constraints and operate properly. In particular, chips may fail when an on-chip source-synchronous, CMOS-repeater-based interconnect serves as the building block for large on-chip networks responsible for moving several terabytes of data per second across large portions of the chip. Failure of even a single signal transmitted on a wire of the interconnect to satisfy the timing requirements will likely result in a functional failure of the chip.
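To illustrate why a single failing wire matters at this scale, the following Python sketch adds up the margin consumed by the two effects quoted above and then shows how per-wire pass probabilities compound across a large network. It is a rough illustration, not the budget method referred to in the text: the jitter terms are summed linearly, wires are assumed to pass or fail independently, the per-wire pass probabilities are hypothetical, and the wire count of 8000 is derived from the example network of one hundred 10-byte-wide channels. The function names are likewise hypothetical.

```python
def remaining_margin_mui(consumers_mui):
    """Margin left out of 1000 mUI after a linear sum of the consumers."""
    return 1000 - sum(consumers_mui)


def chip_yield(p_wire, n_wires):
    """Probability that every wire passes, assuming independent wires."""
    return p_wire ** n_wires


# 200 mUI of crosstalk jitter and up to 400 mUI of supply-induced jitter.
margin = remaining_margin_mui([200, 400])
print(f"margin left for all other effects: {margin} mUI")

# 100 channels x 10 bytes x 8 bits (derived from the example network).
N_WIRES = 8000

# Hypothetical per-wire pass probabilities, chosen only for illustration.
for p_wire in (0.999, 0.9999, 0.99999):
    print(f"p_wire={p_wire}: chip yield = {chip_yield(p_wire, N_WIRES):.4f}")
```

Because the chip functions only if every wire meets timing, even a per-wire pass probability very close to 1 translates into a noticeably lower yield for the whole network.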
Accordingly, what is needed in the art is an improved technique for satisfying timing requirements of on-chip source-synchronous, CMOS-repeater-based interconnect.