Resonant clock distribution networks have recently been proposed for the energy-efficient distribution of clock signals in synchronous digital systems. In these networks, energy-efficient operation is achieved using one or more inductors to resonate the parasitic capacitance of the clock distribution network. Clock distribution with extremely low jitter is achieved through reduction in the number of clock buffers. Moreover, extremely low skew is achieved among the distributed clock signals through the design of relatively symmetric all-metal distribution networks. Overall network performance depends on operating speed and total network inductance, resistance, size, and topology, with lower-resistance symmetric networks resulting in lower jitter, skew, and energy consumption when designed with adequate inductance.
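As a rough illustration of what resonating the parasitic capacitance involves, the following sketch applies the standard lumped LC relation f = 1/(2π√(LC)) to estimate the inductance that would resonate a given clock capacitance at a target frequency. The lumped model, the function names, and the numeric values are assumptions chosen for illustration and are not taken from any particular design.

```python
import math

def resonant_inductance(freq_hz, cap_farads):
    """Inductance (henries) that resonates with cap_farads at freq_hz (lumped LC model)."""
    return 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * cap_farads)

def resonant_frequency(ind_henries, cap_farads):
    """Natural frequency (hertz) of an ideal lumped LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(ind_henries * cap_farads))

# Hypothetical values: 1 GHz clock, 100 pF of parasitic clock capacitance.
target_hz = 1e9
clock_cap = 100e-12
inductance = resonant_inductance(target_hz, clock_cap)
print(f"required inductance: {inductance * 1e9:.3f} nH")
print(f"check frequency: {resonant_frequency(inductance, clock_cap) / 1e9:.3f} GHz")
```

A real network also has resistance and distributed capacitance, so a value obtained this way would serve only as a starting estimate.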
Without the inductive elements of resonant clock distribution networks, conventional (i.e., non-resonant) clock distribution networks rely almost exclusively on collections of buffers for distributing a reference clock signal to the multiple clocked elements, such as flip-flops and clock gaters, of a semiconductor device. In conventional clock distribution networks, the buffers are generally arranged in a topology that allows the reference clock signal to be supplied at a single root-point of the network and then propagated throughout the device through a sequence of buffer elements. The total propagation delays of the buffers along any given path from the root to some clocked element are generally balanced in some fashion, so that, for example, the clock signal arrives at all the various elements at approximately the same time. The propagation delays of individual buffers depend on a variety of factors, including the sizes of the transistors used to implement the buffers, the capacitive loads associated with the wiring used to interconnect the different buffers in the network, the temperature and voltage at which the buffers are operated, and the specific characteristics of the various device materials that are actually realized during the manufacturing process.
The total propagation delay of the buffers along any given path from the root to some clocked element is also referred to as the insertion delay of the path, and the insertion delay profile of the overall clock network is one of the network's most important characteristics. The worst-case difference between the insertion delays of any two clocked elements in a semiconductor design is referred to as the clock skew of the design. In general, increased clock skew is a hindrance to overall device performance, as large skews imply that new outputs of clocked elements may become available later than anticipated, and inputs to clocked elements may be needed earlier than anticipated, leading to an overall reduction in the amount of time that is available for the operation of the digital logic during each clock period.
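The two definitions above can be sketched in a few lines: the insertion delay of a path is the sum of its buffer delays, and the clock skew is the largest difference between any two insertion delays. The element names and the per-buffer delays below are hypothetical values chosen only to make the arithmetic visible.

```python
def insertion_delay(buffer_delays_ps):
    """Insertion delay of one root-to-element path: sum of its buffer delays."""
    return sum(buffer_delays_ps)

def clock_skew(paths):
    """Worst-case difference between the insertion delays of any two clocked elements."""
    delays = [insertion_delay(p) for p in paths.values()]
    return max(delays) - min(delays)

# Hypothetical paths: clocked element -> buffer delays (ps) from the root.
paths = {
    "ff_a": [40.0, 35.0, 30.0],
    "ff_b": [40.0, 38.0, 31.0],
    "gater_c": [40.0, 33.0, 29.0],
}

for name, delays in paths.items():
    print(f"{name}: insertion delay {insertion_delay(delays):.1f} ps")
print(f"clock skew: {clock_skew(paths):.1f} ps")
```

In this example the three insertion delays are 105, 109, and 102 ps, so the skew is 7 ps.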
As previously alluded to, variations in manufacturing parameters or operating conditions affect buffer propagation delays, and hence the insertion delays of both individual paths and the overall clock distribution network. For example, process variations during manufacturing can result in faster or slower transistor switches, thus resulting in shorter or longer insertion delays, respectively. Furthermore, variations in the supply voltage or temperature during operation can affect insertion delays. To exacerbate the situation, these variations are “dynamic” in the sense that even a specific sample of a device will, in the field, be subject to voltages and temperatures that vary from one instant to the next. These dynamic variations increase delay uncertainty, and consequently reduce the level of performance that a device can be guaranteed to achieve under all anticipated operating conditions. In general, the magnitude of insertion delay variations is proportional to their target values. Therefore, clock distribution networks with relatively long insertion delays tend to have wider variations in their insertion delays than clock distribution networks with relatively short insertion delays.
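The proportionality noted above can be made concrete with a small sketch. It assumes, purely for illustration, that every insertion delay varies by the same relative amount around its target value; the 10% figure and the two target delays are hypothetical.

```python
def delay_spread_ps(nominal_ps, relative_variation):
    """Return (low, high, width) of an insertion delay under a +/- relative variation."""
    low = nominal_ps * (1.0 - relative_variation)
    high = nominal_ps * (1.0 + relative_variation)
    return low, high, high - low

# Hypothetical targets: a short and a long insertion delay, both varying by +/-10%.
for label, nominal in [("short", 30.0), ("long", 300.0)]:
    low, high, width = delay_spread_ps(nominal, 0.10)
    print(f"{label}: {low:.1f} to {high:.1f} ps (width {width:.1f} ps)")
```

Under this assumption, a tenfold longer insertion delay yields a tenfold wider absolute spread, even though the relative variation is identical.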
In resonant clock distribution networks, insertion delays are typically on the order of a few tens of picoseconds, as these networks tend to have very low resistance and tend to include only a few buffers. By contrast, conventional clock distribution networks typically include a large number of buffers and can exhibit insertion delays on the order of hundreds of picoseconds. Consequently, in the presence of variations in process parameters, voltage, and temperature, conventional clock distribution networks tend to have a relatively larger variation in insertion delay than resonant clock networks.
When resonant and conventional clock distribution networks are used in the same design, the difference in the insertion delays of the two networks can result in relatively large clock skews that can be detrimental to overall device performance. Typically, in such a design, it is possible to use automatic delay tuning blocks to compensate for the difference in the insertion delays of the two clock distribution networks, but due to the increased variability of advanced manufacturing processes, the range of insertion-delay mismatches can be significant, even to the point of being comparable to the longest insertion delays in the conventional clock distribution network. The overheads of automatic delay tuning blocks with such large tuning ranges can thus be significant, and even the design of a delay tuning block with such a large tuning range can be particularly challenging.
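To give a feel for the tuning range discussed above, the following sketch estimates the range of insertion-delay mismatch between a conventional and a resonant network. It assumes that both networks vary independently within the same relative bound, which is a worst-case simplification, and it uses hypothetical nominal delays in line with the orders of magnitude mentioned earlier.

```python
def mismatch_range_ps(conv_nominal_ps, res_nominal_ps, relative_variation):
    """Smallest and largest (conventional - resonant) insertion-delay mismatch,
    assuming each network varies independently by +/- relative_variation."""
    conv_low = conv_nominal_ps * (1.0 - relative_variation)
    conv_high = conv_nominal_ps * (1.0 + relative_variation)
    res_low = res_nominal_ps * (1.0 - relative_variation)
    res_high = res_nominal_ps * (1.0 + relative_variation)
    return conv_low - res_high, conv_high - res_low

# Hypothetical values: 300 ps conventional, 30 ps resonant, +/-20% variation.
low, high = mismatch_range_ps(300.0, 30.0, 0.20)
tuning_range = high - low
print(f"mismatch: {low:.1f} to {high:.1f} ps")
print(f"tuning range a delay block would need to cover: {tuning_range:.1f} ps")
```

With these numbers, a delay tuning block would have to cover a span of roughly 132 ps, which is a substantial fraction of the 300 ps nominal conventional insertion delay.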
Another challenge with designs that include resonant and conventional clock distribution networks is the rate of variation in the clock skew between the two networks in the presence of dynamic variations during operation. Such variations may affect insertion delay within a clock cycle of operation. Moreover, their impact may vary significantly from cycle to cycle. Automatic delay tuning blocks are typically unable to react to such quick changes. Therefore, if the changes in the insertion delay of the resonant clock network do not track the changes in the insertion delay of the conventional network, this difference is manifested as additional delay uncertainty that has a detrimental impact on overall device performance.
Architectures for resonant clock distribution networks have been described and empirically evaluated in several articles, including: “A 225 MHz Resonant Clocked ASIC Chip,” by Ziesler C., et al., International Symposium on Low-Power Electronic Design, August 2003; “Energy Recovery Clocking Scheme and Flip-Flops for Ultra Low-Energy Applications,” by Cooke, M., et al., International Symposium on Low-Power Electronic Design, August 2003; “Resonant Clocking Using Distributed Parasitic Capacitance,” by Drake, A., et al., Journal of Solid-State Circuits, Vol. 39, No. 9, September 2004; “A 1.1 GHz Charge Recovery Logic,” by Sathe V., et al., International Solid-State Circuits Conference, February 2006; “900 MHz to 1.2 GHz two-phase resonant clock network with programmable driver and loading,” by Chueh J.-Y., et al., IEEE 2006 Custom Integrated Circuits Conference, September 2006; “A 0.8-1.2 GHz frequency tunable single-phase resonant-clocked FIR filter,” by Sathe V., et al., IEEE 2007 Custom Integrated Circuits Conference, September 2007; and “A Resonant Global Clock Distribution for the Cell Broadband Engine Processor,” by Chan S., et al., IEEE Journal of Solid State Circuits, Vol. 44, No. 1, January 2009. None of these articles describes any methods for using resonant and conventional clock distribution networks in the same design.
A design with resonant and conventional clock distribution networks was described in “A Resonant-Clock 200 MHz ARM926EJ-S Microcontroller,” by Ishii A., et al., European Solid-State Circuits Conference, September 2009. The design in that article used a programmable delay block to adjust the insertion delay of the reference clock that drives the resonant clock driver. That delay block was programmed by control signals external to the chip. Therefore, in that design, the resonant clock network was not capable of tracking the conventional clock distribution network in the presence of dynamic variations.
Methods for controlling the skew between a resonant clock network and a second clock network are described in US Pat. Appl. No. 20080150605 by Chueh J.-Y., et al. Those approaches rely on the use of digitally-controlled delay blocks to automatically adjust the delays of the reference clocks by monitoring the skew between clock signals in the two clock networks. This monitoring is performed over time using an integration function. It is thus unsuitable for providing quick adjustments on a cycle-by-cycle basis.
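The limitation of integration-based monitoring can be illustrated with a simple discrete-time model. The sketch below is a generic first-order integrating controller, not the circuit of the cited application; the gain, the static offset, and the alternating cycle-to-cycle component are all hypothetical values chosen to show the effect.

```python
def simulate_integrating_tuner(mismatch_ps, gain=0.05):
    """Per-cycle residual skew when a correction is updated by integrating the error.

    mismatch_ps: uncorrected skew between the two networks at each cycle.
    gain: fraction of the observed error added to the correction per cycle (assumed).
    """
    correction = 0.0
    residual = []
    for mismatch in mismatch_ps:
        error = mismatch - correction   # skew that remains during this cycle
        residual.append(error)
        correction += gain * error      # slow, integrated adjustment
    return residual

# Hypothetical mismatch: 50 ps static offset plus a +/-10 ps component
# that flips sign on every cycle.
static_ps, dynamic_ps = 50.0, 10.0
mismatch = [static_ps + (dynamic_ps if i % 2 == 0 else -dynamic_ps) for i in range(400)]

tail = simulate_integrating_tuner(mismatch)[-100:]   # behavior after settling
print(f"mean residual: {sum(tail) / len(tail):.3f} ps")
print(f"peak-to-peak residual: {max(tail) - min(tail):.3f} ps")
```

In this model the integrator removes the static offset, so the mean residual settles near zero, but the cycle-to-cycle component passes through essentially unattenuated and remains as delay uncertainty.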
Overall, the examples herein of some prior or related systems and their associated limitations are intended to be illustrative and not exclusive. Other limitations of existing or prior systems will become apparent to those of skill in the art upon reading the following Detailed Description.