The field of semiconductor memory devices is enormously active and rapidly developing. Various categories and sub-categories of semiconductor devices are known and commercially available. The ever-increasing popularity and ubiquity of computers and computer-based devices, both in the consumer and industrial realms, is such that the demand for semiconductor memory devices of a variety of different types will continue to grow for the foreseeable future.
One of the more common categories of semiconductor memory devices used today is the dynamic random access memory, or DRAM. Among the desirable characteristics of any DRAM are a high storage capacity per unit of semiconductor die area, fast access speeds, low power consumption, and low cost.
One approach that has been used to optimize the desirable properties of DRAM has been to design such devices to be accessed synchronously. A synchronous DRAM typically requires an externally-applied clocking signal, as well as other externally-applied control signals whose timing must bear certain predetermined relationships with the clock signal. Likewise, digital data is read from and written to a synchronous memory device in a synchronous relationship to the externally-applied clock signal. Synchronous DRAM technologies have been under development for many years, and synchronous DRAM (frequently referred to as “SDRAM”) is used in a broad spectrum of commercial and industrial applications, including the personal computer industry.
In typical implementations, the external clock signal CLK for a clocked semiconductor device (memory device or otherwise) comprises a simple, periodic “square” wave, oscillating with reasonably uniform periodicity between a logical high voltage level (for example, 3.3V) and a logical low level (typically 0V) with a duty cycle of 50% (meaning that the signal is at a logical “high” level the same amount of time that it is at a logical “low” level during each complete clock cycle). In present state-of-the-art semiconductor devices, the clock signal may have a frequency on the order of hundreds of megahertz.
A synchronous (clocked) semiconductor device typically requires an external clock signal to be applied to a clock (CLK) input. In the prior art, it is most commonly the case that the inputted CLK signal is provided to an internal, central clock generator circuit, which in turn generates one or more internal clock signals that are routed to various functional blocks located at various locations on the semiconductor substrate.
One consideration that must be addressed relative to the routing of clock signals throughout a synchronous semiconductor device is the well-known problem of clock skew. See, for example, K. Yip, “Clock Tree Distribution: Balance is Essential for a Deep Sub-Micron ASIC Design to Flourish,” IEEE Potentials, vol. 16, no. 2, pp. 11–14, April–May 1997. Those of ordinary skill in the art will understand that the issue of clock skew relates to the inevitable variations in the arrival time of clock signals at different locations on a semiconductor substrate, where the master clock signal is generated at a single location on the part. Such variations arise due to such factors as propagation delays, variations in device switching speeds, and the like. Clock skew is a function of two main parameters: the loading presented to the logic being clocked and the RC (resistive-capacitive) delay of the clock-line interconnect. Interconnect factors that affect clock skew include the resistance, capacitance, and inductance of the interconnecting conductors (“wires”), which typically comprise metal or otherwise conductive traces formed as part of the semiconductor fabrication process.
Conductive wires in an integrated circuit (IC) are not ideal conductors, and differing lengths of wires in an IC can result in different delays in the propagation of clock signals throughout the IC. As is widely recognized, clock skew adds to the effective clock cycle time for a given semiconductor device, and hence adversely affects the performance of the device.
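By way of illustration only, the dependence of propagation delay on wire length can be sketched with the Elmore approximation for a uniform distributed RC line, in which delay is roughly 0.5·r·c·L². The per-millimeter resistance and capacitance values below, and the function names, are assumptions chosen for the example and are not taken from any particular process or device.

```python
def wire_delay_s(length_mm, r_ohm_per_mm=100.0, c_farad_per_mm=0.2e-12):
    """Elmore delay of a uniform distributed RC line: 0.5 * r * c * L^2.

    r_ohm_per_mm and c_farad_per_mm are assumed, illustrative values.
    """
    return 0.5 * r_ohm_per_mm * c_farad_per_mm * length_mm ** 2


def skew_s(length_a_mm, length_b_mm):
    """Skew between two clock wires that differ only in length."""
    return abs(wire_delay_s(length_a_mm) - wire_delay_s(length_b_mm))


if __name__ == "__main__":
    for length in (2.0, 5.0):
        print(f"{length} mm wire: {wire_delay_s(length) * 1e12:.0f} ps")
    print(f"skew: {skew_s(2.0, 5.0) * 1e12:.0f} ps")
```

Under these assumed values, a 2 mm wire yields a delay of about 40 ps and a 5 mm wire about 250 ps, so the two endpoints would see roughly 210 ps of skew from wire length alone. Because the delay grows with the square of the length, modest differences in routing can produce disproportionate skew.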
For circuit designers, a typical rule-of-thumb regarding clock skew is that it should be limited to ten percent or less of a chip's clock cycle, meaning that for a 100 MHz clock, skew must be 1 nsec or less for each clock signal in the device. High-performance semiconductor devices may require that the clock skew be limited to 5% or less of the clock cycle; for a 500 MHz clock this would require skew to be 100 picoseconds or less.
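The arithmetic behind these figures is simply the allowed fraction multiplied by the clock period. A minimal sketch follows; the function name is hypothetical and is introduced only for this example.

```python
def skew_budget_s(clock_hz, fraction):
    """Maximum tolerable skew: a fraction of one clock period (1 / f)."""
    return fraction / clock_hz


if __name__ == "__main__":
    # 10% of a 100 MHz clock period (10 ns)
    print(f"{skew_budget_s(100e6, 0.10) * 1e9:.1f} ns")
    # 5% of a 500 MHz clock period (2 ns)
    print(f"{skew_budget_s(500e6, 0.05) * 1e12:.0f} ps")
```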
As noted, prior art designs have concentrated on having one central clock generator from which clock signals are distributed to functional blocks throughout the devices. Different strategies have been employed for clock signal distribution to avoid adverse consequences of clock skew. Perhaps the most common approach is to lay out the device such that all clock connections are symmetrical and of the same length. An example of this is the “H-tree” clock distribution scheme shown in FIG. 1a, wherein the conductive trace 12 that carries a clock signal is shown on a semiconductor substrate 10. An H-tree clock distribution strategy is used mostly in custom layouts and can further involve varying tree interconnect segment widths to balance skew throughout the IC.
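The equal-length property of an H-tree can be illustrated with a short sketch that generates an idealized tree recursively and records the routed distance from the root to every leaf. The geometry (alternating horizontal and vertical branches, with the segment length halved after each vertical branch) and all names are assumptions for illustration; the sketch does not reproduce the specific layout of FIG. 1a.

```python
def h_tree_leaves(x, y, seg, levels, horizontal=True, path=0.0):
    """Return [(x, y, path_length)] for every leaf of an idealized H-tree.

    Each level branches in two opposite directions, alternating between
    horizontal and vertical; seg is the current branch length.
    """
    if levels == 0:
        return [(x, y, path)]
    leaves = []
    for sign in (-1, 1):
        nx = x + sign * seg if horizontal else x
        ny = y if horizontal else y + sign * seg
        # Halve the branch length after each vertical step.
        next_seg = seg if horizontal else seg / 2
        leaves += h_tree_leaves(nx, ny, next_seg, levels - 1,
                                not horizontal, path + seg)
    return leaves


if __name__ == "__main__":
    leaves = h_tree_leaves(0.0, 0.0, seg=4.0, levels=4)
    lengths = {path for _, _, path in leaves}
    print(len(leaves), "leaves; distinct path lengths:", lengths)
```

With four branching levels and an initial segment of 4 units, the sketch yields 16 leaves on a regular grid, each 12 units of wire from the root. In this idealized model the wire-length contribution to skew is therefore identical at every leaf; loading mismatch and process variation are not modeled.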
Another known clock distribution strategy is the grid layout, such as the clock signal conductor 14 shown on substrate 10 in FIG. 1b. A clock grid is perhaps the simplest clock distribution scheme and has the advantage of being relatively easy to design for low skew. However, as those of ordinary skill in the art will appreciate, a clock grid is highly inefficient in terms of occupation of area on the substrate 10 and very “power hungry” due to the large number of clock interconnects required. Nevertheless, some manufacturers do use this approach, in particular, for microprocessors.
For high-performance semiconductor devices, a balanced tree distribution network 16 is often employed. For a balanced tree without buffers, the clock line capacitance increases exponentially starting from the leaf cell (i.e., a clocked element, an exemplary one being designated with reference numeral 18 in FIG. 1c) and moving toward the clock input at the root 20 of the tree. This extra capacitance results from the increasingly wider metal traces needed to carry current to the branching segments. The extra metal required further results in additional chip area to accommodate the extra clock-line width.
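The growth in capacitance toward the root can be sketched under a deliberately simple assumption: each segment is made a fixed factor wider than the segments it feeds, and segment capacitance scales with width. The widening factor of 2, the 10 fF leaf-segment capacitance, and the function name are all hypothetical values chosen for the example.

```python
def segment_capacitance(levels, leaf_cap_f=10e-15, widening=2.0):
    """Per-segment capacitance from the leaf level (index 0) up to the root.

    Assumes capacitance is proportional to trace width and that each level
    toward the root is `widening` times wider than the level below it.
    """
    return [leaf_cap_f * widening ** k for k in range(levels)]


if __name__ == "__main__":
    for level, cap in enumerate(segment_capacitance(5)):
        print(f"level {level}: {cap * 1e15:.0f} fF")
```

Under these assumptions the capacitance of a segment doubles at each level toward the root, which is the exponential growth described above. Real trees also vary segment length by level, which this sketch ignores.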
Buffers may be added at the branching points of a balanced tree clock network, and this can have the effect of significantly reducing clock interconnect capacitance, since it reduces clock-line width toward the root. An example of a tree clock network 22 with buffers 24 is shown in FIG. 1d. However, the buffers 24 undesirably occupy additional area on the substrate and increase circuit complexity.
Those of ordinary skill in the art will appreciate that factors that contribute to clock skew include loading mismatch at the clocked elements, mismatch in RC delay due to variations in segment width and segment length among the clock line segments, and process variations induced during chip fabrication. Inductance effects start to appear as clock-edge times and interconnect resistance decrease, both of which occur more often with shrinking chip technology and higher clock rates. Clock trees often require wide traces at their roots and may also have long segments, making the trees more susceptible to inductance problems than other clock network schemes.
Careful layout, including placing power and ground lines next to, above, or below clock trees to act as shields, can help reduce the possibility of clock problems caused by inductance. Many designers and parasitic extraction/evaluation tools presently available address only RC parasitic effects. IC designers have heretofore not commonly considered parasitic inductances, although they are more frequently considered as clock frequencies on state-of-the-art semiconductors approach (or exceed) 1 GHz.
A different approach to addressing the problem of clock skew is referred to as “delay-locked loop” or “DLL.” Various examples of DLL implementations for synchronous memory devices are proposed in U.S. Pat. No. 5,920,518 to Harrison et al., entitled “Synchronous Clock Generator Including Delay-Locked Loop;” U.S. Pat. No. 6,201,424B1 to Harrison, entitled “Synchronous Clock Generator including a Delay-Locked Loop Signal-Loss Detector;” and U.S. Pat. No. 6,130,856 to McLaury, entitled “Method and Apparatus for Multiple Latency Synchronous Dynamic Random Access Memory.” The aforementioned '518, '424, and '856 patents are each commonly assigned to the Assignee of the present invention and each is hereby incorporated by reference herein in its entirety.
The function of a DLL circuit in a semiconductor device is to adjust the relative timing of clock signals provided to functional elements disposed at various locations on a semiconductor die such that overall synchronous operation of the device can be achieved. DLL implementations may utilize some type of loop-back operation whereby the DLL circuit is provided with feedback for comparing the timing of clock signals provided on various lines and provided to various functional elements of the device. As a result of the functionality of a typical DLL circuit, if the propagation and loading characteristics of one clock signal transmission line vary significantly from others, the DLL circuit can account for such differences in order to ensure that proper device operation can be maintained. Separate delays and skews (programmable or automatically adjusted) may be introduced into the externally-applied clock signal to ensure that each of the functional blocks in the device receives a clock signal that is substantially synchronized with the others. Such delays and skews may be minuscule, on the order of picoseconds, but may be nonetheless critical to proper operation of a semiconductor device.
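The feedback principle can be illustrated with a toy behavioral model, offered only as a sketch and not as a representation of the circuits in the patents cited above. The model assumes a single adjustable delay line in series with a fixed distribution-path delay, and a phase detector that reports how far the fed-back clock edge lands past a reference edge. The loop adds delay in small steps until the total delay equals a whole number of clock periods. All names, the step size, and the integer-picosecond time base are assumptions.

```python
def lock_dll(period_ps, path_delay_ps, step_ps=10, max_iters=1000):
    """Return the delay-line setting (ps) that aligns the fed-back clock.

    Lock is reached when (delay line + path delay) is a whole multiple of
    the clock period, i.e. the phase error is zero.
    """
    delay_ps = 0
    for _ in range(max_iters):
        # Phase error: how far the fed-back edge lands past a reference edge.
        error_ps = (delay_ps + path_delay_ps) % period_ps
        if error_ps == 0:
            return delay_ps
        # Add delay, without overshooting the next reference edge.
        delay_ps += min(step_ps, period_ps - error_ps)
    raise RuntimeError("DLL model failed to lock within max_iters")


if __name__ == "__main__":
    setting = lock_dll(period_ps=2000, path_delay_ps=730)
    print(f"delay line locked at {setting} ps")
```

For an assumed 2000 ps period and 730 ps path delay, the model settles at a delay-line setting of 1270 ps, so that the clock arriving at the functional block is aligned with the reference one full period later. A practical DLL would use an analog or digitally controlled delay line and would track continuously rather than stop after the first lock.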
Another approach to addressing the problem of clock skew involves using low-impedance lines with matched terminations and current mode signaling to achieve well-defined, and hence more readily compensated-for, delays. See, for example, T. Knight et al., “Method for Skew-Free Distribution of Digital Signals Using Matched Variable Delay Lines,” Symposium on VLSI Circuits, Kyoto, Japan, May 19–21, pp. 19–20 (1993); see also, S. I. Liu et al., “Low Power Clock Deskew Buffer for High Speed Digital Circuits,” IEEE Journal of Solid-State Circuits, v. 34, no. 4, pp. 554–558 (1999).
Despite the various approaches proposed in the prior art, clock skew remains an ongoing challenge to integrated circuit designers, and it is believed that it is becoming increasingly important to address the issue of clock skew as device geometries shrink and system clock speeds rise.