One issue facing memory device designers today is distributing a clock signal throughout the integrated memory circuit die while maintaining minimal clock skew. Clock signals are used to control the times at which component operations occur in a digital circuit. Clock skew is the time difference between clock signal edges arriving at different areas (e.g., different components) of an integrated circuit. Minimizing clock skew is important because digital logic circuits within memory devices require precise clocking for proper operation of the entire device (e.g., the outputting of data). Ideally, synchronous memory devices should have clock signals arriving simultaneously at all circuit components within the device that operate on the same clock period and same clock edge. In practice, the delay through a clock signal path should not be more than the interval between one of the edges of the clock signal and a following edge of the clock signal. If there is a substantial amount of clock skew within a device, then, depending on the frequency of the clock signal, some components may not receive an edge of the clock signal before other components receive a subsequent edge. This prevents some components from operating at designated times relative to other components, and can cause the device to malfunction. As such, clock skew may limit the maximum clock frequency at which a device may operate because the device needs to be designed to accommodate worst case clock skew and still operate properly.
One of the causes of clock skew within an integrated circuit is that the impedance, or resistance-capacitance (RC), of the traces that route the clock signal to different areas of the device generates a delay in the clock signal. Other contributors to clock skew are delays due to passing the clock signal through pads and input buffers and the loading of the various registers that are driven by the clock signal, as illustrated in FIG. 1A. The total clock signal input to data output (Dout) delay (TCD) for the exemplary path illustrated in FIG. 1A is equal to the input buffer delay + the clock (CLK) buffer delay + the RC delay of the metal trace + the output register delay + the output buffer delay. Some exemplary values of the various delays may be 0.5 nanoseconds (ns) for the input buffer delay, 0.5 ns for the clock buffer delay, 1.5 ns for the trace delay, 0.5 ns for the output register delay, and 1.5 ns for the output buffer delay, resulting in a TCD of 4.5 ns.
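The TCD sum for the exemplary path can be checked with a short calculation. The following is a minimal sketch in Python that uses only the exemplary delay values given above; the dictionary keys and the function name `total_clock_to_dout_delay` are illustrative assumptions and are not part of the described device.

```python
# Exemplary per-stage delays in nanoseconds, taken from the values above.
# The key names are illustrative labels only (assumptions).
PATH_DELAYS_NS = {
    "input_buffer": 0.5,
    "clock_buffer": 0.5,
    "trace_rc": 1.5,
    "output_register": 0.5,
    "output_buffer": 1.5,
}


def total_clock_to_dout_delay(delays_ns):
    """Return TCD as the sum of all stage delays along the clock path."""
    return sum(delays_ns.values())


tcd_ns = total_clock_to_dout_delay(PATH_DELAYS_NS)
print(f"TCD = {tcd_ns} ns")  # prints "TCD = 4.5 ns"
```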
The Dout of the memory integrated circuit may be provided to other components (e.g., component B) that are connected with the memory integrated circuit on one or more printed circuit boards (PCB), as illustrated in FIG. 1B. The memory integrated circuit and component B may both be timed to operate based on the same clock signal. However, components typically require a setup time (Tsu) in order to operate properly. The setup time is the minimum time that Dout must be applied at the input of component B before component B is triggered by the clock signal to perform a designated operation. An exemplary setup time may be approximately 2 ns. As such, component B would need to receive Dout from the memory circuit at least 2 ns before a subsequent clock edge of the clock signal that triggers component B to perform its operation. Continuing the example above, if the memory integrated circuit has a TCD of 4.5 ns (e.g., in a read operation of the memory array), then Dout would be provided to component B after approximately 4.5 ns. If the clock signal has a frequency of, for example, 100 MHz (i.e., the time period between clock signal edges is 10 ns), then there would be sufficient time (10 ns − 4.5 ns = 5.5 ns) to allow for the setup (2 ns) of component B with the clock skew of 4.5 ns. However, if the frequency of the clock signal is increased to 200 MHz, as illustrated in FIG. 1C, then the time period between triggering clock edges is 5 ns. With a TCD of 4.5 ns, only 0.5 ns would remain before the next edge, so Dout would not be provided to component B in enough time (short by a delta of 1.5 ns) to allow for the 2 ns setup time for component B before a subsequent edge of the 200 MHz clock signal triggers component B to operate using Dout. Thus, a later clock edge would have to be used to clock component B. This will decrease the overall throughput of the system incorporating the memory integrated circuit and component B. With devices operating at increasing frequencies, clock skew poses an increasing problem.
Moreover, as the level of integration in a memory device increases, clock skew due to the above noted contributing factors becomes even greater.
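The timing budget in the 100 MHz and 200 MHz examples above can be expressed as a simple margin check. The following Python sketch is illustrative only; the function name `setup_margin_ns` and its parameters are assumptions introduced here, and the numeric inputs are the exemplary TCD (4.5 ns) and setup time (2 ns) from the text.

```python
def setup_margin_ns(clock_mhz, tcd_ns, tsu_ns):
    """Return the slack in ns: clock period minus TCD minus setup time.

    A negative result means Dout arrives too late to meet the setup time
    of the receiving component before the next triggering clock edge.
    """
    period_ns = 1000.0 / clock_mhz  # e.g. 100 MHz -> 10 ns, 200 MHz -> 5 ns
    return period_ns - tcd_ns - tsu_ns


for freq_mhz in (100, 200):
    margin = setup_margin_ns(freq_mhz, tcd_ns=4.5, tsu_ns=2.0)
    status = "meets setup" if margin >= 0 else "misses setup"
    print(f"{freq_mhz} MHz: margin = {margin:+.1f} ns ({status})")
```

At 100 MHz the margin is positive (3.5 ns beyond the 2 ns setup time), while at 200 MHz it is negative by the 1.5 ns shortfall noted above.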
FIG. 2A illustrates prior art solutions to clock skew in synchronous random access memory (SRAM) devices. One solution is to utilize a phase locked loop (PLL) to generate an internal clock signal that is synchronized with the original reference clock signal and then use the internal clock signal to drive output registers of the SRAM. A typical PLL contains a voltage controlled oscillator (VCO) to generate the internal clock signal having a fixed amount of delay with respect to the reference clock signal. The PLL also contains a phase detector to measure the phase difference between the reference clock signal and the internal clock signal. The measured difference drives a charge pump to raise and lower the voltage level of a loop filter. The loop filter provides a stable voltage input to the VCO. Because the frequency of the reference clock signal may vary over time, the internal clock signal is fed back to the phase detector, and the measured differences are used to lock the frequency of the internal clock signal to the reference clock signal. If, for example, the frequency of the reference clock signal shifts slightly, the phase difference between the VCO signal and the reference clock signal will begin to increase with time. This changes the control voltage on the VCO in such a way as to bring the VCO frequency of the internal clock signal back to the value of the reference clock signal. Thus, the loop maintains lock when the reference clock signal frequency varies.
The resulting internal clock signal is phase shifted from the reference clock signal such that both clock signals have the same frequency but the triggering edges of the internal clock signal are delayed with respect to the triggering edges of the reference clock signal. The internal clock signal is generated during power-up of the SRAM before any circuit operations are performed. As such, although the internal clock signal is delayed from the reference clock signal, the output registers may trigger off of a later clock edge of the free running internal clock signal that exists earlier in time than the skewed reference clock signal edge, as illustrated in FIG. 2B. In this manner, the clock signal input to data output delay associated with the path illustrated in FIG. 1A is reduced.
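The feedback behavior of the PLL described above (phase detector, charge pump and loop filter, VCO) can be illustrated with a linearized, phase-domain model. The following Python sketch is not a circuit-level description of any particular PLL: it approximates the charge pump and loop filter as a proportional-integral stage, ignores phase wrapping, and uses hypothetical gains `kp` and `ki`, a hypothetical time step, and hypothetical frequencies chosen only for illustration.

```python
import math


def simulate_pll(f_ref_mhz, f_free_mhz, kp=1.0, ki=0.5, dt_us=0.01, steps=5000):
    """Linearized phase-domain PLL model (illustrative assumptions only).

    f_ref_mhz:  reference clock frequency
    f_free_mhz: free-running VCO frequency with zero control input
    kp, ki:     hypothetical proportional and integral loop gains
    Returns the VCO frequency in MHz after `steps` time steps.
    """
    phase_error = 0.0  # reference phase minus VCO phase, in radians
    integrator = 0.0   # loop-filter state (stands in for the filter voltage)
    f_vco = f_free_mhz
    for _ in range(steps):
        # Phase detector: a frequency offset makes the phase difference grow.
        phase_error += 2.0 * math.pi * (f_ref_mhz - f_vco) * dt_us
        # Charge pump and loop filter, approximated as an integrator.
        integrator += ki * phase_error * dt_us
        # VCO: the control value pulls the frequency toward the reference.
        f_vco = f_free_mhz + kp * phase_error + integrator
    return f_vco


# The loop settles at the reference frequency, including after a small shift.
print(round(simulate_pll(100.0, 99.5), 3))
print(round(simulate_pll(100.2, 99.5), 3))
```

In this model a persistent frequency offset makes the phase error grow, which raises the control value until the VCO frequency matches the reference, mirroring the lock behavior described above.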
For high speed designs running at clock frequencies greater than, for example, 200 megahertz (MHz) (cycle time of 5 ns), the TCD parameter may need to be very small (e.g., on the order of 1-2 ns). One solution for minimizing this parameter is to use a delay locked loop (DLL) to synchronize an internally generated clock signal with the reference clock signal and use the internal clock signal to drive output registers of the SRAM. A typical DLL includes a phase detector that measures the phase difference between the reference clock signal and the internally generated clock signal. The phase detector drives a shift register that causes stored data to shift positions based on the difference in signals. The shift register is coupled to a delay line to produce a phase-adjusted clock signal by sequentially delaying the internal clock signal according to the shift register data. The internal clock signal is fed back to the phase detector for comparison with the reference clock signal. As with the PLL, when the reference clock signal and the internal clock signal are the same, the DLL is locked onto the reference clock signal. As such, a feedback relationship is used to generate and maintain the internal clock signal with both the PLL and the DLL.
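The DLL feedback described above can be sketched as a loop in which the phase detector moves a single set bit in a shift register, and the position of that bit selects how many delay-line taps are applied. The following Python model is a simplified illustration under stated assumptions: the tap size, number of taps, starting position, and fixed path delay are hypothetical, and the fed-back clock is modeled as the reference clock delayed by the path delay plus the selected taps. Times are integers in picoseconds to keep the arithmetic exact.

```python
def simulate_dll(period_ps, path_delay_ps, tap_step_ps=100, num_taps=64):
    """Shift a one-hot tap selector until the fed-back clock edge aligns
    with a reference clock edge (illustrative model, not a circuit).

    Returns (tap_index, phase_error_ps) once locked.
    """
    tap = num_taps // 2      # set bit starts mid-way along the shift register
    half = period_ps // 2
    for _ in range(num_taps):
        total_delay_ps = path_delay_ps + tap * tap_step_ps
        # Phase detector: signed offset of the fed-back edge from the nearest
        # reference edge (positive = late, negative = early).
        error_ps = (total_delay_ps + half) % period_ps - half
        if 2 * abs(error_ps) <= tap_step_ps:
            return tap, error_ps  # within half a tap: locked
        # Shift the set bit: remove delay if late, add delay if early.
        tap += -1 if error_ps > 0 else 1
        if not 0 <= tap < num_taps:
            raise ValueError("delay line range exceeded before lock")
    raise ValueError("no lock within the allowed number of shifts")


# 200 MHz reference (5000 ps period) with a hypothetical 3000 ps path delay.
print(simulate_dll(5000, 3000))  # prints (20, 0)
```

Here lock is reached when the total delay equals one full clock period, so the delayed edge coincides with the next reference edge.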
One problem with using a PLL is that the phase detector, loop filter, and VCO are typically analog components that have poor stability and performance in noisy digital switching environments. Similar problems may exist with the components used in a DLL. As such, it may not be desirable to use a PLL or DLL in content addressable memory (CAM) devices, which are typically noisier than SRAM devices because data is compared simultaneously with many CAM cells in the CAM array. In addition, the analog components used in a PLL/DLL utilize separate power and ground supplies that typically use higher voltages (e.g., 2.5V-3.5V) than digital components (e.g., 1.2V). Moreover, PLL and DLL components may only be able to operate in a fixed frequency range, thereby limiting their versatility.