Clock signals are used by a wide variety of digital circuits to control the timing of various events occurring during the operation of the digital circuits. For example, clock signals are used to designate when command signals, data signals, and other signals used in memory devices and other computer components are valid and can thus be used to control the operation of the memory device or computer system. For example, a clock signal can be used to latch the command, data, or other signals so that they can be used after the command, data, or other signals are no longer valid.
The problem of accurately controlling the timing of clock signals for high speed digital circuits is exemplified by clock signals used in high speed dynamic random access memories (“DRAMs”), although the problem is, of course, also applicable to other digital circuits. Initially, DRAMs were asynchronous and thus did not operate at the speed of an external clock. However, since asynchronous DRAMs often operated significantly slower than the clock frequency of processors that interfaced with the DRAM, “wait states” were often required to halt the processor until the DRAM had completed a memory transfer. The operating speed of asynchronous DRAMs was successfully increased through such innovations as burst and page mode DRAMs, which did not require that an address be provided to the DRAM for each memory access. More recently, synchronous dynamic random access memories (“SDRAMs”) have been developed to allow the pipelined transfer of data at the clock speed of the motherboard. However, even SDRAMs are incapable of operating at the clock speed of currently available processors. Thus, SDRAMs cannot be connected directly to the processor bus, but instead must interface with the processor bus through a memory controller, bus bridge, or similar device. The disparity between the operating speed of the processor and the operating speed of SDRAMs continues to limit the speed at which processors may complete operations requiring access to system memory.
A solution to this operating speed disparity has been proposed in the form of a computer architecture known as “SyncLink.” In the SyncLink architecture, the system memory may be coupled to the processor directly through the processor bus, although it may continue to be coupled to the processor through a memory controller or other device. Rather than requiring that separate address and control signals be provided to the system memory, SyncLink memory devices receive command packets that include both control and address information. The SyncLink memory device then outputs or receives data on a data bus that may be coupled directly to the data bus portion of the processor bus.
An example of a packetized memory device using the SyncLink architecture is shown in FIG. 1. The SyncLink memory device 10 includes a clock generator circuit 40 that receives a command clock signal CMDCLK on line 42 and generates a large number of other clock and timing signals to control the timing of various operations in the memory device 10. Three of these clock signals are a command latch clock ICLK, a read data clock signal RCLK, and a write data clock signal WCLK, all of which are used in a manner described below. The memory device 10 also includes a command buffer 46 and an address capture circuit 48 which receive the internal clock signal ICLK, a command packet CA0-CA9 on a command bus 50, and a flag signal F on line 52. As explained above, the command packet contains control and address data for each memory transfer, and the flag signal F identifies the start of a command packet, which may include more than one 10-bit packet word. In fact, a command packet is generally in the form of a sequence of four 10-bit packet words on the 10-bit command bus 50. The command buffer 46 receives the command packet from the bus 50, and compares at least a portion of the packet words to identifying data unique to the memory device to determine if the command packet is being directed to that memory device rather than another memory device or some other device in a computer system. If the command buffer 46 determines that the command packet is directed to the memory device 10, it then provides a command word corresponding to the packet words to a command decoder and sequencer 60. The command decoder and sequencer 60 generates a large number of internal control signals to control the operation of the memory device 10 during a memory transfer.
The address capture circuit 48 also receives the packet words from the command bus 50 and outputs a 20-bit address corresponding to the address data in the command. The address is provided to an address sequencer 64 which generates a corresponding 3-bit bank address on bus 66, a 10-bit row address on bus 68, and a 7-bit column address on bus 70.
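The division of the 20-bit address into bank, row, and column fields can be illustrated with a short sketch. The field widths (3, 10, and 7 bits) come from the description above; the ordering of the fields within the 20-bit address, and the name `split_address`, are assumptions made only for illustration and are not taken from the described device.

```python
# Field widths from the description: 3 + 10 + 7 = 20 bits.
BANK_BITS, ROW_BITS, COL_BITS = 3, 10, 7


def split_address(addr):
    """Split a 20-bit address into (bank, row, column).

    ASSUMPTION: bank occupies the high bits and column the low bits.
    The actual bit ordering is not stated in the text.
    """
    total_bits = BANK_BITS + ROW_BITS + COL_BITS
    if not 0 <= addr < (1 << total_bits):
        raise ValueError("address must fit in 20 bits")
    col = addr & ((1 << COL_BITS) - 1)                # low 7 bits
    row = (addr >> COL_BITS) & ((1 << ROW_BITS) - 1)  # next 10 bits
    bank = addr >> (COL_BITS + ROW_BITS)              # top 3 bits
    return bank, row, col
```

With this assumed layout, the 3-bit bank field can take eight values, which matches the eight memory banks 80a-h.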
One of the problems of conventional DRAMs is their relatively low speed resulting from the time required to precharge and equilibrate circuitry in the DRAM array. The packetized memory device 10 shown in FIG. 1 largely avoids this problem by using several memory banks 80, in this case eight memory banks 80a-h. After a memory read from one bank 80a, the bank 80a can be precharged while the remaining banks 80b-h are being accessed. Each of the memory banks 80a-h receives a row address from a respective row latch/decoder/driver 82a-h. All of the row latch/decoder/drivers 82a-h receive the same row address from a predecoder 84 which, in turn, receives a row address from either a row address register 86 or a refresh counter 88 as determined by a multiplexer 90. However, only one of the row latch/decoder/drivers 82a-h is active at any one time. Bank control logic 94 selects one of the row latch/decoder/drivers 82a-h to be active as a function of bank data from a bank address register 96.
The column address on bus 70 is applied to a column latch/decoder 100 which, in turn, supplies I/O gating signals to an I/O gating circuit 102. The I/O gating circuit 102 interfaces with columns of the memory banks 80a-h through sense amplifiers 104. Data is coupled to or from the memory banks 80a-h through the sense amplifiers 104 and I/O gating circuit 102 to a data path subsystem 108 which includes a read data path 110 and a write data path 112. The read data path 110 includes a read latch 120 receiving and storing data from the I/O gating circuit 102. In the memory device 10 shown in FIG. 1, 64 bits of data are applied to and stored in the read latch 120. The read latch then provides four 16-bit data words to a multiplexer 122. The multiplexer 122 sequentially applies each of the 16-bit data words to a read FIFO buffer 124. Successive 16-bit data words are clocked into the FIFO buffer 124 by the read clock signal RCLK generated from the command clock CMDCLK by the clock generator circuit 40. The FIFO buffer 124 sequentially applies the 16-bit words and two clock signals (a clock signal and a quadrature clock signal) to a driver circuit 128 responsive to the read clock signal RCLK. The driver circuit 128, in turn, applies the 16-bit data words to a data bus 130. The driver circuit 128 also applies a data clock signal DCLK to a clock bus 132 so that a device, such as a processor, reading the data on the data bus 130 can be synchronized with the data.
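The conversion of the 64 bits held in the read latch 120 into four 16-bit data words can be modeled in a few lines. This is only a behavioral sketch: the function name `split_read_latch` is hypothetical, and the order in which the words leave the multiplexer 122 (least significant word first) is an assumption, since the description does not specify it.

```python
WORD_BITS = 16
WORDS_PER_LATCH = 4  # 4 x 16 bits = 64 bits, as in the description


def split_read_latch(data64):
    """Return the four 16-bit words contained in a 64-bit latch value.

    ASSUMPTION: words are emitted least significant word first.
    """
    if not 0 <= data64 < (1 << (WORD_BITS * WORDS_PER_LATCH)):
        raise ValueError("latch value must fit in 64 bits")
    mask = (1 << WORD_BITS) - 1
    # Shift each 16-bit slice down to bit 0 and mask off the rest.
    return [(data64 >> (WORD_BITS * i)) & mask for i in range(WORDS_PER_LATCH)]
```

In the device itself each word would be clocked into the read FIFO buffer 124 on a separate edge of RCLK; the list returned here simply stands in for that sequence.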
The write data path 112 includes a receiver buffer 140 coupled to the data bus 130. The receiver buffer 140 sequentially applies 16-bit words from the data bus 130 to four input registers 142, each of which is selectively enabled by a signal from a clock generator circuit 144. The clock generator circuit 144 receives a data clock signal DCLK generated by an external device applying the data to the data bus 130 of the memory device. Since the data clock DCLK is synchronized to the data applied to the data bus 130, the input registers 142 are enabled at the proper time when write data are present on the data bus 130. Thus, the input registers 142 sequentially store four 16-bit data words and combine them into one 64-bit data word applied to a write FIFO buffer 148. The write FIFO buffer 148 is clocked by the write data clock signal WCLK generated from the command clock CMDCLK by the clock generator circuit 40 and a signal from the clock generator 144. The write FIFO buffer 148 then sequentially applies 64-bit write data to a write latch and driver 150. The write latch and driver 150 applies the 64-bit write data to one of the memory banks 80a-h through the I/O gating circuit 102 and the sense amplifiers 104.
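The way the four input registers 142 assemble one 64-bit data word from four successive 16-bit words can be sketched as follows. The function name `combine_write_words` is hypothetical, and the placement of the first received word in the least significant position is an assumption made for illustration only.

```python
WORD_BITS = 16
WORDS_PER_TRANSFER = 4  # four 16-bit words form one 64-bit word


def combine_write_words(words):
    """Combine four 16-bit words into one 64-bit value.

    ASSUMPTION: the first word received lands in the least
    significant 16 bits; the text does not state the ordering.
    """
    if len(words) != WORDS_PER_TRANSFER:
        raise ValueError("exactly four words are required")
    value = 0
    for i, word in enumerate(words):
        if not 0 <= word < (1 << WORD_BITS):
            raise ValueError("each word must fit in 16 bits")
        # Each register contributes its word at a fixed 16-bit offset.
        value |= word << (WORD_BITS * i)
    return value
```

Each loop iteration corresponds to one input register being enabled while its word is valid on the data bus 130.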
As mentioned above, an important goal of the SyncLink architecture is to allow data transfer between a processor and a memory device to occur at a significantly faster rate. However, the operating rate of a packetized DRAM, including the SyncLink memory device 10 shown in FIG. 1, is limited by the need to maintain internal synchronism in the packetized DRAM. More specifically, as the operating speed of a packetized DRAM increases, it becomes more difficult to ensure that various signals are present at circuit nodes at the proper time relative to other signals. One of the limiting factors in the speed at which the memory device 10 can operate is the difficulty in controlling the relative timing between the various signals in the memory device. In particular, the amount of the delay of signals in the memory device is highly variable, and the delay is difficult to control. If, for example, the delay of the internal clock signal ICLK cannot be precisely controlled, it may cause a latch in the command buffer 46 to latch invalid packet words. Thus, the speed at which command packets can be applied to the memory device 10 is limited by the delays in the memory device 10. Similar problems exist for other control signals in the memory device 10 which control the operation of the memory device 10 during each clock cycle.
The above-described problem has been largely alleviated by using a clock generator circuit 40 that is capable of making fine resolution adjustments of the phase of the internal clock signal ICLK relative to the command clock CMDCLK. An example of a clock generator circuit 40 having these capabilities is described in U.S. patent application Ser. No. 08/879,847 to Ronnie M. Harrison which is incorporated herein by reference. The clock generator circuit 40 described therein is able to adjust the phase of the internal clock signal ICLK relative to the time in which packet words are applied to a latch in the command buffer 46 in increments of significantly less than a single clock cycle, i.e., in increments of 11.25 degrees. As a result, the phase of the internal clock ICLK can be adjusted so that the packet words are clocked into the command buffer 46 at the proper time even at high operating speeds of the memory device 10.
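To give a sense of scale, a phase increment of 11.25 degrees is 1/32 of a clock cycle. The sketch below converts a number of such increments into a time delay. The function name `phase_steps_to_delay_ns` and the 4 ns clock period used in the usage note are illustrative assumptions; the description does not state an operating frequency.

```python
PHASE_STEP_DEG = 11.25  # increment stated in the description
STEPS_PER_CYCLE = 360.0 / PHASE_STEP_DEG  # 32 increments per full cycle


def phase_steps_to_delay_ns(steps, period_ns):
    """Convert a count of 11.25-degree phase steps to a delay in ns.

    `period_ns` is the clock period; its value is an ASSUMPTION
    supplied by the caller, not a figure from the text.
    """
    if steps < 0 or period_ns <= 0:
        raise ValueError("steps must be >= 0 and period_ns must be > 0")
    phase_deg = steps * PHASE_STEP_DEG
    return phase_deg / 360.0 * period_ns
```

For an assumed 4 ns period, `phase_steps_to_delay_ns(1, 4.0)` gives 0.125 ns per increment and `phase_steps_to_delay_ns(8, 4.0)` gives 1.0 ns, a quarter cycle.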
Although the approach described in U.S. patent application Ser. No. 08/879,847 is capable of ensuring accurate synchronization between internal signals inside the memory device 10, it may not be capable of ensuring accurate timing of signals applied to and received from devices that are external to the memory device 10. For example, it may be difficult for the memory device to apply read data to a memory controller or other device at the proper time, particularly at high operating speeds. One of the reasons that the approach described by Harrison may not be able to synchronize these signals is that it may be necessary to adjust the timing of the signals over time periods that are far in excess of the range of adjustment that is possible with the Harrison approach. In particular, the Harrison approach is limited to phase adjustments over a range of 180 degrees, whereas it may be necessary to adjust the phase of signals in the memory device 10 over ranges of many clock cycles, particularly at high operating speeds.
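The mismatch between a 180-degree adjustment range and a required adjustment of several clock cycles can be made concrete with a small calculation. The sketch below expresses a required timing adjustment as a phase angle and checks it against the 180-degree limit. The function names and the numeric values in the usage note are hypothetical and chosen only for illustration.

```python
FINE_RANGE_DEG = 180.0  # adjustment range stated in the description


def required_phase_deg(delay_ns, period_ns):
    """Express a timing adjustment as a phase angle of the clock.

    Both arguments are illustrative inputs; the text gives no
    specific delays or clock periods.
    """
    if delay_ns < 0 or period_ns <= 0:
        raise ValueError("delay_ns must be >= 0 and period_ns must be > 0")
    return delay_ns / period_ns * 360.0


def within_fine_range(delay_ns, period_ns):
    """Return True if the adjustment fits within 180 degrees of phase."""
    return required_phase_deg(delay_ns, period_ns) <= FINE_RANGE_DEG
```

With an assumed 4 ns clock period, an adjustment of 1.5 ns corresponds to 135 degrees and fits within the range, while an adjustment of 10 ns corresponds to 900 degrees (2.5 clock cycles) and does not.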
As mentioned above, this problem may be particularly severe for coupling read data from the memory device 10 to external devices, such as a memory controller, because it is very difficult to predict and control the time required for data from the memory device to be coupled to and latched by the external device, particularly at high operating speeds. As the operating speed of the memory device increases, the time that the read data is applied to the external device must be controlled very precisely to resolutions of less than a clock cycle. Moreover, the time required for the read data to be applied to the external device can vary considerably, depending on several factors. Thus, the precise control of the time that the read data is applied to the external device must be accomplished over a range of several clock cycles. As the operating speed of the memory device continues to increase, adjusting the time that the read data is applied to the external device with sufficient precision over a sufficiently wide range is increasingly difficult.
Although the foregoing discussion is directed to the need to precisely control the timing of read data applied to an external device, similar problems exist for other signals in packetized memory devices and for the same or other signals in other memory devices, such as asynchronous DRAMs and synchronous DRAMs, which must process control and other signals at a high rate of speed. For example, the data clock DCLK must also be applied to an external device, such as a memory controller, at a precise time over a wide range. Thus, there is a need to precisely control the timing of clock signals relative to other signals over a wide range in packetized DRAMs and other circuits.