1. Field of the Invention
The present invention relates to a synchronous data transfer circuit, a computer system and a memory system for reading and transferring data from a circuit chip disposed on a substrate, and more particularly to a synchronous data transfer circuit, a computer system and a memory system for transferring data at the same timing even when the delay amount of the data from each circuit chip differs.
2. Description of the Related Art
With the development of semiconductor technology and chip mounting techniques, apparatuses have been provided in which a plurality of CPUs and a large-capacity main memory device are mounted on a single substrate. A blade server is one example. In such an apparatus, mounting constraints make it difficult to dispose a plurality of modules (chips) at the same distance from the other modules (chips). Accordingly, the time an IC chip requesting data (the IC chip of the data request source) takes to acquire data from the IC chip from which the data is requested (the IC chip of the data request target) varies. This dispersion depends mainly on the line length and the performance of the IC chip.
With the improvement of data processing speed in recent years, the range of the above dispersion has become hard to ignore. To reduce the dispersion, it becomes necessary to provide a data transfer circuit. In a memory device, for example, provision of a DLL (delay locked loop) in a register has been proposed in the official gazettes of Japanese Unexamined Patent Publication No. 2003-044350 and Japanese Unexamined Patent Publication No. Hei-11-086545.
The above method, which relies on such clock control, can be applied only within an IC chip. Since further fine tuning is required in an IC chip that transfers data to another connected IC chip, the above method is not applicable without modification.
FIG. 8 shows a block diagram of the conventional synchronous data transfer circuit; FIG. 9 shows a configuration diagram of the conventional delay circuit; and FIGS. 10 and 11 show explanation diagrams of the conventional transfer operation. As shown in FIG. 8, a synchronous data transfer circuit (for example, a memory controller) 100 includes a clock control circuit 110 having a frequency dividing circuit 112 for frequency dividing a clock CLK0 of the data request source; a read control circuit 120 for reading data from a chip 200 (here, a memory) of the data request target; and a data assembly circuit 130.
The clock CLK0 is issued to provide timing at which the data request side fetches data. The frequency dividing circuit 112 in the clock control circuit 110 frequency divides the clock CLK0 synchronously with the operation speed of the chip 200 of data request target, and transmits an operation clock CLK1 to the chip 200 of data request target.
In the chip 200 of data request target, in synchronization with the clock CLK1, a data strobe signal DQS [N:0] and data DQ [0]-DQ [N] are transmitted to the read control circuit 120 in response to a read request received. These data DQ [0]-DQ [N] are serial signals.
As shown in FIG. 10, in the read control circuit 120, the data strobe signal DQS [N:0] is input into a DQS control circuit 122. The read control circuit 120 fetches the data DQ [0]-[N] into flip-flop circuits (FF00-0N) 124-0 to 124-N, using the rising edges of DQS [0]-[N] as the clocks for FFs 124-0 to 124-N.
Meanwhile, in the data assembly circuit 130, data assembly timing is specified by the clock CLK0. Therefore, conventionally, the output flip-flop circuits (FF10-1N) 128-0 to 128-N in the read control circuit 120 have been configured to fetch (synchronize) data with the clock CLK0.
As shown in FIG. 10, since the phase of the data strobe signal DQS is not consistent with that of the clock CLK0, a delay amount determined by TAP [N:0] of delay circuits (DL (b0)) 126-0 to 126-N is added to the outputs of FFs 124-0 to 124-N, and the clock CLK0 is applied to FFs 128-0 to 128-N, thereby synchronizing the data to the fetch timing of the data assembly circuit 130.
The data assembly circuit 130 fetches the outputs of the FFs 128-0 to 128-N into flip-flop circuits (FF20-2N) 132-0 to 132-N at the timing of the clock CLK0, and performs data assembly. As such, using the delay circuits 126-0 to 126-N, the synchronization of the data DQ has been achieved.
These delay circuits 126-0 to 126-N are constituted of eight paths having delay elements 140 of 1, 2, 3, 4, 5, 6, 7 and 8 stages, respectively, and a path selector 142, as shown in FIG. 9. In this figure, reference symbols are shown only for the path of delay elements 140 depicted by the triangles, being serially connected into 8 stages. To simplify the diagram, the reference symbols are omitted for other paths of the delay elements depicted by the triangles.
Each delay element 140 is constituted of, for example, a transistor, and all delay elements have an identical delay amount. Depending on the necessary delay amount, a path is selected by the tap selection TAP0 [0] of the selector 142. The outputs of FFs 124-0 to 124-N are thus delayed by the selected delay amount (DLb0 shown in FIG. 10), and then input to FFs 128-0 to 128-N.
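The tap-selected delay circuit of FIG. 9 can be illustrated with a minimal behavioral sketch. The function names, the zero-based tap numbering (tap 0 selecting the 1-stage path) and the per-element delay of 50 ps are assumptions introduced only for illustration; the description above gives no numeric delay value and does not specify how tap values map to paths.

```python
# Behavioral sketch of the conventional delay circuit of FIG. 9.
# ASSUMPTIONS (not from the source): the 50 ps unit delay, the
# zero-based tap numbering, and all function names are hypothetical.

UNIT_DELAY_PS = 50  # assumed delay of one delay element 140


def path_stages(num_paths=8):
    """Return the stage count of each path: 1, 2, ..., num_paths."""
    return list(range(1, num_paths + 1))


def selected_delay_ps(tap, unit_delay_ps=UNIT_DELAY_PS, num_paths=8):
    """Delay added by the path that the selector 142 picks for `tap`."""
    if not 0 <= tap < num_paths:
        raise ValueError("tap must select one of the available paths")
    return path_stages(num_paths)[tap] * unit_delay_ps


def total_delay_elements(num_paths=8):
    """Total delay elements in one delay circuit (sum of all paths)."""
    return sum(path_stages(num_paths))
```

Under these assumptions, `total_delay_elements()` evaluates to 36, since the eight paths together hold 1+2+...+8 elements although only one path is in use for any given tap setting.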
When the data DQ [0]-[N] are, for example, 4 bits (N=4) in parallel, the necessary number of these delay circuits 126-0 to 126-N becomes 4. Meanwhile, as shown in FIG. 8, when the chip 200 of data request target outputs a 64-bit parallel signal, the dispersion among the signals is substantially large, which makes it difficult to cope with the dispersion using one data strobe signal DQS [N:0].
To solve this problem, as shown in FIG. 11, data strobe signals DQS [N:0]-[N:15] are output with different phases, for example, one for every 4 bits. This necessitates the provision of a corresponding number of read control circuits 120, namely, 16 in the exemplary case of the aforementioned 64 bits in parallel. The delay amount TAP of delay circuits 126-0 to 126-N is set correspondingly. The data assembly circuit 130 synchronizes these 4-bit parallel signals, assembles them into a 64-bit parallel signal and transfers it.
Further, when a plurality (m) of the IC chips 200 of the request target exist on a substrate, the signal delay amount for each IC chip 200 differs depending on the line lengths and the performance of the chips 200.
For example, also as shown in FIG. 11, when the phase of the data strobe signal DQS for each IC chip 200 differs from the phase of the signal DQS shown in FIG. 10, the delay amount becomes DLbm, which is different from the delay amount of the aforementioned delay circuit shown in FIG. 9. As a result, as shown in FIG. 8, the necessary number of the read control circuits 120 is 16×m, and further, the necessary number of the delay circuits shown in FIG. 9 is 4×16×m.
As such, in the prior art, the number of data strobe signals is determined by the number of parallel data bits, and the number of delay circuits is determined by the number of data strobe signals. Since each delay circuit is configured to be individually adjustable to an arbitrary delay amount, a wide range of delay amounts is required. Consequently, a large number of delay elements is required in each delay circuit.
For example, in the aforementioned 64-bit parallel transfer, when data strobe signals are issued for every 4 bits, 16 read control circuits and 64 delay circuits are required. In each delay circuit, because of the wide range of the delay amount, 8 delay paths and 36 delay elements (transistors) have been required, as shown in FIG. 9.
Namely, when viewed from a single read control circuit, 144 (=4×36) delay elements are required. Further, when viewed from one channel (=64 bits in parallel), 16 times that number, i.e. 2,304 delay elements, are required. This necessitates a large mounting area on the circuit (chip), which impedes miniaturization and cost reduction. Also, since the power consumption becomes large, it has been difficult to produce a chip with low power consumption. In addition, a delay element using transistors has a large dispersion of the delay amount, which reduces the delay accuracy and impedes high-speed synchronous transfer.
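The element counts above can be reproduced with a short calculation. The function and parameter names below are hypothetical, and the scaling by the number of chips m simply extends the 4×16×m delay-circuit count stated earlier to delay elements; it is a sketch of the arithmetic, not a description of any particular implementation.

```python
# Arithmetic sketch of the prior-art delay element count.
# ASSUMPTION: names are hypothetical; the per-chip scaling applies the
# 4 x 16 x m delay-circuit count from the text to delay elements.

def delay_element_budget(bus_width=64, bits_per_strobe=4,
                         elements_per_delay_circuit=36, num_chips=1):
    """Count the circuits and delay elements needed by the prior art."""
    read_control_circuits = bus_width // bits_per_strobe  # one per strobe
    delay_circuits = read_control_circuits * bits_per_strobe  # one per bit
    per_read_control = bits_per_strobe * elements_per_delay_circuit
    per_channel = read_control_circuits * per_read_control
    return {
        "read_control_circuits": read_control_circuits * num_chips,
        "delay_circuits": delay_circuits * num_chips,
        "elements_per_read_control": per_read_control,
        "elements_per_channel": per_channel,
        "elements_total": per_channel * num_chips,
    }


if __name__ == "__main__":
    for name, value in delay_element_budget().items():
        print(f"{name}: {value}")
```

With the default arguments this gives 16 read control circuits, 64 delay circuits, 144 elements per read control circuit and 2,304 elements per channel, matching the figures in the text.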