Exemplary embodiments of the present invention relate to a data input/output apparatus and method for a semiconductor system, and more particularly, to an apparatus and method for inputting and outputting data at a high speed in a parallel-link transceiver of a semiconductor memory device.
In a system including a plurality of semiconductor memory devices, each of the semiconductor memory devices may be controlled to store data. When a memory controller, for example, a central processing unit (CPU) or the like, is to input data, a semiconductor memory device receives the data and a clock signal from the memory controller, and writes the data into a corresponding memory cell in synchronization with the clock signal.
FIG. 1 is a diagram illustrating a conventional semiconductor memory device and a memory controller for controlling the semiconductor memory device. Specifically, FIG. 1 illustrates a semiconductor memory device for a graphic operation and a graphics processing unit (GPU) dedicated to processing image data.
Referring to FIG. 1, a transmission stage of the GPU transmits a plurality of data DQ and a clock signal CLK to the semiconductor memory device.
In a reception stage of the semiconductor memory device, the data inputted from outside are converted into internal data DIN through a variety of control circuits to transmit the data to unit cells, and the clock signal is converted into an internal clock signal ICLK through an internal clock control circuit.
At this time, the circuits for transferring the clock signal, including a phase locked loop (PLL), a clock divider, and a local router, have a longer delay time than the circuits for transferring the data. Therefore, the clock signal CLK outputted from the transmission stage of the GPU is designed to have a rising edge when the data is outputted, and the internal clock signal ICLK used in the reception stage of the semiconductor memory device is designed to have a rising edge in the center of an effective window of the data.
In a system operating at a high speed, however, the effective window of the data is inevitably reduced. Thus, as the amount of data in a channel between the semiconductor memory device and the GPU increases, the operation time of the data may not coincide with the transition time of the clock signal and data may be received inaccurately.
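The shrinking of the effective window can be illustrated with simple arithmetic. The following is a minimal sketch, not a model of any particular device: it assumes that a fixed timing uncertainty (skew plus jitter) is subtracted from each bit period regardless of the data rate. The function names and the numeric values in the usage note are hypothetical.

```python
def unit_interval_ps(data_rate_gbps: float) -> float:
    """Bit period (unit interval) in picoseconds for a per-pin rate in Gb/s."""
    return 1000.0 / data_rate_gbps


def effective_window_ps(data_rate_gbps: float, uncertainty_ps: float) -> float:
    """Effective data window left after a fixed timing uncertainty.

    Assumption: the uncertainty does not scale with the data rate, so it
    consumes a growing share of the bit period as the rate rises.
    The result is clamped at zero (no usable window remains).
    """
    return max(0.0, unit_interval_ps(data_rate_gbps) - uncertainty_ps)
```

Under this assumption, a hypothetical 100 ps uncertainty leaves 900 ps of window at 1 Gb/s but only 150 ps at 4 Gb/s, which is why the placement of the sampling edge becomes critical at high speed.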
To address this concern, a semiconductor memory device and a GPU may transfer data between them at a high speed by using data training. Data training refers to a technique that controls the skew between data by using a data pattern predefined between the GPU and the semiconductor memory device, in order to stably transfer the data for read and write operations.
A semiconductor memory device for a graphic operation may be designed to transfer data at a speed of 4 Gbps or more by using data training. In order to improve the reliability of a high-speed operation, a semiconductor memory device for a graphic operation often includes the data training as a part of its features.
FIG. 2 is a circuit diagram explaining data training which is used in parallel links of a conventional semiconductor system. Hereafter, data input/output between a memory controller 10 and a semiconductor memory device 30 of the semiconductor system will be described.
Referring to FIG. 2, the memory controller 10 of the semiconductor system includes a plurality of transmission units 10_0 to 10_N−1 and a first clock generation unit 20.
The transmission units 10_0 to 10_N−1 correspond in number to the parallel data DQ<0:N−1> and are configured to transmit the corresponding parallel data DQ<0:N−1>.
Each of the transmission units 10_0 to 10_N−1 includes a transmitter 12 and a phase interpolator (PI) 14. The transmitter 12 is configured to transmit the corresponding data to the semiconductor memory device 30. The phase interpolator (PI) 14 is configured to generate a training clock signal TCLK<0:N−1> for controlling the output time of the data outputted from the transmitter 12, that is, the phase of the data in accordance with a multi-phase clock signal PLL_CLK<0:M> and a phase control signal PI_CTRL<0:N−1>. The phase control signal PI_CTRL<0:N−1> is a signal which is generated in accordance with the data training.
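The behavior of a phase interpolator such as the PI 14 can be sketched as follows. This is an idealized model under stated assumptions and not the circuit of FIG. 2: the input phases are assumed to be equally spaced over one clock period, the interpolation between two adjacent phases is assumed to be linear, and the control code is assumed to be a plain integer. The names `interpolated_phase_deg`, `num_phases`, and `steps_per_segment` are hypothetical.

```python
def interpolated_phase_deg(code: int, num_phases: int,
                           steps_per_segment: int) -> float:
    """Output phase, in degrees, selected by a phase-control code.

    Assumptions (not taken from the source):
      - num_phases input clocks are equally spaced over 360 degrees;
      - each gap between adjacent phases is divided into
        steps_per_segment linear steps;
      - the code wraps around after one full clock period.
    """
    total_steps = num_phases * steps_per_segment
    code %= total_steps                       # wrap to one clock period
    segment, step = divmod(code, steps_per_segment)
    spacing = 360.0 / num_phases              # gap between adjacent phases
    weight = step / steps_per_segment         # mixing weight toward next phase
    return segment * spacing + weight * spacing
```

For example, with eight input phases and sixteen steps per segment, code 8 lies halfway between the first two phases and yields 22.5 degrees, and code 16 selects the second input phase at 45 degrees.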
The first clock generation unit 20 is configured to generate the multi-phase clock signals PLL_CLK<0:M> in response to a reference clock signal CLK_REF and supply the multi-phase clock signals PLL_CLK<0:M> to the respective transmission units 10_0 to 10_N−1. Furthermore, the first clock generation unit 20 generates a clock signal CLK to be transmitted to the semiconductor memory device 30, in response to the reference clock signal CLK_REF. Here, a transmitter 22 is configured to buffer and output the clock signal CLK outputted from the first clock generation unit 20.
The memory controller 10 additionally includes a transmitter 24 and a PI control signal generation unit 26 to generate the phase control signals PI_CTRL<0:N−1> in accordance with the data training. The transmitter 24 is configured to receive a training result inputted from the semiconductor memory device 30. The PI control signal generation unit 26 is configured to generate the phase control signals PI_CTRL<0:N−1> depending on the received training result.
Meanwhile, the semiconductor memory device 30 includes a plurality of reception units 30_0 to 30_N−1, a second clock generation unit 40, and a clock division unit 42.
The reception units 30_0 to 30_N−1 correspond in number to the parallel data DQ<0:N−1> and are configured to receive the corresponding parallel data DQ<0:N−1> to generate internal data DIN<0:N−1>.
Each of the reception units 30_0 to 30_N−1 includes a receiver 32, a sample holder (S/H) 34, and a local router 36. The receiver 32 is configured to receive data inputted from the corresponding transmitter of the memory controller 10. The sample holder 34 is configured to sample an output of the receiver 32 in accordance with the internal clock signal ICLK. The local router 36 is configured to route the inputted internal clock signal ICLK and provide the internal clock signal ICLK to the sample holder 34.
The second clock generation unit 40 is configured to receive the clock signal CLK to generate the internal clock signal ICLK. A receiver 44 is additionally provided in the stage preceding the second clock generation unit 40 to receive the clock signal CLK inputted from the memory controller 10.
The clock division unit 42 is configured to divide the internal clock signal ICLK outputted from the second clock generation unit 40 and provide the divided internal clock signal ICLK to each local router 36 of the reception units 30_0 to 30_N−1.
The semiconductor memory device 30 further includes a data training unit 50 configured to perform the data training.
The data training unit 50 includes a register 52, a serializer 54, and a transmitter 56. The register 52 is configured to receive and store the internal data DIN<0:N−1> outputted from the plurality of reception units 30_0 to 30_N−1. The serializer 54 is configured to serialize and output the parallel internal data DIN<0:N−1> stored in the register 52. The transmitter 56 is configured to output the output of the serializer 54 to the transmitter 24 of the memory controller 10.
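The register-and-serializer return path can be sketched in a few lines. The sketch below is illustrative only and rests on assumptions not stated in the source: the bit order (lane 0 first) is arbitrary, and the per-lane comparison against the predefined pattern is assumed to take place on the memory controller side. All function names are hypothetical.

```python
from typing import Iterable, Iterator, List


def serialize(parallel_bits: Iterable[int]) -> Iterator[int]:
    """Latch the parallel word, then emit it one bit at a time (lane 0 first)."""
    register = list(parallel_bits)   # snapshot, analogous to the register 52
    for bit in register:             # one bit per step, analogous to the serializer 54
        yield bit


def deserialize(stream: Iterable[int], width: int) -> List[int]:
    """Rebuild the parallel word from the serial stream on the controller side."""
    bits = list(stream)
    if len(bits) != width:
        raise ValueError(f"expected {width} bits, got {len(bits)}")
    return bits


def lane_errors(received: List[int], expected: List[int]) -> List[bool]:
    """Per-lane mismatch flags against the predefined training pattern."""
    return [r != e for r, e in zip(received, expected)]
```

A controller-side check might then read `lane_errors(deserialize(serialize(din), len(din)), pattern)`, where `din` is the sampled word and `pattern` is the predefined training pattern.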
Hereafter, the data training between the memory controller 10 and the semiconductor memory device 30 will be described with reference to the above-described configurations. An independent command may be inputted to perform the data training before an input/output operation of data. The period before the command for performing the input/output operation of data is inputted is referred to as an initial period.
During the initial period, the respective PIs 14 inside the plurality of transmission units 10_0 to 10_N−1 of the memory controller 10 are initialized. Therefore, the plurality of transmission units 10_0 to 10_N−1 transmit the parallel data DQ<0:N−1> and the clock signal CLK to the semiconductor memory device 30, without the phase control by the phase interpolators (PIs) 14.
The semiconductor memory device 30 generates the internal clock signal ICLK in accordance with the clock signal CLK, and receives the parallel data DQ<0:N−1> in accordance with the internal clock signal ICLK to generate the internal data DIN<0:N−1>. The data training unit 50 receives the internal data DIN<0:N−1>, serializes the internal data DIN<0:N−1>, and transmits the serialized internal data to the memory controller 10.
The PI control signal generation unit 26 of the memory controller 10 receives the serialized data to generate the phase control signals PI_CTRL<0:N−1>, and the phase interpolators (PIs) 14 inside the respective transmission units 10_0 to 10_N−1 generate training clock signals TCLK<0:N−1> in accordance with the phase control signals PI_CTRL<0:N−1>, and control the phases of the data outputted from the transmission units 10_0 to 10_N−1.
Through the data training process, the rising edge of the internal clock signal ICLK of the semiconductor memory device 30 is positioned in the center of the effective window of the internal data DIN<0:N−1>. During the transmission and reception period of the data, the semiconductor memory device 30 samples the data in the center of the effective window of the data in accordance with the internal clock signal ICLK. Therefore, the semiconductor memory device 30 may normally transmit and receive the data to and from the memory controller 10.
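The source does not state how the phase control signals are derived from the training result. One common approach, shown here purely as an assumption, is to sweep the phase code for each lane, record whether the returned data matches the predefined pattern at each code, and select the code at the center of the longest passing run. The names `center_of_passing_window`, `train_lane`, and the `passes` callback are hypothetical.

```python
from typing import Callable, List, Optional


def center_of_passing_window(results: List[bool]) -> Optional[int]:
    """Index at the center of the longest run of passing codes.

    Returns None when no code passes. For an even-length run the lower of
    the two middle indices is returned.
    """
    best_start, best_len = 0, 0
    start: Optional[int] = None
    # A trailing False acts as a sentinel that closes a run ending at the
    # last code.
    for i, ok in enumerate(results + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    if best_len == 0:
        return None
    return best_start + (best_len - 1) // 2


def train_lane(passes: Callable[[int], bool], num_codes: int) -> Optional[int]:
    """Sweep every phase code for one lane and pick the centered code.

    `passes(code)` is a hypothetical callback that transmits the training
    pattern at the given code and reports whether it was read back intact.
    """
    results = [passes(code) for code in range(num_codes)]
    return center_of_passing_window(results)
```

This sketch treats the code range as linear. A passing window that wraps around the end of the code range would be split into two runs, so a practical implementation would need to handle that case separately.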
In the conventional memory system, however, the data training is performed during the initial period. Therefore, when delay is generated by the internal control circuits while the data are transmitted and received outside the initial period, it is difficult to compensate for the delay. Accordingly, as the transition time of the internal clock signal deviates from the effective window of the data, data may be sampled inaccurately.
In particular, when the temperature or power supply voltage varies during the operation, the clock signal having a different transmission path from the data may have a different delay value and thus the sampling timing of the semiconductor memory device 30 may vary and deviate from the center of the effective window of the data. As the operation frequency of the semiconductor memory device increases, the above-described situation may frequently occur and cause malfunctions.
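The effect of such drift on a one-time training result can be sketched as follows. The model is an assumption for illustration: the sampling edge is taken to be exactly centered after training, and any later change in the relative delay between the clock path and the data path shifts the edge by the same amount. The function names and drift values are hypothetical.

```python
from typing import Iterable, Optional


def sampling_margin_ps(window_ps: float, drift_ps: float) -> float:
    """Distance from the sampling edge to the nearest window boundary.

    Assumes the edge was centered by training, so the margin starts at half
    the window. A negative result means the edge has left the window.
    """
    return window_ps / 2.0 - abs(drift_ps)


def first_failing_step(window_ps: float,
                       drifts_ps: Iterable[float]) -> Optional[int]:
    """Index of the first drift value that pushes the edge outside the window."""
    for i, drift in enumerate(drifts_ps):
        if sampling_margin_ps(window_ps, drift) < 0:
            return i
    return None
```

With a hypothetical 150 ps window, the margin after training is 75 ps, and a relative drift of 90 ps leaves the sampling edge 15 ps outside the window. Because the training is performed only during the initial period, nothing in the conventional scheme re-centers the edge once this happens.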
Here, as the data transmission rate increases, the clock frequency also increases. For example, a semiconductor memory device having a data transmission rate of one Gb/s (gigabit per second) or more has a clock frequency of one GHz (gigahertz) or more. However, at such a clock frequency, it is difficult to divide the clock signal and to discriminate the data on the chip. Therefore, a method for inputting and outputting data and a clock signal effectively and without error may be useful.