1. Field of the Invention
The present invention is related to the field of semiconductor memory devices, and more specifically to a memory device with prefetched data ordering that is distributed in prefetched data path logic, and a method of ordering prefetched data.
2. Description of the Related Art
Memory devices are used in electronic devices for storing data. As there is continuing competitive pressure to make electronic devices faster, the memory device is often found to limit the speed of the overall device. Indeed, the memory device sometimes requires its own, internal clock for its operation, which is slower than the external clock of the overall device. And as there is continuing competitive pressure for devices of larger capacity, there is pressure to make memories larger, which further restricts how fast they can become.
An example of a memory device 100 in the prior art is shown in FIG. 1. While salient parts are explained in this description, more details can be found in a number of references, such as U.S. Pat. No. 6,115,321.
Memory device 100 includes a memory cell array (MCA) 102. Array 102 has cells, such as cell 104. One data bit is stored at each cell 104. The cells are arranged at intersections of rows, such as wordline 106, and columns 108. Columns 108 are also called local input/output (I/O) lines 108.
A number of local I/O lines 108 terminate in a single local sense amplifier LS/A 110A. A number of such local sense amplifiers are provided, similar to LS/A 110A. Out of each local sense amplifier there emerges a Global I/O (GIO) line. Eight such GIO lines 114A-114H are shown as a group.
Reading data from memory device 100 entails outputting the bit stored in cell 104 to one of GIO lines 114, and from there to a DQ pad 120. All DQ pads 120 feed their data to a cache memory 122, or to another kind of electronic device requiring data storage.
In memory devices such as device 100, the problem of speed has been addressed in the prior art by prefetching data that is to be read. Prefetching means reading many data bits simultaneously out of memory device 100 for a single DQ pad, in response to a single address input. This is a core DRAM operation.
With prefetching, as the data is output from GIO lines 114, it needs to be ordered before it is output to the DQ pads. If not, then the electronic device reading data from the memory device may have to wait too long before it receives the necessary data.
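For illustration only, the prefetch operation described above may be modeled behaviorally as follows. The names `ARRAY` and `prefetch`, the toy bit pattern, and the eight-bit prefetch width are hypothetical conveniences; the actual device performs this in hardware.

```python
# Behavioral sketch (not the actual circuit): an eight-bit prefetch reads
# eight bits from the cell array at once, in response to a single address.
ARRAY = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1]  # toy cell contents

def prefetch(array, address, width=8):
    """Read `width` bits in one internal access, starting at the
    aligned group of cells that contains `address`."""
    base = (address // width) * width      # aligned group boundary
    return array[base:base + width]

# The externally requested bit may sit anywhere inside the group, so the
# eight bits arrive in array order and must be reordered before the DQ pad:
bits = prefetch(ARRAY, 11)                 # address 11 falls in group 8..15
print(bits)
```

The point of the sketch is only that a single address input yields many bits at once, and that those bits arrive in array order rather than in the order the requesting device needs them.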
Ordering of the data is accomplished in device 100 by having all GIO lines 114A-114H from array 102 come together in a data sequencing block 118, before reaching DQ pad 120. Block 118 receives eight inputs, one from each data path, and outputs the same eight inputs in the desired order, subject to ordering signals.
The ordered data is then serialized, by a serializing block 119. Block 119 receives all the inputs, and outputs them one by one to DQ pad 120.
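The combined behavior of data sequencing block 118 and serializing block 119 may be sketched as follows. This is a behavioral model only; the function names, the representation of the ordering signals as a list of indices, and the example burst order are assumptions for illustration.

```python
def sequence(inputs, order):
    """Centralized data sequencing block: permute all eight inputs at
    once according to the ordering signals (modeled as index list)."""
    return [inputs[i] for i in order]

def serialize(ordered):
    """Serializing block: emit the ordered bits one by one to the DQ pad."""
    for bit in ordered:
        yield bit

data = ['b0', 'b1', 'b2', 'b3', 'b4', 'b5', 'b6', 'b7']  # from GIO lines
order = [3, 4, 5, 6, 7, 0, 1, 2]  # e.g. a burst starting at position 3

stream = list(serialize(sequence(data, order)))
print(stream)
```

Note that in this centralized arrangement all eight inputs must converge at one point before any reordering can occur, which is precisely what makes block 118 large.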
Referring now to FIG. 2, a portion 118-1 of data sequencing block 118 is shown. It will be appreciated that only four inputs and four outputs are shown in portion 118-1. Since actual block 118 has eight inputs, it is commensurately larger.
Block 118 occupies space that is desirable to allocate elsewhere in the memory device. In addition, as external data rates increase, the number of prefetched data words is increased, and thus block 118 must become commensurately larger. For example, to handle twice the number of inputs would require four times the complexity and size. That would make it occupy even more space on the device 100.
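The quadratic growth noted above follows from the fact that a block routing any of its N inputs to any of its N outputs needs on the order of N x N crosspoint switches. A small arithmetic check (the function name is illustrative):

```python
def crosspoints(n):
    """A full crossbar routing any of n inputs to any of n outputs
    needs on the order of n * n crosspoint switches."""
    return n * n

print(crosspoints(8))                      # eight inputs
print(crosspoints(16))                     # twice the inputs
print(crosspoints(16) // crosspoints(8))   # growth factor
```

Doubling the inputs from eight to sixteen thus quadruples the switch count from 64 to 256, consistent with the four-fold growth in complexity and size stated above.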
Referring now to FIG. 3, the prefetched data is received by local sense amplifiers LS/A 110A-110H. The data is then advanced on the GIO lines 114A-114H, and then optionally passed through respective Input/Output Sense Amplifiers (I/OSA) 124A-124H, upon exiting MCA 102. The data is then advanced along respective individual operation blocks (also known as pipelines) 144A-144H, prior to reaching the data sequencing block 118. Accordingly, the data may be operated on as it is being advanced along pipelines 144A-144H.
In the large majority of cases, pipelines 144A-144H are identical to each other, as identical operations are performed for all read out data. Furthermore, sometimes it is advantageous that pipelines 144A-144H be decomposed into sequential stages. Each such stage is appropriately called a pipe, and performs only one of the operations.
Referring now to FIG. 4, a detail of pipeline 144A is shown. A more detailed explanation can be found in U.S. Pat. No. 5,802,596.
Pipeline 144A includes a first stage pipe 221, a second stage pipe 222 and a third stage pipe 223. The input signal enters the first stage pipe 221 and exits the third stage pipe 223. A first gate 231 is interposed between the first stage pipe 221 and the second stage pipe 222. A second gate 232 is interposed between the second stage pipe 222 and the third stage pipe 223. First gate 231 and second gate 232 are controlled by the clock signal through respective delay circuits 241, 242. As such, data is processed along pipeline 144A at the speed of the clock.
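The clocked advance of data through the three stage pipes may be sketched behaviorally as follows. The identity stage functions are placeholders, since the specific operations performed by the pipes are not specified here; `make_pipeline` and `tick` are hypothetical names.

```python
# Behavioral sketch of pipeline 144A: three stage pipes separated by
# gates that pass data onward on each clock tick.
def make_pipeline(stages):
    regs = [None] * len(stages)            # data held between the gates
    def tick(new_input):
        nonlocal regs
        # On each clock edge both gates open: data advances one stage.
        shifted = [new_input] + regs[:-1]
        regs = [f(x) if x is not None else None
                for f, x in zip(stages, shifted)]
        return regs[-1]                    # output of the third stage pipe
    return tick

pipe = make_pipeline([lambda x: x, lambda x: x, lambda x: x])
outs = [pipe(v) for v in [7, 8, 9, None, None]]
print(outs)   # the first value emerges after three clock ticks
```

The model shows the essential property of the pipeline: each datum occupies one pipe per clock, so the first input appears at the output only after three ticks, while subsequent inputs follow at the clock rate.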
Referring now to FIG. 5, a circuit is shown for first gate 231. It will be observed that it receives a signal from previous stage 221, and outputs it to next stage 222. It operates from a latch signal Lt derived from a clock.
3. Summary of the Invention
The present invention overcomes these problems and limitations of the prior art.
Generally, the present invention provides a memory device that is adapted for prefetching data, and a circuit and a method for reordering data within paths. The memory device of the invention has a memory cell array, with local sense amplifiers for receiving data bits prefetched from the memory cell array. The memory device of the invention also includes a serializer, and data paths that connect the local sense amplifiers to the serializer.
The invention additionally provides crossover connections interposed between stages of the data paths. These may transfer data bits from one of the data paths to another, before the data exits the data path. Preferably, the crossover connections do so by serving as connecting switches between the stages. The stages are in turn controlled by an internal clock signal.
The invention offers the advantage that ordering is distributed within the data paths, and thus does not limit how fast the data rate may become. In addition, the space used remains at a fundamental minimum.
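The distributed-ordering idea may be illustrated behaviorally as follows. This sketch assumes eight data paths and three ranks of pairwise crossover switches, one rank between each pair of stages, with each rank exchanging paths whose indices differ in one address bit; such an exchange network realizes the interleaved (address-XOR) burst ordering used in some DRAMs. The scheme and all names are illustrative assumptions, as the invention describes the crossover connections generically.

```python
def crossover_rank(paths, bit, enable):
    """One rank of crossover switches between two pipeline stages: when
    enabled, each switch exchanges the two paths differing in `bit`."""
    if not enable:
        return paths[:]
    return [paths[i ^ (1 << bit)] for i in range(len(paths))]

def distributed_order(paths, start):
    """Advance data through three stages, applying one small crossover
    rank per stage according to the bits of the burst start position."""
    for bit in range(3):                   # eight paths -> three ranks
        paths = crossover_rank(paths, bit, (start >> bit) & 1)
    return paths

data = ['b0', 'b1', 'b2', 'b3', 'b4', 'b5', 'b6', 'b7']
print(distributed_order(data, 3))          # path 0 now carries bit b3
```

Because each rank contains only eight simple two-way switches, the total hardware grows roughly as N log N rather than as the N-squared crosspoints of a centralized sequencing block, and no single point must collect all eight paths at once.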
The invention will become more readily apparent from the following Detailed Description, which proceeds with reference to the drawings, in which: