FIG. 1 shows a generalized organization of a memory system, comprising an address input register 100, x- and y-decoders 101 and 102 (also known as row and column decoders), a memory array 103, a sense amplifier circuit 104, a memory output register 105, and output buffers 106. In this organization, the address input register 100 receives a two-part address, which is decoded by the x- and y-decoders 101 and 102 to select the corresponding memory cell in memory array 103. The content of the selected memory cell is read by the sense amplifier circuit 104 and latched into the memory output register 105. Register 105 is typically a "see through" latch, such that a transition at the input of the latch is immediately reflected at the output of the latch. "Master-slave" latches are not used in this application because they require two clock edges to operate and are therefore necessarily slower. The output buffers 106, having greater current sourcing and sinking capabilities than the memory output register 105, provide the output data at specified voltage levels to receiving devices external to the memory system.
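The selection of a memory cell by a two-part address may be illustrated by the following minimal sketch. The function names, the array dimensions, and the assumption that the column (y-) part occupies the low-order address bits are hypothetical and are chosen for illustration only; the organization of FIG. 1 does not prescribe any particular bit assignment.

```python
def split_address(address, column_bits):
    """Split a two-part address into (row, column) fields.

    Assumption: the column (y-) part occupies the low-order bits.
    """
    column = address & ((1 << column_bits) - 1)
    row = address >> column_bits
    return row, column


def read_cell(array, address, column_bits):
    """Model the x- and y-decoders selecting one cell of the array."""
    row, column = split_address(address, column_bits)
    return array[row][column]


# Hypothetical 4-row by 8-column array; each cell holds its own address.
array = [[r * 8 + c for c in range(8)] for r in range(4)]

print(split_address(0b10011, 3))     # row 2, column 3
print(read_cell(array, 0b10011, 3))  # 19
```

In this sketch the row field plays the role of the input to x-decoder 101 and the column field the role of the input to y-decoder 102.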
In many applications, successive accesses to the memory system are made to contiguous memory locations. This pattern of memory access, called "sequential access," may be exploited to implement a high performance memory system. One method of taking advantage of this access pattern is to latch into memory output register 105, in addition to the datum corresponding to the specified address, data corresponding to memory cells having addresses contiguous to the specified address; that is, to "prefetch" data into the memory output register 105 in anticipation of contiguous accesses immediately following. Because the additional data fetched are stored in registers, subsequent data may be made available in the period of time required to read each register, which is shorter than the time required for the first or "initial" access. With prefetching, the total time for completing a number of sequential accesses is significantly reduced when compared to the total time of the same number of individual accesses without prefetching.
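The effect of prefetching on the number of full ("initial") accesses may be illustrated by the following toy model. The class name, the prefetch width of four, and the stored values are assumptions made for illustration; the model counts core accesses only and does not represent any timing.

```python
class PrefetchingMemory:
    """Toy model: one core access fills several output registers."""

    def __init__(self, cells, prefetch_width):
        self.cells = cells
        self.width = prefetch_width
        self.base = None        # address of the first prefetched datum
        self.registers = []     # data currently held in the registers
        self.core_accesses = 0  # number of slow, full accesses made

    def read(self, address):
        hit = (
            self.base is not None
            and self.base <= address < self.base + len(self.registers)
        )
        if not hit:
            # Initial access: fetch the requested datum and its neighbors.
            self.base = address
            self.registers = self.cells[address:address + self.width]
            self.core_accesses += 1
        return self.registers[address - self.base]


# Hypothetical contents: 32 cells holding the values 100..131.
mem = PrefetchingMemory(list(range(100, 132)), prefetch_width=4)
data = [mem.read(a) for a in range(8)]
print(data)               # the eight data at addresses 0 through 7
print(mem.core_accesses)  # 2 core accesses serve 8 sequential reads
```

Eight sequential reads are served by two core accesses in this model; the remaining six reads are served from the registers.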
FIG. 2 shows an example of a timing scheme in a system having an organization such as shown in FIG. 1. The first data output is provided after a total access time (tAA), as measured from the time address data is made available to the address input register 100 to the time when data output is made available at the memory output buffers 106. In this example, tAA has two components: (i) core access time (tASA), i.e., the period of time from when address data is ready at the address input register 100 to when data is ready to be latched at the input terminals of memory output register 105; and (ii) output enable time (tRCO), i.e., the period of time from when the memory output register 105 is provided an enable signal ("clock") to gate the content of the register onto the output terminals of the register to when data output is ready at the output buffers 106. In the ideal case, i.e., when data are latched as soon as they are made available to the memory output register 105, tAA is the sum of tASA and tRCO. In the ideal system, where maximum memory access overlap is exploited, the data stored at the next contiguous addresses are ready at the memory output buffers 106 every tRCO after the initial access, rather than every tAA, as required for the initial access. In this mode of access, called "burst" mode, only the initial address is specified, and data from contiguous addresses are provided sequentially thereafter until the burst mode is terminated or all the prefetched data are output.
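The ideal timing relationship described above may be worked through numerically as follows. This is a minimal sketch: the function names are hypothetical, and the values of 80 and 20 time units for tASA and tRCO are illustrative assumptions, since no figures are given here.

```python
def initial_access_time(t_asa, t_rco):
    """Ideal total access time tAA: core access plus output enable time."""
    return t_asa + t_rco


def burst_total_time(n, t_asa, t_rco):
    """Ideal time for n sequential accesses in burst mode.

    Only the first access pays the full tAA; each later datum
    follows one tRCO after the previous one.
    """
    if n <= 0:
        return 0
    return initial_access_time(t_asa, t_rco) + (n - 1) * t_rco


def non_burst_total_time(n, t_asa, t_rco):
    """Time for n individual accesses, each paying the full tAA."""
    return n * initial_access_time(t_asa, t_rco)


# Illustrative values only (arbitrary time units).
T_ASA, T_RCO = 80, 20
print(burst_total_time(8, T_ASA, T_RCO))      # 100 + 7 * 20 = 240
print(non_burst_total_time(8, T_ASA, T_RCO))  # 8 * 100 = 800
```

With these assumed values, eight sequential accesses complete in 240 time units in ideal burst mode, as against 800 time units for eight individual accesses.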
In the prior art, such as the implementation shown in FIG. 3, the ideal speed-up is limited by the number of data prefetched, since a new initial access must be made after the last datum is read from the registers in which the prefetched data are stored. FIG. 3 shows, for example, a memory system 30 similar to the memory system shown in FIG. 1, organized such that each bit is selected by the two-part address as discussed above; the row (x-) address part is stored in address counter 300, and the column (y-) address A0-A3 selects which of the memory output registers R0-R7 is output. The output of memory system 30 is 8 bits wide. In this organization, sixty-four bit lines are activated simultaneously, so that the memory array 303 provides simultaneously to registers R0 through R7 sixty-four bits (8 bytes), corresponding to eight 8-bit data from eight contiguous addresses. Each 8-bit datum can therefore be provided for output sequentially by selectively enabling the outputs of registers R0 through R7 in order of each datum's address. The necessary enabling signals, or clock signals, are provided by the control logic 307. The registers R0 through R7 are also provided with output buffers. However, the maximum number of bytes output in burst mode is limited to the width of the row, i.e., eight in this example. To receive the next eight bytes of contiguous data, whether in the next row or in the same row, the device requesting memory access must go into a wait state or "stutter", as it is known in the art, until another initial access is made to the required data. It is desirable to have a memory system in which all subsequent accesses are provided in burst mode after the initial access, regardless of whether row boundaries are crossed.
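The cost of the "stutter" may be estimated with the following minimal sketch. It assumes that the read begins at a row boundary, that tAA is the sum of tASA and tRCO, that a new initial access cannot be overlapped with the output of the preceding row, and that tASA and tRCO are 80 and 20 time units; the function names and all numerical values are hypothetical.

```python
def row_limited_burst_time(n, t_asa, t_rco, row_width=8):
    """Time to read n contiguous data when a burst is capped at row_width.

    Assumes the read starts at a row boundary, so a new initial access
    (tAA = tASA + tRCO) is needed for every row_width data.
    """
    t_aa = t_asa + t_rco
    full_rows, remainder = divmod(n, row_width)
    total = full_rows * (t_aa + (row_width - 1) * t_rco)
    if remainder:
        total += t_aa + (remainder - 1) * t_rco
    return total


def unlimited_burst_time(n, t_asa, t_rco):
    """Desired behavior: one initial access, then one datum every tRCO."""
    if n <= 0:
        return 0
    return (t_asa + t_rco) + (n - 1) * t_rco


# Illustrative values only (arbitrary time units).
T_ASA, T_RCO = 80, 20
for n in (8, 16, 24):
    limited = row_limited_burst_time(n, T_ASA, T_RCO)
    ideal = unlimited_burst_time(n, T_ASA, T_RCO)
    print(n, limited, ideal, limited - ideal)
```

Under these assumptions, each row boundary crossed adds one tASA of waiting: reading 16 bytes takes 480 time units in the row-limited organization as against 400 time units if the burst could continue across the boundary.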