A method and apparatus for controlling data access in a memory system having memory devices with a retire buffer are disclosed. The method and apparatus are particularly well adapted for use in a memory system implementing write requests in two steps: transport and retire.
During the last several decades, memory technology has progressed dramatically. The density of commercial memory devices, taking Dynamic Random Access Memory (DRAM) as a convenient example, has increased from 1 Kbit to 64 Mbits per chip, a factor of 64,000. Unfortunately, memory device performance has not kept pace with increasing memory device densities. In fact, memory device access times during the same time period have only improved by a factor of 5. By comparison, during the past twenty years, microprocessor performance has increased by several orders of magnitude. This growing disparity between the speed of microprocessors and that of memory devices has forced memory system designers to create a variety of complicated and expensive hierarchical memory techniques, such as Static Random Access Memory (SRAM) caches and parallel DRAM arrays.
Further, now that computer system users increasingly demand high performance graphics and other memory-hungry applications, memory systems often rely on expensive frame buffers to provide the necessary data bandwidth. Increasing memory device densities satisfy the overall quantitative demand for data with fewer chips, but the problem of effectively accessing data at peak microprocessor speeds remains.
In conventional DRAMs, including Extended Data Output (EDO) devices and SRAMs, a simple protocol is used for memory access. Memory access requests typically include a Row Address Strobe (RAS or Row) request, followed by one or more Column Address Strobe (CAS or Column) requests, and an optional Precharge request. Other well known maintenance requests, such as Refresh, are also performed, but these requests need not be reviewed in detail to understand the present invention.
In an attempt to improve data access speed, conventional memory systems "pipeline" memory access requests on the bus(es) to improve efficiency. Pipelining is a form of bus multiplexing which communicates a time-multiplexed sequence of memory access requests from a memory controller to one or more memory devices.
While pipelining memory access requests generally improves efficiency, it also creates problems. For example, differing timing and physical constraints for the memory system resources involved in memory requests can stall the pipeline for certain access request sequences. At a minimum, the "bubbles" formed in the flow of data by these constraints increase data access time, thus reducing the access efficiency initially sought by implementing a memory system with pipelining.
An example of a data bubble is illustrated in FIG. 1. In the example, three signal groups are related in time. A sequence of read and write requests is sent via a control bus 10. Data is sent via a separate data bus 12. Column I/O signaling (or a column I/O resource 14) accesses the internal core of the DRAM in response to the read/write requests. The example assumes a series of write requests followed by a read request. In this example, the unavailability of the column I/O resource 14 causes a data bubble on data bus 12 as the memory system executes the second write request 1A and thereafter the read request 2A.
In particular, a packet command indicating a write request 1A appears on control bus 10, followed by corresponding write data 1B, which appears on data bus 12 a short time later. Once the write data 1B appears on data bus 12, the column I/O resource 14 actually accesses the DRAM core and writes the data 1C into memory. However, well before column I/O resource 14 is finished performing write request 1C, a packet command indicating a read request 2A appears on control bus 10. The read request cannot be performed until the column I/O resource 14 becomes available. The resulting time lag in read data 2C appearing on data bus 12 results from the additional time required to finish write 1C, to make column I/O resource 14 available, and to perform read request 2B.
In the conventional pipelined memory system, data access stalling occurs with each data bubble created when a read request follows a write request (or when a read request follows multiple write requests, as determined by the particular timing requirements of the memory system). Such repeated stalling unacceptably lowers data access efficiency in the memory system.
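By way of illustration only, the read-after-write stall described above may be modeled with a few lines of Python. This is a minimal sketch and not a description of any particular device: the function name and every cycle count are hypothetical, since no timing values are given here, and the model assumes the column I/O resource begins a write only after the write data packet has fully arrived.

```python
# Toy timing model of a read request that follows a write request.
# All cycle counts are hypothetical assumptions chosen for illustration.
CMD_TO_WRITE_DATA = 2  # cycles from write request on control bus to write data on data bus
DATA_LEN = 4           # cycles a data packet occupies the data bus
COL_WRITE = 4          # cycles the column I/O resource is busy per write
COL_READ = 4           # cycles the column I/O resource is busy per read


def read_after_write_stall(write_cmd_time, read_cmd_time):
    """Return (stall_cycles, read_data_time) for a write followed by a read."""
    # Write data follows the write request onto the data bus.
    write_data_end = write_cmd_time + CMD_TO_WRITE_DATA + DATA_LEN
    # Assumption: the column I/O resource writes the core after the data arrives.
    col_write_end = write_data_end + COL_WRITE
    # The read cannot use the column I/O resource until the write releases it.
    col_read_start = max(read_cmd_time, col_write_end)
    stall_cycles = col_read_start - read_cmd_time
    # Read data appears on the data bus once the column read completes.
    read_data_time = col_read_start + COL_READ
    return stall_cycles, read_data_time


print(read_after_write_stall(0, 4))   # read arrives while the write is in flight
print(read_after_write_stall(0, 20))  # read arrives after the write has finished
```

Under these assumed values, a read request arriving at cycle 4 waits 6 cycles for the column I/O resource, whereas a read request arriving at cycle 20 does not wait at all.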
Of further note, conventional memory controllers have been designed which include write by-pass buffers and write stall buffers. In fact, write by-pass buffers have often been incorporated further "up-stream" from the memory controller in the microprocessor. Write by-pass buffers simply store a write command, including the write data, in a buffer in a microprocessor or memory controller. If a following read command is directed to the address of the buffered write command, the normal data read routine is by-passed and the desired data is taken directly from the write by-pass buffer. Write stall buffers uniformly delay the execution of all write commands by some predetermined period of time.
However, conventional write by-pass and stall buffers do not solve the problem of pipeline data bubbles. Further, such conventional write buffers hold write data outside the memory system, and in particular hold write data outside the memory device.
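The behavior of a conventional write by-pass buffer may be sketched as follows. The class and method names are hypothetical, and a Python dictionary stands in for the memory; the sketch is intended only to show that a read directed to a buffered write address is served from the buffer rather than from memory.

```python
class WriteBypassBuffer:
    """Hypothetical model of a write by-pass buffer held in a controller."""

    def __init__(self):
        # Maps the address of each buffered write command to its write data.
        self._pending = {}

    def write(self, address, data):
        # The write command and its data are held outside the memory device.
        self._pending[address] = data

    def read(self, address, memory):
        # A read to a buffered address by-passes the normal read routine.
        if address in self._pending:
            return self._pending[address]
        return memory[address]

    def drain(self, memory):
        # Buffered writes are eventually committed to memory.
        memory.update(self._pending)
        self._pending.clear()


memory = {0x10: 1}
buf = WriteBypassBuffer()
buf.write(0x10, 99)
print(buf.read(0x10, memory))  # served from the buffer; memory still holds the old value
```

Note that in this arrangement the write data resides in the controller, not in the memory device.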
In a memory system that executes write requests using transport and retire steps, and that comprises a memory controller and memory devices having a retire buffer, the present invention ensures data accuracy and proper scheduling of memory access requests. Read requests following one or more write requests are evaluated on the basis of their address, or an address component, in relation to the address associated with one or more write requests having un-retired write data.
In one aspect, the present invention incorporates an un-retired write buffer and at least one comparator circuit in a memory controller. When a write request is received in the memory controller, an address, whether a partial address or a full address, associated with the write request is stored in the un-retired write buffer. Two or more consecutive write requests may have an associated write address queued in the un-retired write buffer.
Write data stored in one or more retire buffers in a memory device is inherently retired according to the memory system constraints, most notably bus timing constraints.
When a read request is received in the memory controller following one or more write requests, the read address associated with the read request is compared to write address(es) stored in the un-retired write buffer. If the read address matches a write address in the un-retired write buffer, the read request is stalled in the memory controller until such time as the write data corresponding to the matching write address is retired.
If the read address does not match any write address(es) in the un-retired write buffer, the read request is issued and may be executed before the write data stored in one or more retire buffers is retired into memory.
According to this aspect of the present invention, only read requests to addresses having a pending retire operation cause a stall in the memory controller. All other read requests may be immediately executed. In this manner, read data accuracy is preserved while maintaining the data efficiency offered by a pipelined memory system and a transport and retire write request.
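The comparison of a read address against queued write addresses may be sketched in software as follows. This is an illustrative model under stated assumptions and not the disclosed circuit: the class name, the `address_mask` parameter used to model a partial-address comparison, and the "stall"/"issue" return values are all hypothetical.

```python
from collections import deque


class UnretiredWriteBuffer:
    """Hypothetical model of the un-retired write buffer and comparator."""

    def __init__(self, address_mask=None):
        # Queue of write addresses whose write data has not yet been retired.
        self._addresses = deque()
        # Assumption: a mask selects the address bits compared (partial address).
        self._mask = address_mask

    def _key(self, address):
        return address if self._mask is None else address & self._mask

    def record_write(self, address):
        # Called when a write request is received in the controller.
        self._addresses.append(self._key(address))

    def retire_oldest(self):
        # Called when the oldest buffered write data is retired into memory.
        return self._addresses.popleft()

    def matches(self, read_address):
        # Comparator: does the read address match any un-retired write address?
        return self._key(read_address) in self._addresses


def schedule_read(buffer, read_address):
    """Stall a read that matches an un-retired write; otherwise issue it."""
    return "stall" if buffer.matches(read_address) else "issue"


buf = UnretiredWriteBuffer()
buf.record_write(0x100)
buf.record_write(0x200)
print(schedule_read(buf, 0x200))  # matches a pending write address
print(schedule_read(buf, 0x300))  # no match, so the read may proceed
```

When only a partial address is compared, reads to different full addresses that share the compared bits also match. Such reads are stalled unnecessarily, but read data accuracy is still preserved.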