For some time, there has been a trend toward larger memory capacity in computers. Although larger memory capacities provide a number of well-known advantages, large memory systems also present several difficulties. In general, as memory size increases, the time needed to access memory also increases, other factors remaining equal.
In many modern computing systems, the memory is provided in a hierarchical scheme. In such a scheme, a large, relatively slow memory is used in combination with a smaller, faster memory, which contains a subset of the larger memory. For example, a main memory containing relatively slow dynamic random access memory (DRAM) is used in combination with a smaller static random access memory (SRAM), often referred to as a "cache." Other hierarchies include providing a large memory, in the form of relatively slow disk storage, used in combination with the relatively faster DRAM main memory. A memory hierarchy might contain all three of these levels: long-term disk storage, main DRAM storage, and SRAM cache storage.
In many memory systems, a virtual addressing method is used. A virtual address is an address which contains enough bits to provide a unique identification for each user-accessible memory location. Physical memory is accessed through a physical address, to which the virtual address must be mapped.
When a request is made for the contents of memory identified by its virtual address, it must be determined whether the virtual address corresponds to a memory location currently residing in the physical memory. An address "corresponds" to another address if each corresponds to the same memory location.
One or more tables are usually provided to translate a virtual address to a corresponding physical address (if there is a correspondence). A look-up procedure for such a table is often relatively slow. The table typically contains translations only for blocks of memory, often referred to as "pages." Fortunately, it has been found that references to the page table exhibit locality, i.e., during any one short period of time, a few of the possible virtual pages tend to be looked up repeatedly. This locality permits a certain saving of time by providing a second, smaller and faster table, referred to as a "page table cache" (PTC) (or, sometimes, a "translation lookaside buffer"), which is used to contain the most recently accessed entries from the larger page table. The PTC thus contains a subset of the page table which is likely to include the entries that will be subsequently requested. The PTC includes two arrays. One array is the PTC entry or data array, which holds the virtual-to-physical address map. The other array is the PTC tag, which is used to determine whether or not the data in the PTC entry is valid.
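The two-array organization of the PTC can be illustrated with a minimal sketch. The sketch is not taken from any particular system: the page size, the number of PTC slots, the direct-mapped slot selection, and the names `PageTableCache` and `translate` are all assumptions made for illustration only.

```python
# Illustrative sketch only. PAGE_BITS, PTC_SLOTS, the direct-mapped
# organization, and all names below are assumptions, not details from the text.
PAGE_BITS = 12    # assumed page size of 4096 bytes
PTC_SLOTS = 16    # assumed number of PTC slots


class PageTableCache:
    def __init__(self, page_table):
        self.page_table = page_table        # slow table: virtual page -> physical page
        self.tags = [None] * PTC_SLOTS      # tag array: which virtual page each slot holds
        self.entries = [None] * PTC_SLOTS   # data array: the cached physical page number
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        """Return the physical address, or None if there is no correspondence."""
        vpage = vaddr >> PAGE_BITS
        offset = vaddr & ((1 << PAGE_BITS) - 1)
        slot = vpage % PTC_SLOTS
        if self.tags[slot] == vpage:
            # The tag matches, so the data array entry is valid for this page.
            self.hits += 1
        else:
            # Fall back to the slower page table and refill the slot.
            self.misses += 1
            if vpage not in self.page_table:
                return None
            self.tags[slot] = vpage
            self.entries[slot] = self.page_table[vpage]
        return (self.entries[slot] << PAGE_BITS) | offset
```

With a hypothetical page table of `{0x1: 0x80}`, the first call to `translate(0x1234)` misses in the PTC and consults the page table, while a second call for any address in the same page is satisfied from the tag and data arrays alone.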
In addition to the locality exhibited by the page table, a certain locality is also exhibited by the memory itself. According to this locality phenomenon, a given reference to a memory location is likely (i.e., with greater probability than expected from randomness) to be followed, within a relatively short period of time, by a request for a nearby memory location. The probability distributions, which define "nearby" and "relatively short period of time," can be determined empirically for a given memory system and computing task. This locality has been exploited in several ways to decrease average memory access time. One method of this exploitation is the provision of a fast page dynamic random access memory (FPDRAM).
FPDRAM can best be understood by contrasting it with ordinary memory access. In a typical DRAM, memory locations are addressed by row and column, with each row containing elements with contiguous addresses. In normal access, a row address is presented and strobed into a latch by asserting a row address strobe (RAS). Later, a column address is presented and a column address strobe (CAS) is asserted to perform the read or write of the DRAM. Each new access must go through the entire cycle, i.e., presenting and strobing the row address and, subsequently, presenting and strobing a column address. Because both a row address and a column address must be sequentially strobed for every access, even when successive accesses are to locations that are relatively close together, this type of access does not take full advantage of the memory locality.
In contrast, FPDRAM takes advantage of memory locality, i.e., situations in which access to a memory location in a particular row is relatively likely to be followed, within a short time, by a request for a memory location in the same row. In this case, it is possible to leave RAS asserted for a relatively long period, during which multiple accesses to the selected row (defined by the contents of a row address latch) may be performed. The multiple accesses are achieved by sequentially presenting column addresses while a single row address is continuously asserted. For each access to memory in a row which has been previously accessed (i.e., a row addressed by the contents of one of the row address latches), only assertion of CAS is required, i.e., it is not necessary to sequentially assert RAS and then assert CAS for each access. In this way, when a subsequent memory request is made for a location in the same row, the requested location is available for reading or writing by the relatively fast procedure of presenting a new column address and asserting the column address strobe (CAS). When a request is made for a memory location in another row (assuming there are no other FPDRAM row address latches usable for this request), the normal access procedure is followed, i.e., sequential loading and strobing of a row address, then a column address. Thus, to the extent that subsequent memory accesses are to locations in the same row of memory, relatively fast FPDRAM memory access is used, rather than the slower normal memory access.
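The difference between normal access and fast page access can be illustrated with a simple cost model. The sketch below assumes a single row address latch; the row size and the cycle counts `T_FULL` and `T_CAS_ONLY` are hypothetical values chosen only to make the comparison concrete, and do not describe any particular device.

```python
# Illustrative cost model only. The row size and both cycle counts are
# assumed values, and a single row address latch is assumed.
ROW_SHIFT = 10    # assumed: 1024 column locations per row
T_FULL = 5        # assumed cost of a full RAS-then-CAS access
T_CAS_ONLY = 2    # assumed cost of a CAS-only access to the open row


def normal_cost(addresses):
    """Every access strobes a row address and then a column address."""
    return len(addresses) * T_FULL


def fast_page_cost(addresses):
    """Accesses to the row held in the latch need only a new column address."""
    open_row = None
    total = 0
    for addr in addresses:
        row = addr >> ROW_SHIFT
        if row == open_row:
            total += T_CAS_ONLY      # same row: RAS stays asserted
        else:
            total += T_FULL          # different row: full access, latch the new row
            open_row = row
    return total
```

For the address sequence `[0, 1, 2, 1024, 1025, 5]`, which touches rows 0, 0, 0, 1, 1 and 0 under the assumed row size, `normal_cost` gives 30 cycles and `fast_page_cost` gives 21 cycles. The saving grows with the fraction of accesses that fall in the currently open row.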
To take advantage of the speed of an FPDRAM, it is necessary to determine, for any memory request, whether that request is for a memory element which resides in one of the rows corresponding to a row address in one of the row address latches. In previous systems known to the Applicant, an indication of the physical row address for the row or rows most recently accessed was stored. When a physical address request was made, a comparison was performed to determine whether the requested address was in a row recently addressed, i.e., for which the RAS was still being asserted. When a virtual address request was made, the virtual address was first translated into a physical address, and then the comparison was made with the stored physical addresses.
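The two-step procedure described above for a virtual address request, translation followed by comparison, can be sketched as follows. The sketch is an illustration under stated assumptions rather than a description of any actual prior system: the page size, the row size, the number of row address latches, the least-recently-used replacement of stored rows, and the names `RowTracker`, `access_physical`, and `access_virtual` are all hypothetical.

```python
# Illustrative sketch only. All sizes, the replacement policy for stored
# rows, and all names are assumptions made for this example.
from collections import OrderedDict

PAGE_BITS = 12     # assumed page size of 4096 bytes
ROW_SHIFT = 10     # assumed: 1024 column locations per row
NUM_LATCHES = 2    # assumed number of row address latches


class RowTracker:
    def __init__(self):
        # Stored physical row addresses, least recently used first.
        self.open_rows = OrderedDict()

    def access_physical(self, paddr):
        """Return True if the physical address lies in a stored (open) row."""
        row = paddr >> ROW_SHIFT
        hit = row in self.open_rows
        if hit:
            self.open_rows.move_to_end(row)
        else:
            if len(self.open_rows) >= NUM_LATCHES:
                self.open_rows.popitem(last=False)   # reuse the oldest latch
            self.open_rows[row] = None
        return hit

    def access_virtual(self, vaddr, page_table):
        """Translate first, then compare; None means no correspondence."""
        ppage = page_table.get(vaddr >> PAGE_BITS)
        if ppage is None:
            return None
        offset = vaddr & ((1 << PAGE_BITS) - 1)
        # The row comparison cannot begin until the translation has finished.
        return self.access_physical((ppage << PAGE_BITS) | offset)
```

In this arrangement the row comparison for a virtual address request is strictly serial with the translation, so the time to decide between fast page access and normal access includes the full translation time.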
Even with the relatively fast access provided by FPDRAM, memory access is still a limiting factor in many systems, particularly those with large memories. Such large memory systems typically are limited by the speed of a cache-fill operation and the write bandwidth.