Not applicable.
1. Field of the Invention
The present invention generally relates to a computer system that includes one or more random access memory ("RAM") devices. More particularly, the invention relates to a computer system with RAM devices in which a large number of pages in each RAM device can be activated simultaneously. Still more particularly, the invention relates to a mechanism to track and effectively manage the status of all potentially activated RAM pages.
2. Background of the Invention
Superscalar processors achieve high performance by executing multiple instructions per clock cycle and by choosing the shortest possible clock cycle consistent with the design. On the other hand, superpipelined processor designs divide instruction execution into a large number of subtasks which can be performed quickly, and assign pipeline stages to each subtask. By overlapping the execution of many instructions within the pipeline, superpipelined processors attempt to achieve high performance.
Superscalar processors demand low memory latency due to the number of instructions attempting concurrent execution and due to the increasing clock frequency (i.e., shortening clock cycle) employed by the processors. Many of the instructions include memory operations to fetch ("read") and update ("write") memory operands. The memory operands must be fetched from or conveyed to memory, and each instruction must originally be fetched from memory as well. Similarly, processors that are superpipelined demand low memory latency because of the high clock frequency employed by these processors and the attempt to begin execution of a new instruction each clock cycle. It is noted that a given processor design may employ both superscalar and superpipelined techniques in an attempt to achieve the highest possible performance characteristics.
Processors are often configured into computer systems that have a relatively large and slow main memory. Typically, multiple random access memory ("RAM") modules comprise the main memory system. The RAM modules may be Dynamic Random Access Memory ("DRAM") modules or RAMbus™ Inline Memory Modules ("RIMM") that incorporate a DRAM core (see "RAMBUS Preliminary Information Direct RDRAM™", Document DL0060 Version 1.01; "Direct Rambus™ RIMM™ Module Specification Version 1.0", Document SL-0006-100; "Rambus© RIMM™ Module (with 128/144 Mb RDRAMs)", Document DL00084 Version 1.1, all of which are incorporated by reference herein). The large main memory provides storage for a large number of instructions and/or a large amount of data for use by the processor, providing faster access to the instructions and/or data than may be achieved, for example, from disk storage. However, the access times of modern RAMs are significantly longer than the clock cycle length of modern processors. The memory access time for each set of bytes being transferred to the processor is therefore long. Accordingly, the main memory system is not a low latency system. Processor performance may suffer due to high memory latency.
Many types of RAMs employ a "page mode" which allows for memory latency to be decreased for transfers within the same "page". Generally, RAMs comprise memory arranged into rows and columns of storage. A first portion of the address identifying the desired data/instructions is used to select one of the rows (the "row address"), and a second portion of the address is used to select one of the columns (the "column address"). One or more bytes residing at the selected row and column are provided as output of the RAM. Typically, the row address is provided to the RAM first, and the selected row is placed into a temporary sense amplifier buffer within the RAM. The row of data that is stored in the RAM's sense amplifier buffer is referred to as a page. Thus, addresses having the same row address are said to be in the same page. Subsequent to the selected row being placed into the sense amplifier buffer, the column address is provided and the selected data is output from the RAM. A page hit occurs if the next address to access the RAM is within the same row stored in the sense amplifier buffer. Thus, the next access may be performed by providing the column portion of the address only, omitting the row address transmission. The next access to a different column may therefore be performed with lower latency, saving the time required for transmitting the row address because the page corresponding to the row has already been activated. The size of a page is dependent upon the number of columns within the row. The row, or page, stored in the sense amplifier buffer within the RAM is referred to as an "open page", since accesses within the open page can be performed by transmitting the column portion of the address only.
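By way of illustration only, the row/column address split and the page hit test described above may be sketched as follows. The 10-bit column width, the `SenseAmpBuffer` class, and the function names are assumptions made for this sketch and are not taken from any particular RAM device.

```python
from typing import Optional, Tuple

COLUMN_BITS = 10  # assumed column-address width; real devices vary


def split_address(addr: int) -> Tuple[int, int]:
    """Split a flat address into (row address, column address)."""
    row = addr >> COLUMN_BITS
    column = addr & ((1 << COLUMN_BITS) - 1)
    return row, column


class SenseAmpBuffer:
    """Hypothetical model of one sense amplifier buffer (one open page at most)."""

    def __init__(self) -> None:
        self.open_row: Optional[int] = None  # None means no page is open

    def access(self, addr: int) -> bool:
        """Perform an access and return True if it was a page hit."""
        row, _column = split_address(addr)
        hit = self.open_row == row
        if not hit:
            # The row address must be sent and the row activated;
            # afterwards this row is the open page.
            self.open_row = row
        return hit
```

In this sketch, two consecutive accesses whose addresses share the same upper (row) bits produce a miss followed by a hit, since the second access needs only the column portion of the address.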
Unfortunately, the first access to a given page generally does not occur to an open page, thereby incurring a higher memory latency. Even further, the first access may experience a page miss. A page miss can occur if the sense amplifier has another particular page open, and the particular page must first be closed before opening the page containing the current access. A page miss can also occur if the sense amplifier is empty. Often, this first access is critical to maintaining performance in the processors within the computer system, as the data/instructions are immediately needed to satisfy a miss. Instruction execution may stall because of the page miss while the page containing the current access is being opened.
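The relative cost of the three cases described above (page hit, page miss with an empty sense amplifier, and page miss with another page open) may be sketched as a simple latency model. The cycle counts below are hypothetical values chosen only to show the ordering of the costs; they are not timings of any actual device.

```python
from typing import Optional

# Hypothetical timing values in clock cycles, for illustration only.
T_PRECHARGE = 3  # close the page currently held in the sense amplifier
T_ACTIVATE = 3   # open (activate) the page containing the access
T_COLUMN = 2     # column access within an open page


def access_latency(open_row: Optional[int], row: int) -> int:
    """Return the assumed latency of accessing `row` given the open row."""
    if open_row == row:
        # Page hit: only the column address is needed.
        return T_COLUMN
    if open_row is None:
        # Page miss, sense amplifier empty: activate, then column access.
        return T_ACTIVATE + T_COLUMN
    # Page miss, another page open: close it, activate, then column access.
    return T_PRECHARGE + T_ACTIVATE + T_COLUMN
```

Under these assumed values, a miss to a sense amplifier holding another page costs four times as much as a hit, which is why an instruction stalled on such a miss is costly.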
The more often that instructions can access main memory using page hits, the lower the latency of memory access and the better the system performance. In a memory system containing many RAM devices and thus a large number of sense amplifier buffers, a large amount of memory can be accessed using page hits, resulting in an increased opportunity to maximize performance. Prior art systems and methods cannot take advantage of this opportunity since they are able to track only on the order of four to sixteen activated pages in the memory system page table. Such prior art systems must close pages in the page table when the page table is full, further reducing memory system performance. Thus, such systems are unable to exploit the potential performance improvements of large memory systems that can have over 1000 pages open. These systems require activation of pages that could have been avoided had more pages been tracked, causing inferior memory system performance. Thus, a system and method are needed to track and effectively manage the status of all potentially activated RAM pages.
The problems noted above are solved in large part by a computer system that contains a processor including a memory controller containing a page table, the page table organized into a plurality of rows with each row able to store an address of an open memory page. A RIMM module containing RDRAM devices is coupled to each processor, each RDRAM containing a plurality of memory banks. The page table increases system memory performance by tracking open memory pages. Associated with the page table is a bank active table that indicates the memory banks in each RDRAM device having open memory pages. The page table enqueues accesses to the RIMM module in a precharge queue resulting from a page miss caused by the address of an open memory page occupying the same row of the page table as the address of the system memory access resulting in the page miss, each entry in the precharge queue closing the page in the memory bank referenced by the address stored in the page table row. The page table also enqueues accesses to system memory in a Row-address-select ("RAS") queue resulting from a page miss caused by a row of the page table not containing any open memory page address, the entry in the RAS queue activating the page from the memory bank that caused the page miss and storing the page address into the row of the page table not containing any open memory page address to indicate that the page is open. The page table enqueues accesses to system memory resulting in page hits to open memory pages in a Column-address-select ("CAS") queue, each entry in said CAS queue performing a read or write to the memory device. An entry in the precharge queue after completion is then enqueued into the RAS queue. An entry in the RAS queue after completion is enqueued into the CAS Read queue or CAS Write queue.
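For illustration, the routing of accesses among the precharge, RAS, and CAS queues summarized above may be sketched as follows. This is a simplified model under stated assumptions, not the disclosed implementation: the class and method names are hypothetical, the modulo indexing of page table rows is an assumed mapping, the bank active table is omitted, and hazards between queued accesses that target the same row are ignored.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class Access:
    page: int       # address of the memory page being accessed
    is_write: bool  # True for a write, False for a read


class PageTable:
    """Hypothetical page table that routes accesses to command queues."""

    def __init__(self, num_rows: int = 8) -> None:
        # Each row holds the address of one open page, or None if empty.
        self.rows: List[Optional[int]] = [None] * num_rows
        self.precharge_q: Deque[Access] = deque()
        self.ras_q: Deque[Access] = deque()
        self.cas_read_q: Deque[Access] = deque()
        self.cas_write_q: Deque[Access] = deque()

    def _index(self, page: int) -> int:
        # Assumed mapping of a page address to a page table row.
        return page % len(self.rows)

    def _enqueue_cas(self, acc: Access) -> None:
        queue = self.cas_write_q if acc.is_write else self.cas_read_q
        queue.append(acc)

    def submit(self, acc: Access) -> None:
        entry = self.rows[self._index(acc.page)]
        if entry == acc.page:
            self._enqueue_cas(acc)        # page hit
        elif entry is None:
            self.ras_q.append(acc)        # miss: row holds no open page
        else:
            self.precharge_q.append(acc)  # miss: row holds another page

    def complete_precharge(self) -> None:
        acc = self.precharge_q.popleft()
        self.rows[self._index(acc.page)] = None  # old page is now closed
        self.ras_q.append(acc)

    def complete_ras(self) -> None:
        acc = self.ras_q.popleft()
        self.rows[self._index(acc.page)] = acc.page  # page is now open
        self._enqueue_cas(acc)
```

In this sketch, an access to a page that conflicts with an open page passes through all three stages in order (precharge, then RAS, then CAS read or write), whereas a page hit goes directly to the appropriate CAS queue.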