This invention relates to computer systems and more particularly to memory control mechanisms and techniques employed within computer systems. This invention also relates to performance enhancement and optimization of memory control mechanisms for computer systems.
A variety of techniques have been developed to increase the overall processing speed of computer systems. While improvements in integrated circuit processing technologies such as sub-micron processing capabilities have made it possible to dramatically increase the speed of the integrated circuitry itself, other developments in the architectures and bus transfer mechanisms of computer systems have also led to improvements in performance. Exemplary developments include the incorporation of cache memory subsystems as well as code pre-fetching mechanisms within computer systems.
Typically, accesses to main memory are a performance bottleneck in today's computer systems. For example, when using dynamic random access memory (DRAM) or synchronous DRAM (SDRAM), the need to pre-charge before a memory access can degrade performance.
Main memory is usually divided into chip selects (CS), which are further divided into banks, each organized as an array of rows and columns. A particular CS, bank and row is called a page. In order to access data within the memory, first the bank must be pre-charged, then the row must be activated. Once activated, the row of data is moved from the memory array into a row buffer (one per bank) on the memory chip, from which access to the data occurs. If a row is already active, a subsequent request to data within that row can access the data from the row buffer. This scenario is called a page hit (PH). A PH has the lowest latency (highest performance) because there is no need to pre-charge or activate. A subsequent request to a different, inactive row within a bank that already has an active row from a prior request in the row buffer is called a page conflict (PC). A PC is the highest latency scenario because it requires that the bank be pre-charged and then activated again. Further, the prior request which activated the row currently within the row buffer may not have completed yet. In that case the subsequent PC request is stalled, because the bank cannot be pre-charged until the prior request completes.
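The access scenarios described above can be illustrated with a minimal sketch. It is not taken from the source: the function name `classify_request` and the `row_buffers` mapping are hypothetical, and the sketch assumes that a request to a bank with no active row is what is meant by a "page miss".

```python
from typing import Dict, Tuple

# Hypothetical model: (chip_select, bank) -> row currently held in the row buffer.
# A bank with no entry is assumed to be pre-charged with no active row.
RowBuffers = Dict[Tuple[int, int], int]


def classify_request(row_buffers: RowBuffers, cs: int, bank: int, row: int) -> str:
    """Classify a request against the current row-buffer state."""
    active_row = row_buffers.get((cs, bank))
    if active_row is None:
        # Assumption: no active row in this bank, so only an activate is needed.
        return "page miss"
    if active_row == row:
        # The requested row is already in the row buffer.
        return "page hit"
    # A different row is active: pre-charge, then activate again.
    return "page conflict"


if __name__ == "__main__":
    buffers: RowBuffers = {(0, 1): 42}
    for request in [(0, 1, 42), (0, 1, 7), (0, 2, 7)]:
        print(request, classify_request(buffers, *request))
```

In this model a page hit touches only the row buffer, while a page conflict implies both a pre-charge and an activate, which is why it is the slowest case.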
The combination of these request scenarios, the random nature of memory requests, the ability to retain a page within a row buffer and the time delays involved in accessing the memory give rise to techniques for maintaining the most efficient use of the memory. These techniques are referred to as "page policies." A page is defined as being "open" if it has been activated into the row buffer and any data written to the page has not been updated back into the memory array. A pre-charge operation to a bank will close any page that is currently open in the row buffer and write it back to the memory array. A CS is defined as "open" if there are open pages in any of its row buffers. A CS is "closed" once all open pages are "closed" by being written back to the memory array.
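The open and closed definitions above can be modelled with a short sketch. The class `ChipSelect` and its method names are hypothetical and serve only to illustrate the definitions; the write-back of data to the memory array is represented simply by clearing the row buffer.

```python
from typing import Dict, Optional


class ChipSelect:
    """Hypothetical model of one chip select with one row buffer per bank."""

    def __init__(self, num_banks: int) -> None:
        # None means the bank is pre-charged and holds no open page.
        self.row_buffers: Dict[int, Optional[int]] = {
            bank: None for bank in range(num_banks)
        }

    def activate(self, bank: int, row: int) -> None:
        """Open a page; the bank must have been pre-charged first."""
        if self.row_buffers[bank] is not None:
            raise ValueError("bank must be pre-charged before activation")
        self.row_buffers[bank] = row

    def precharge(self, bank: int) -> None:
        """Close whatever page is open in this bank's row buffer."""
        self.row_buffers[bank] = None

    def is_open(self) -> bool:
        """A CS is open if any of its row buffers holds an open page."""
        return any(row is not None for row in self.row_buffers.values())


if __name__ == "__main__":
    cs = ChipSelect(num_banks=4)
    cs.activate(bank=0, row=5)
    print("open after activate:", cs.is_open())
    cs.precharge(bank=0)
    print("open after pre-charge:", cs.is_open())
```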
A first policy is called an "open" page policy. In an open page policy, once a page is activated and moved into the row buffer for a current request, it is left there after the request completes. The page is only "closed" when another request is directed to a different, inactive row within the same CS and bank (a PC). This policy is effective and efficient if the majority of the requests to the memory are expected to be PHs, since PH requests take the least time to complete. These requests would only have to perform the read or write operation to the row buffer. A problem with this policy occurs, however, if the majority of the memory requests turn out to be PCs. The PC scenario takes the longest time to complete, especially if the prior request has not finished yet, and therefore there is a risk of a significant performance loss.
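The behavior of an open page policy under different request streams can be sketched as follows. This is an illustrative model, not the source's implementation: the function `run_open_page` is hypothetical, requests are assumed to be `(cs, bank, row)` tuples, and a request to a bank with no active row is counted as a "page miss".

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Request = Tuple[int, int, int]  # hypothetical encoding: (cs, bank, row)


def run_open_page(requests: Iterable[Request]) -> Counter:
    """Count access scenarios when pages are left open after each request."""
    active: Dict[Tuple[int, int], int] = {}  # (cs, bank) -> open row
    outcomes: Counter = Counter()
    for cs, bank, row in requests:
        key = (cs, bank)
        current = active.get(key)
        if current is None:
            outcomes["page miss"] += 1      # assumption: bank idle, activate only
        elif current == row:
            outcomes["page hit"] += 1       # row buffer already holds the row
        else:
            outcomes["page conflict"] += 1  # pre-charge, then activate
        active[key] = row  # open page policy: the page stays in the row buffer
    return outcomes


if __name__ == "__main__":
    same_row = [(0, 0, 5)] * 4
    alternating_rows = [(0, 0, 5), (0, 0, 6), (0, 0, 5), (0, 0, 6)]
    print("same row:", dict(run_open_page(same_row)))
    print("alternating rows:", dict(run_open_page(alternating_rows)))
```

The two streams show the tradeoff: repeated access to one row yields page hits after the first activation, whereas alternating rows in one bank turns every later request into a page conflict.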
A second policy is the "Closed Page" policy. Under this policy, only one CS is allowed to be open (have open pages) at any given time. In effect, when a request is received for a different CS, the currently open CS is closed. Part of the process of closing the CS is to send a pre-charge command to that CS. This will force any active pages in the row buffers to be closed. Closing the open CS with a pre-charge command leaves that CS in a pre-charged state for any subsequent requests. Under this policy, there will be fewer PH scenarios occurring and some PH opportunities will be lost (PHs will only occur for requests to any active pages within the one open CS), but the tradeoff is less risk of having PCs occur.
While implementing a closed page policy will greatly reduce the PCs, it will not eliminate them. Multiple requests to different rows of the same bank within any one CS will still cause PCs to occur even under a closed page policy.
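A minimal sketch of the closed page policy shows both effects: conflicts across chip selects disappear, while conflicts within the one open CS remain. The function `run_closed_page` is hypothetical, requests are assumed to be `(cs, bank, row)` tuples, and a request to a pre-charged bank is counted as a "page miss".

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

Request = Tuple[int, int, int]  # hypothetical encoding: (cs, bank, row)


def run_closed_page(requests: Iterable[Request]) -> Counter:
    """Count access scenarios when only one chip select may be open."""
    open_cs: Optional[int] = None
    active: Dict[int, int] = {}  # bank -> open row, within the open CS only
    outcomes: Counter = Counter()
    for cs, bank, row in requests:
        if cs != open_cs:
            # A request to a different CS closes the open one: the pre-charge
            # command closes every active page in its row buffers.
            active = {}
            open_cs = cs
        current = active.get(bank)
        if current is None:
            outcomes["page miss"] += 1      # assumption: bank is pre-charged
        elif current == row:
            outcomes["page hit"] += 1
        else:
            outcomes["page conflict"] += 1  # still possible inside one CS
        active[bank] = row
    return outcomes


if __name__ == "__main__":
    across_cs = [(0, 0, 5), (1, 0, 7), (0, 0, 6), (1, 0, 8)]
    within_cs = [(0, 0, 5), (0, 0, 6), (0, 0, 7)]
    print("across chip selects:", dict(run_closed_page(across_cs)))
    print("within one chip select:", dict(run_closed_page(within_cs)))
```

In the first stream every switch of CS leaves the target bank pre-charged, so no page conflict is recorded. In the second stream the requests stay in one bank of one CS, and the conflicts persist.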
Accordingly, there is a need to optimize and enhance the performance of memory accesses by reducing the occurrence of page conflicts when there are multiple requests to different rows within the same bank of a particular CS, and by reducing the latency penalty of page conflicts when they do occur.
The problems outlined above are solved by an apparatus and method for optimizing memory requests to a computer memory according to the present invention. In one aspect of the invention, there is provided a memory controller for controlling requests to a computer memory wherein the computer memory is divided into at least one chip select coupled to the memory controller, where the chip select is further divided into banks and further wherein the memory requests include a first request directed to a first one of the banks. This apparatus includes a request dispatcher coupled to the memory controller which is operative to transmit the first request to the memory controller and a request acceptance indicator coupled to the memory controller which is operative to indicate that the first request has been accepted. Further, the apparatus includes a request size calculator coupled to the request dispatcher which is operative to indicate that the first request is an eight quadword access, a table which stores a plurality of data entries representing currently active banks and corresponding active rows and a bank comparator coupled to the request dispatcher and the table which determines that the first request is a page hit or page miss. The apparatus also includes an active bank selector coupled to the table which is operative to select a second bank from the plurality of data entries and a pre-charge generator responsive to the request acceptance indicator, the request size calculator, the first and second bank comparators and the active bank selector. The pre-charge generator generates a pre-charge operation to the second bank one clock cycle after the first request has been accepted by the main memory when the second bank is active and is different from the first bank and when the first request is an eight quadword request and a page hit or page miss.
The present invention further contemplates a method for optimizing memory requests to a computer memory wherein the computer memory is divided into at least one chip select coupled to a memory controller, the chip select being further divided into banks, and further wherein there is a first memory request directed to a first one of the banks. The method comprises the steps of: sending the first request to the memory controller and determining that the memory controller has accepted it; determining that the first request is an eight quadword page hit or page miss; determining that there is a second bank active; and generating a pre-charge operation to the second bank one clock cycle after the first request has been accepted by the main memory when the second bank is active, when said second bank is a different bank than the first bank, and when said first request is an eight quadword request and a page hit or page miss.
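The decision made by the method can be sketched in software, although the invention describes hardware. Everything named here is hypothetical: `Request`, `select_second_bank` and `precharge_command` are illustrative, the table is modelled as a mapping from active bank to active row within one chip select, and the rule for choosing the second bank (lowest-numbered other active bank) is an assumption, since the text does not say how the selector chooses.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Request:
    bank: int       # first bank, targeted by the request
    row: int
    quadwords: int  # request size in quadwords


def select_second_bank(active_rows: Dict[int, int], first_bank: int) -> Optional[int]:
    """Assumed selection rule: lowest-numbered active bank other than the first."""
    for bank in sorted(active_rows):
        if bank != first_bank:
            return bank
    return None


def precharge_command(
    req: Request, active_rows: Dict[int, int], accept_cycle: Optional[int]
) -> Optional[Tuple[int, int]]:
    """Return (bank, cycle) for the pre-charge, or None if none is generated.

    accept_cycle is the clock cycle in which the request was accepted,
    or None if it has not been accepted.
    """
    if accept_cycle is None or req.quadwords != 8:
        return None
    current = active_rows.get(req.bank)
    if current is not None and current != req.row:
        return None  # page conflict: neither a page hit nor a page miss
    second = select_second_bank(active_rows, req.bank)
    if second is None:
        return None  # no other active bank to pre-charge
    return (second, accept_cycle + 1)  # one clock cycle after acceptance


if __name__ == "__main__":
    table = {0: 5, 1: 9}  # bank -> active row
    print(precharge_command(Request(bank=0, row=5, quadwords=8), table, 10))
```

The sketch returns the target bank together with the cycle in which the pre-charge would issue, so that the "one clock cycle after acceptance" condition is visible in the result.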
As a result of this invention, memory accesses to the SDRAM main memory are optimized and the memory performance is increased by reducing the occurrence of page conflicts when there are multiple requests to different rows within the same bank of a particular chip select and reducing the latency/penalty following page conflicts when they do occur.