The present invention relates to data processing systems with memory subsystems. More particularly, the present invention relates to controlling requests to memory subsystems so as to maximize bandwidth and concurrency, thereby increasing overall memory subsystem and data processing system speed.
In modern data processing systems, the speed of memory subsystems can be a major limiting factor on overall system speed. The memory bottleneck exists because a memory access is typically much slower than the speed at which computer processors and data buses can generate and convey memory access requests. The slow speed of memory access is particularly felt when there is a read request, as opposed to a write request, because a read request indicates that a requesting processor may be waiting for data.
The bottleneck caused by low memory speed becomes even more severe as the speed of computer processors increases at a faster rate than the speed of common memory components. The memory bottleneck is also exacerbated as computer system and network architectures are introduced that contain multiple processors which share a memory subsystem.
One conventional approach to alleviate the memory bottleneck is to use data caching, perhaps at various levels within the data processing system. For example, portions of data in a slow, cheap disk memory subsystem may be copied, or “cached,” into a faster system RAM (random access memory) subsystem. Portions of data in system RAM may in turn be cached into a “second-level” cache RAM subsystem containing a small amount of expensive, even faster RAM. Portions of data may also be cached into yet faster “first-level” cache memory which may reside on the same chip as a processor. Data caching is a powerful technique to minimize accesses to slower memory. However, at some point, the various levels of memory still need to be accessed. Therefore, whether or not caching is employed, techniques to speed up memory access are still needed.
Attempts to speed up memory access have included the organizing of memory into multiple banks. Under this memory architecture, as a first bank of memory is busy servicing a request to access a memory location in the first bank, a second, available bank can begin servicing the next memory access request if the next request targets a memory location in the second bank. Memory locations may be interleaved among the banks, so that contiguous memory addresses, which are likely to be accessed sequentially, are in different banks.
A problem with the conventional use of memory banks is that successive access requests will still sometimes target addresses within a common bank, even if addresses are interleaved among the banks. In this situation, a conventional memory subsystem must still wait for the common bank to become available before the memory subsystem can begin servicing the second and any subsequent requests. Such a forced wait is wasteful if a subsequent third access request could otherwise have begun to be serviced because the third request targets a different, available memory bank. Furthermore, merely organizing memory into interleaved banks does not address the extra urgency that read requests have versus write requests, as discussed above.
What is needed in the art is a way to control access to memory subsystems so as to maximize bandwidth and concurrency by minimizing the amount of time that memory requests must wait to be serviced. In particular, a way is needed to allow a memory subsystem to begin servicing a request to access an available memory location even if a preceding request cannot yet be serviced because the preceding request targets an unavailable memory location. Furthermore, a way is needed to give extra priority to read requests, which are more important than write requests, especially in “posted-write” systems in which processors need not wait for a memory write to fully complete before proceeding to the next task.