The invention is generally related to data processing systems and processors therefor, and in particular to retrieval of data from a shared memory architecture.
Computer technology continues to advance at a remarkable pace, with numerous improvements being made to the performance of both microprocessors (the "brains" of a computer) and the memory that stores the information processed by a computer. In general, a microprocessor operates by executing a sequence of instructions that form a computer program. The instructions are typically stored in a memory system having a plurality of storage locations identified by unique memory addresses. The memory addresses collectively define a "memory address space," representing the addressable range of memory addresses that can be accessed by a microprocessor.
A number of computer designs utilize multiple microprocessors operating in parallel with one another to increase overall computing performance. In a symmetric multiprocessing (SMP) environment, for example, multiple microprocessors share at least a portion of the same memory system to permit the microprocessors to work together to perform more complex tasks. The multiple microprocessors are typically coupled to one another and to the shared memory by a system bus or other like interconnection network.
Many shared memories use multiple levels and arrangements of memory sources to increase system performance in a cost-effective manner. A shared memory, for example, may utilize a relatively large, slow and inexpensive mass storage system such as a hard disk drive or other external storage device, an intermediate main memory that uses dynamic random access memory devices (DRAMs) or other volatile memory storage devices, and one or more high speed, limited capacity cache memories, or caches, implemented with static random access memory devices (SRAMs) or the like. One or more memory controllers are then used to swap the information from segments of memory addresses, often known as "cache lines", between the various memory levels to attempt to maximize the frequency that memory addresses requested by a memory requester such as a microprocessor are stored in the fastest cache memory accessible by that requester. In a typical SMP environment, for example, each microprocessor may have one or more dedicated cache memories that are accessible only by that microprocessor (e.g., level one (L1) data and/or instruction caches, and/or a level two (L2) cache), as well as one or more levels of caches and other memories that are shared with other microprocessors in the computer.
Whenever more than one microprocessor shares access to a memory, a number of concerns arise. One concern is that the status of all data stored in a shared memory is kept current throughout the memory, a process known as maintaining "coherence." Otherwise, a microprocessor might access stale data from a memory address stored in one memory source that has not been updated to reflect changes made to another copy of the same data stored in another memory source, which could lead to unpredictable results in the computer.
To maintain coherence, memory requesters such as microprocessors are granted exclusive or shared "ownership" of certain cache lines in response to requests issued to a shared memory over a system bus. Further, a process known as "address contention" is used to arbitrate between multiple requesters that attempt to access the same cache line at the same time, such that only one requester is granted access to the cache line at any given time. Requests are also tracked by a central directory or via a distributed mechanism known as "snooping" to maintain up to date information about the currency of the data stored in each memory source. With snooping, each memory source maintains local state information about what data is stored in the source and provides such state information to other sources over the system bus in the form of a response, so that the location of valid data in the shared memory address range can be ascertained.
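As a rough illustration of address contention, the following minimal sketch grants a cache line to the first requester seen on the bus and tells every later requester for the same line to retry. The first-come policy, the function name `arbitrate`, and the response strings are assumptions made for this example; the text above does not specify how arbitration is actually performed.

```python
def arbitrate(requests):
    """Toy arbitration over one batch of simultaneous requests.

    `requests` is a list of (requester_id, cache_line) tuples in bus order.
    Hypothetical policy: the first requester for a line is granted it;
    all later requesters for that same line are told to retry.
    """
    owner_of = {}      # cache_line -> requester currently granted the line
    responses = []
    for requester, line in requests:
        if line not in owner_of:
            owner_of[line] = requester
            responses.append((requester, line, "granted"))
        else:
            # Another requester already holds this line: request is denied
            # and must be reissued later.
            responses.append((requester, line, "retry"))
    return responses
```

In this sketch, requests for different cache lines never interfere with one another; only requests for a common line produce a retry.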
Memory requests issued by microprocessors, and the responses returned by the memory sources in the shared memory, can occupy an appreciable amount of bandwidth on a system bus, so much so that system performance can be hindered by an excessive number of requests being issued by microprocessors. Of particular concern are requests issued for cache lines that are currently owned by other microprocessors and requests issued for cache lines where the ownership is in transition, since such requests are initially denied, and must later be reissued should the requester still desire to access the requested cache lines. Each requester typically uses a fixed delay to determine when a denied request is reissued. With the future availability of a cache line being unknown to a requesting microprocessor, however, deciding precisely when to reissue a request can have a significant impact on system performance.
From the perspective of a requesting microprocessor, the optimum performance occurs when a request is issued as soon as possible after a requested cache line becomes available. As such, one option is to reissue requests as often as possible until the requested cache line becomes available. However, given the limited bandwidth of a system bus, repeated unsuccessful requests can reduce the amount of bandwidth available for requests for other cache lines. Thus, other microprocessors, which may not need to access the same cache line, may nonetheless be slowed by backups on the system bus. The other option is to increase the delay before a request is reissued, but doing so may increase the delay before the request is ultimately granted, thus slowing the performance of the requesting microprocessor.
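The trade-off between the two options can be made concrete with a small model. The sketch below assumes, purely for illustration, that a requester issues its first request at time 0, reissues at fixed multiples of `delay`, and is granted the line on the first request issued at or after the time the line becomes available. The function name, its parameters, and the time units are hypothetical.

```python
def fixed_delay_cost(available_at, delay):
    """Return (denied_requests, extra_latency) for a fixed reissue delay.

    Requests are issued at times 0, delay, 2*delay, ...  Every request
    issued before `available_at` is denied and consumes bus bandwidth.
    """
    if delay <= 0:
        raise ValueError("delay must be positive")
    # Number of requests issued strictly before the line becomes available
    # (integer ceiling of available_at / delay).
    denied = (available_at + delay - 1) // delay
    granted_at = denied * delay
    return denied, granted_at - available_at
```

Under these assumptions, a line that becomes available at time 100 costs 100 denied requests and no added latency with a delay of 1, but only 4 denied requests and 20 time units of added latency with a delay of 30.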
Consequently, a trade-off currently exists between minimizing the delay or latency associated with providing access to contending cache lines and maintaining suitable bandwidth on a system bus for performing operations on other cache lines in a shared memory. As such, a significant need exists for an improved manner of balancing these concerns and thereby increasing overall system performance.
The invention addresses these and other problems associated with the prior art by providing a data processing system, circuit arrangement, integrated circuit device, program product, and method that controllably vary the amount of delay before reissuing a request based upon the detection of one or more requests being issued by other requesters coupled to a shared memory.
For example, in one particularly beneficial, but not exclusive, application, the number of requests issued by other requesters for the same memory address (e.g., the same cache line) as a particular request is tracked and utilized to controllably vary the reissue delay. Moreover, while other relationships between the reissue delay and the number of such common requests (also referred to herein as "collisions") may be used, it is often desirable in general to increase the reissue delay as the number of collisions increases. Doing so permits requests associated with relatively fewer collisions to be issued more frequently than those associated with relatively more collisions. Given that requests associated with relatively fewer collisions typically have a greater likelihood of being granted (due to less address contention), greater bandwidth on the system bus is essentially reserved for such requests. A better balance between maintaining system bandwidth and minimizing individual request latencies thus results, and overall system performance is accordingly improved.
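One possible relationship between collisions and reissue delay is sketched below. The linear formula, the cap, the default constants, and the names `PendingRequest`, `observe` and `reissue_delay` are assumptions chosen for illustration; the text above requires only that the delay generally increase with the number of collisions.

```python
from dataclasses import dataclass


@dataclass
class PendingRequest:
    """A denied request waiting to be reissued (hypothetical structure)."""
    cache_line: int
    collisions: int = 0

    def observe(self, other_cache_line: int) -> None:
        # Count a collision whenever another requester's request on the
        # bus targets the same cache line as this pending request.
        if other_cache_line == self.cache_line:
            self.collisions += 1


def reissue_delay(collisions: int, base_delay: int = 4,
                  step: int = 2, max_delay: int = 64) -> int:
    """Delay (in arbitrary bus cycles) that grows linearly with collisions,
    capped at max_delay. All constants are illustrative only."""
    return min(base_delay + step * collisions, max_delay)
```

A request that has seen no collisions is reissued after the base delay, while a heavily contended request backs off toward the cap, leaving bus bandwidth for requests that are more likely to be granted.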
These and other advantages and features, which characterize the invention, are set forth in the claims annexed hereto and forming a further part hereof. However, for a better understanding of the invention, and of the advantages and objectives attained through its use, reference should be made to the Drawings, and to the accompanying descriptive matter, in which there are described exemplary embodiments of the invention.