The invention is generally related to computers and other data processing systems, and in particular to the scheduling of transactions between source and destination units in a data processing system.
Computer technology continues to advance at a remarkable pace, with numerous improvements being made to the performance of both microprocessors (the "brains" of a computer) and the memory that stores the information processed by a computer.
In general, a microprocessor operates by executing a sequence of instructions that form a computer program. The instructions are typically stored in a memory system having a plurality of storage locations identified by unique memory addresses. The memory addresses collectively define a "memory address space," representing the addressable range of memory addresses that can be accessed by a microprocessor.
Both the instructions forming a computer program and the data operated upon by those instructions are often stored in a memory system and retrieved as necessary by the microprocessor when executing the computer program. The speed of microprocessors, however, has increased relative to that of memory devices to the extent that retrieving instructions and data from a memory can often become a significant bottleneck on performance. To reduce this bottleneck, it is desirable to use the fastest available memory devices, e.g., static random access memory (SRAM) devices or the like. However, both memory speed and memory capacity are typically directly related to cost, and as a result, many computer designs must balance memory speed and capacity with cost.
A predominant manner of obtaining such a balance is to use multiple "levels" of memories in a memory system to attempt to decrease costs with minimal impact on system performance. Often, a computer relies on a relatively large, slow and inexpensive mass storage system such as a hard disk drive or other external storage device, an intermediate main memory that uses dynamic random access memory devices (DRAMs) or other volatile memory storage devices, and one or more high speed, limited capacity cache memories, or caches, implemented with SRAMs or the like. One or more memory controllers are then used to swap the information from segments of memory addresses, often known as "cache lines", between the various memory levels to attempt to maximize the frequency that requested memory addresses are stored in the fastest cache memory accessible by the microprocessor. Whenever a memory access request attempts to access a memory address that is not cached in a cache memory, a "cache miss" occurs. As a result of a cache miss, the cache line for a memory address typically must be retrieved from a relatively slow, lower level memory, often with a significant performance hit.
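By way of illustration only, the relationship between memory addresses, cache lines, and cache misses may be sketched as follows. The class name `TinyCache`, the 64-byte line size, and the dictionary standing in for the lower level memory are assumptions made for this sketch and are not taken from the disclosure.

```python
LINE_SIZE = 64  # bytes per cache line (assumed value for illustration)


class TinyCache:
    """Holds whole cache lines copied from a slower, lower level memory."""

    def __init__(self, backing):
        self.backing = backing  # dict: line number -> bytes of that line
        self.lines = {}         # cached copies, keyed by line number
        self.misses = 0

    def read(self, address):
        line_no = address // LINE_SIZE
        if line_no not in self.lines:
            # Cache miss: the whole line is retrieved from the lower level.
            self.misses += 1
            self.lines[line_no] = self.backing[line_no]
        # Return the byte at this address within its cache line.
        return self.lines[line_no][address % LINE_SIZE]


memory = {0: bytes(range(64)), 1: bytes(range(64, 128))}
cache = TinyCache(memory)
cache.read(3)    # miss: line 0 is fetched
cache.read(10)   # hit: address 10 lies in the same line
cache.read(70)   # miss: line 1 is fetched
```

In this sketch, only the first access to each line incurs the slower retrieval; later accesses to any address within the same line are served from the cache.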
Another manner of increasing computer performance is to use multiple microprocessors operating in parallel with one another to perform different tasks at the same time. Often, the multiple microprocessors share at least a portion of the same memory system to permit the microprocessors to work together to perform more complex tasks. The multiple microprocessors are typically coupled to one another and to the shared memory by a system bus or other like interconnection network. By sharing the same memory system, however, a concern arises as to maintaining "coherence" between the various memory levels in the shared memory system, that is, ensuring that there are not multiple modified copies of any particular data in the system.
For example, in a given multi-processor environment, each microprocessor may have one or more dedicated cache memories that are accessible only by that microprocessor, e.g., level one (L1) data and/or instruction cache, a level two (L2) cache, and/or one or more buffers such as a line fill buffer and/or a transition buffer. Moreover, more than one microprocessor may share certain caches as well. As a result, any given memory address may be stored from time to time in any number of components in the shared memory system.
Coherency is maintained in many systems by maintaining "state" information that indicates the status of the data stored in different components of a system. Often, this information is stored locally with each component. Furthermore, to reduce the amount of state information, multiple memory addresses are often grouped together into lines or blocks having a common state.
As an example, many systems utilize a MESI coherence protocol that tags data stored in a component as one of four states: Modified, Exclusive, Shared, or Invalid. The Modified state indicates that valid data for a particular group of memory addresses is stored in the component, and the component has the most recent copy of the data, i.e., all other copies, if any, are no longer valid. The Exclusive state indicates that valid data for a particular group of memory addresses is stored solely in the component, but the data has not been modified relative to the copy in the shared memory. The Shared state indicates that the valid data for a particular group of memory addresses is stored in the component, but that other valid copies of the data also exist in other components, including the main memory. The Invalid state indicates that no valid data for a particular group of memory addresses is stored in the component, although valid data may be stored in the main memory.
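The four states described above may be summarized, purely as an illustrative sketch, by the following enumeration. The helper names `holds_valid_data` and `differs_from_memory` are hypothetical and merely restate the descriptions given above.

```python
from enum import Enum


class MESI(Enum):
    MODIFIED = "M"   # valid, most recent copy; other copies are no longer valid
    EXCLUSIVE = "E"  # valid, held solely here, unmodified relative to memory
    SHARED = "S"     # valid, and other valid copies exist elsewhere
    INVALID = "I"    # no valid data held in this component


def holds_valid_data(state):
    """True for every state except Invalid."""
    return state is not MESI.INVALID


def differs_from_memory(state):
    """True only for Modified, where this component has the most recent copy."""
    return state is MESI.MODIFIED
```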
In many conventional implementations, accesses to memory addresses in a shared memory system are handled via transactions, which are typically packets of information transmitted from a source unit to a destination unit to perform a predetermined operation. As one example, separate request and response transactions may be used to maintain cache coherency and initiate the transfer of data between the different components in a system. A request transaction may be initiated by a source unit such as a microprocessor to request an access to data stored at a particular memory address, e.g., a load or read request or a store or write request. One or more destination units, e.g., another microprocessor, a cache and/or a system bus interface unit, receive and process the request. Each destination unit then functions as a source unit by issuing a response transaction back to the original source unit, typically indicating, based upon the state information for the requested memory address, whether or not the requested data is allocated to that unit. Also, if the requested data is allocated to that unit, the data is typically returned to the requesting unit in the response transaction. Furthermore, often the state information for each component in the system is updated in response to the operation being performed.
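A request transaction and its response, as described above, might be modeled along the following lines. This is a minimal sketch under stated assumptions: the class names `Request`, `Response`, and `Unit`, their fields, and the single-letter state codes are illustrative only, and the update of state information that often accompanies an operation is deliberately omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    source: str    # unit issuing the request
    address: int   # memory address being accessed
    kind: str      # "load" or "store"


@dataclass
class Response:
    source: str               # unit issuing the response
    destination: str          # the original requesting unit
    address: int
    allocated: bool           # whether the responder holds the data
    data: Optional[int] = None


class Unit:
    """A destination unit that answers requests from its local state."""

    def __init__(self, name):
        self.name = name
        self.lines = {}  # address -> (state, data); state is "M", "E", "S" or "I"

    def handle(self, req):
        state, data = self.lines.get(req.address, ("I", None))
        allocated = state != "I"
        # The destination now acts as a source, replying to the requester
        # and returning the data only if it is allocated to this unit.
        return Response(self.name, req.source, req.address,
                        allocated, data if allocated else None)


cache_a, cache_b = Unit("cache_a"), Unit("cache_b")
cache_b.lines[0x40] = ("M", 7)
req = Request(source="cpu0", address=0x40, kind="load")
responses = [unit.handle(req) for unit in (cache_a, cache_b)]
```

Here each destination unit replies to the original source, and only the unit that holds the line returns the data.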
One difficulty that arises with transaction-based shared memory systems is that with multiple source and destination units, multiple transactions may need to be transmitted and processed at any given time across the interface between the different units. As a result, some mechanism to schedule transactions is typically required.
Conventional scheduling mechanisms typically implement some form of fairness algorithm, e.g., where transactions are transmitted and processed on a first-come, first-served basis, and where transactions that arrive at the same time are scheduled in a round-robin or random fashion. No explicit prioritization, except temporal, is typically utilized in scheduling transactions.
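One possible reading of such a fairness algorithm is sketched below: transactions are ordered by arrival time, and ties are broken by a round-robin pointer over the source units. The function name `fair_order` and the tuple layout are assumptions made for this sketch; conventional schedulers vary in their details.

```python
def fair_order(transactions, sources):
    """Return transaction labels in first-come, first-served order.

    transactions: list of (arrival_time, source, label) tuples.
    sources: list of source unit names, in round-robin order.
    """
    pending = list(transactions)
    order = []
    next_rr = 0  # index of the source that wins the next tie
    while pending:
        earliest = min(t[0] for t in pending)
        tied = [t for t in pending if t[0] == earliest]
        # Among simultaneous arrivals, choose the source closest to the
        # round-robin pointer (cyclic distance 0 wins).
        chosen = min(
            tied,
            key=lambda t: (sources.index(t[1]) - next_rr) % len(sources),
        )
        pending.remove(chosen)
        order.append(chosen[2])
        next_rr = (sources.index(chosen[1]) + 1) % len(sources)
    return order


order = fair_order(
    [(0, "B", "b0"), (0, "A", "a0"), (1, "A", "a1"), (1, "C", "c1")],
    ["A", "B", "C"],
)
```

Note that nothing in this ordering depends on the state of the data involved; only arrival time and source position are considered.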
While a purely fair algorithm ensures that all transactions are eventually handled in a shared memory system, in many instances such an algorithm offers only moderate performance. As a result, a need has arisen for an improved scheduling algorithm that offers improved performance over conventional implementations.
The invention addresses these and other problems associated with the prior art by providing a data processing system, circuit arrangement, and method that rely on state information to prioritize certain transactions relative to other transactions when scheduling transactions in a data processing system. Based upon the particular implementation, prioritizing a particular type of transaction associated with a particular state relative to different transactions can reduce latency relative to simple fairness algorithms, thereby improving overall system performance.
A transaction scheduler consistent with the invention is configured to schedule the transmission of a first transaction from a source unit to a destination unit by prioritizing the first transaction relative to a second transaction based upon state information associated with at least one of the first and second transactions. The state information can be associated with a particular transaction based upon the current state of the data that is the focus of the transaction, and/or based upon the future state of the data that is to occur as a result of the transaction. Furthermore, in some implementations, the state information need not be the sole factor considered by a transaction scheduler. Instead, additional considerations such as fairness may also be considered, e.g., to ensure forward progress of all transactions.
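A scheduler of this general kind might be sketched as follows, assuming for illustration that transactions associated with the modified state are the ones prioritized. The class name `StatePriorityScheduler`, the `max_bypass` fairness bound, and the two-queue structure are hypothetical choices for this sketch and are not asserted to be the disclosed implementation.

```python
from collections import deque


class StatePriorityScheduler:
    """Prefers modified-state transactions, with a bound to ensure progress."""

    def __init__(self, max_bypass=2):
        self.modified = deque()       # transactions tagged with state "M"
        self.other = deque()          # all remaining transactions
        self.max_bypass = max_bypass  # hypothetical fairness bound
        self.bypassed = 0             # times a waiting transaction was passed over

    def submit(self, label, state):
        (self.modified if state == "M" else self.other).append(label)

    def pick_next(self):
        starving = len(self.other) > 0 and self.bypassed >= self.max_bypass
        if self.modified and not starving:
            if self.other:
                self.bypassed += 1  # a non-modified transaction was passed over
            return self.modified.popleft()
        if self.other:
            self.bypassed = 0
            return self.other.popleft()
        return None  # nothing pending


sched = StatePriorityScheduler(max_bypass=2)
for label, state in [("s1", "S"), ("m1", "M"), ("m2", "M"),
                     ("m3", "M"), ("e1", "E")]:
    sched.submit(label, state)
order = [sched.pick_next() for _ in range(5)]
```

In this sketch, state information drives the choice, while the bypass counter supplies the additional fairness consideration so that a waiting non-modified transaction is not passed over indefinitely.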
State-based transaction scheduling consistent with the invention may be utilized in a number of applications. For example, it has been found that in many shared memory systems, cached data having a modified state is accessed more frequently than cached data having a non-modified state, e.g., an exclusive or shared state. As a result, by prioritizing transactions associated with modified cached data, the more frequent modified transactions are made more readily available, which often results in such transactions being handled more quickly and with reduced latency. Although such prioritization may also result in an increase in the latency for less frequent non-modified transactions, the overall transaction latency for the system is typically reduced due to the greater frequency of the prioritized modified transactions. Other potential applications for state-based transaction scheduling will become more apparent from a reading of the disclosure presented herein.
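The reasoning above can be made concrete with a small worked calculation. All numbers below are invented solely for illustration and do not come from the disclosure: 70 of every 100 transactions are assumed to be modified transactions, a fair scheduler is assumed to give every transaction a latency of 10 cycles, and prioritization is assumed to lower the modified latency to 8 cycles while raising the non-modified latency to 13 cycles.

```python
def mean_latency(mix):
    """Average latency for a mix given as (count, latency_cycles) pairs."""
    total = sum(count for count, _ in mix)
    return sum(count * latency for count, latency in mix) / total


# Hypothetical traffic: 70 modified and 30 non-modified transactions.
fair = mean_latency([(70, 10), (30, 10)])
prioritized = mean_latency([(70, 8), (30, 13)])
```

Under these assumed figures the average falls from 10 cycles to 9.5 cycles even though the less frequent transactions become slower. With a different traffic mix or different latencies the outcome could differ, which is consistent with the statement that overall latency is typically, rather than always, reduced.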
These and other advantages and features, which characterize the invention, are set forth in the claims annexed hereto and forming a further part hereof. However, for a better understanding of the invention, and of the advantages and objectives attained through its use, reference should be made to the Drawings, and to the accompanying descriptive matter, in which there are described exemplary embodiments of the invention.