1. Field of Invention
The present invention pertains to the field of computer systems. More particularly, this invention relates to cache management in a multi-threaded processor.
2. Art Background
Computer systems typically include a processor and a memory hierarchy. The memory hierarchy usually includes a main memory that holds instructions and data for the processor. Typically, the processor fetches instructions from the main memory, reads data associated with the instructions from the main memory, executes the instructions, and writes result data back into the main memory.
In addition, the memory hierarchy of a computer system typically includes one or more cache memories. For example, a computer system may include a primary cache which is also known as a level one (L1) cache. The primary cache is usually tightly integrated with the processor and may be contained on the same integrated circuit as the processor. A computer system may also include a secondary cache which is also known as a level two (L2) cache. The secondary cache is usually further down the memory hierarchy between the primary cache and the main memory.
A typical cache memory is a relatively small and fast memory that holds blocks of instructions and/or data obtained from the main memory. A block of instructions or data held in a cache memory may be referred to as a cache line or a data line. A cache memory usually provides a processor with relatively fast access to data lines contained therein in comparison to the time required to obtain the same data line from the main memory. As a consequence, a cache memory if managed efficiently can greatly increase the throughput of a processor by providing fast access to instructions and/or data.
Typically, a processor obtains a particular data line by issuing an address for the particular data line. The primary cache usually performs a lookup operation in response to the address to determine whether the particular data line is contained therein. If the particular data line is not contained in the primary cache, a condition known as a cache miss to the primary cache, then the address of the particular data line is propagated down to a lower level of the memory hierarchy. This usually results in a lookup operation in the secondary cache or a read memory operation in the main memory. In either case, the particular data line is eventually returned from the lower level of the memory hierarchy and it is usually placed into the primary cache. This process may be referred to as a cache fill operation and the particular data line may replace another data line already stored in the primary cache.
In addition, the processor employed in a computer system may be a multi-threaded processor. A multi-threaded processor is a processor that switches execution among multiple threads. A thread may be defined as a stream of addresses associated with the instructions and data of a particular sequence of code that has been scheduled within the processor.
One advantage of a multi-threaded processor is that it can switch threads and continue instruction execution during a long latency operation such as a cache fill operation. This usually provides an overall increase in throughput particularly when a missing data line must be obtained from the main memory.
Nevertheless, conditions may exist in a computer system having a multi-threaded processor that cause the primary cache to be largely overrun by the data lines associated with a particular thread. Such a condition may be referred to as cache pollution and may slow the execution of threads other than the particular thread.
For example, consider a multi-threaded processor that switches between threads A and B. Now consider that a cache miss to the primary cache occurs during execution of thread A, and that a cache miss to the secondary cache occurs, and that the missing data line must be obtained from the main memory during a cache fill operation. Now consider that during the cache fill operation, which typically takes a relatively long time in comparison to the speed of the primary and secondary caches, the processor begins executing thread B. Consider also that thread B happens to be associated with data lines contained in the secondary cache. Under such conditions, the primary cache can become polluted with the data lines associated with thread B as misses to the primary cache cause large numbers of data lines of the primary cache to be replaced with data lines obtained from the secondary cache during execution of thread B.
Unfortunately, this has the consequence of subsequently causing higher numbers of primary cache misses for thread A which further decreases the throughput for thread A. Moreover, prior primary caches are not managed in such a way as to avoid such cache pollution or to provide a balance between the throughput obtained by the different threads in a multi-threaded processor.
A method and apparatus is disclosed which provides a cache management policy for use with a cache memory for a multi-threaded processor. The cache memory is partitioned among a set of threads of the multi-threaded processor. When a cache miss occurs, a replacement line is selected in a partition of the cache memory which is allocated to the particular thread from which the access causing the cache miss originated, thereby preventing pollution to partitions belonging to other threads. The partitioning may be static. Alternatively, the partitioning may be dynamic and may be used to control relative throughput associated with the threads.
Other features and advantages of the present invention will be apparent from the detailed description that follows.