This invention relates generally to cache replacement algorithms in computer systems. More particularly, this invention relates to a method and apparatus for minimizing the cache miss rate as cache data entries from input/output (IO) devices replace other entries in a cache.
A cache is the name generally given to the first level of memory hierarchy in a computer system encountered by an address once it leaves the CPU of a data processor. Caches store recently-used data and instructions (xe2x80x9cdataxe2x80x9d), thereby reducing the time otherwise required to access the data from slower, lower levels of the hierarchy such as main memory or secondary storage.
Caches contain a number of block frames, each of which can store a memory block (a group of data bytes), and can be organized in a number of ways, such as direct mapped, fully associative, and set associative. In a direct-mapped cache, a memory block can only be stored in a specific block frame in the cache. In a fully associative cache, a memory block can be stored in any block frame in the cache. A set associative cache, which has features of both direct-mapped and fully associative caches, allows a memory block to be stored anywhere within a set of block frames within the cache. The number of block frames within a set are known as ways. A four-way set associative cache, for example, has four block frames per set.
Caches improve system performance because of the xe2x80x9cprinciple of locality,xe2x80x9d which holds that most computer programs do not access all data uniformly. Instead, most programs re-use the same data they have used recently. A cache stores this recently-used data in high-speed memory so it is available for re-use. The data processor then first looks to the cache for data before seeking it elsewhere. If the data sought is stored in the cache, a cache hit occurs and the data processor accesses it from the cache. If the data is not stored in the cache, a cache miss occurs and the data processor proceeds through the memory hierarchy to fetch the data from another source. In the process of fetching the data for the processor, a copy is stored in the cache for possible re-use.
A cache, however, has a finite size and at some point will completely fill with entries. On the next cache miss, the cache controller must then select an existing cache entry to be replaced by the newly sought data. The method for replacing entries is referred to as a cache replacement algorithm. Well known replacement algorithms include random and least-recently-used (LRU) methods.
While these and other known replacement algorithms work in most circumstances, they do not distinguish data transfers by requestor type (CPU or IO agent). Where such algorithms are used, they will replace currently cached non-IO data with IO data as significant IO data is transferred through the cache. The cache hit rate for non-IO requesters of the cache, such as data processors, then falls significantly. The cache, in effect, becomes unavailable to processors during the IO data transfer. For example, when a processor requests a desired block that has been replaced by an IO request, the cache has a miss. It must then replace the IO-requested block (or another) in the set to again make room for the processor-requested block. That cache miss would have been a cache hit had the IO request not earlier replaced the processor-requested block. Thus computer system performance suffers as the processors must first seek their data from elsewhere in the memory hierarchy than the cache, such as in main memory or secondary storage.
An objective of the invention, therefore, is to provide a method for replacing IO and non-IO data in a cache with a minimal reduction of system performance.
A method of replacing data entries in an associative cache with new data comprises the following steps. Data entries in the ways of the cache are checked to determine which, if any, was last accessed by an IO agent. If such a data entry is identified, the checked data entry is replaced with the new data. If no such data entry is identified, then a data entry is selected according to another cache replacement algorithm such as random or least-recently-used and the selected data entry is replaced.
In one aspect of the invention, the data entries are first checked to determine if any is invalid and, if so, the invalid entry is replaced with the new data before checking for a data entry having an IO state.
In another aspect of the invention, each data entry corresponds to a remote cache tag and the step of checking the data entry comprises checking a state tag portion of the cache tag. The state tag portion may comprise an n-bit tag.
The foregoing and other objects, features, and advantages of the invention will become more apparent from the following detailed description of a preferred embodiment which proceeds with reference to the following drawings.