Cache memories are small, high-speed stores that are frequently included in the CPU architectures of data processing systems. The storage unit of a cache is called a line; each line holds a consecutive segment of data from memory. When the CPU references a piece of data, the cache is searched for the line containing it. If the line is already in the cache, the data is sent immediately to the CPU; otherwise the whole line is loaded from memory into the cache. By automatically retaining recently used lines in the cache, the entire memory system of a computer can be made to appear as fast as the cache. An important measure of the performance of a cache memory is the Buffer Hit Ratio (BHR): the percentage of memory accesses that are satisfied by the cache without having to access slower main memory. Alternatively, one can examine the BHR's complement, the Buffer Miss Ratio (BMR). Cache performance depends on the application code being run. In particular, the better the code exhibits "spatial locality", i.e., the more its references fall on closely spaced elements of its address space, the higher the BHR (and the lower the BMR) that will be achieved.
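The hit/miss behavior and the BHR just described can be sketched with a minimal cache model. The parameters (a 64-byte line, 128 line slots) and the direct-mapped placement are illustrative assumptions, not the invention's design:

```python
LINE_SIZE = 64        # bytes per cache line (hypothetical)
NUM_LINES = 128       # line slots in the cache (hypothetical)

def hit_ratio(addresses):
    """Return the fraction of accesses satisfied by the cache (the BHR)."""
    cache = [None] * NUM_LINES          # tag stored per line slot
    hits = 0
    for addr in addresses:
        line_no = addr // LINE_SIZE     # which memory line holds this byte
        slot = line_no % NUM_LINES      # where that line may reside
        tag = line_no // NUM_LINES
        if cache[slot] == tag:
            hits += 1                   # line already present: a hit
        else:
            cache[slot] = tag           # miss: load the whole line
    return hits / len(addresses)

# Sequential accesses exhibit strong spatial locality: each 64-byte line
# is loaded once and then hit on the next 63 byte accesses.
print(hit_ratio(list(range(8192))))     # → 0.984375, i.e., 63/64
```

As the example suggests, a byte-by-byte sequential scan misses only once per line, so the BHR approaches 1 - 1/LINE_SIZE.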
Since a cache can contain thousands of lines, it is very often logically organized as a two-dimensional store of rows and columns in order to reduce its search time. In such a case, cache accesses are memory-mapped: each consecutive segment of memory data that makes up a cache line is assigned uniquely to a row, and each row has its own independent logic for controlling line replacement. These rows, which are called congruence classes, allow any cache line to be accessed in a fixed amount of time. The disadvantage is that when the congruence classes are not evenly utilized, as a result of poor spatial locality in the running program, the hit ratio of the program can decrease substantially. This is very likely the result of two factors. First, the number of congruence classes in a cache is always a power of two, e.g., 128. Second, a running program usually accesses its data, if not consecutively, in a stride that is a multiple of two. These two factors can cause a small subset of the congruence classes in a cache to be heavily used while many others remain unused. The purpose of the present invention is to alleviate this problem.
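The imbalance described above can be demonstrated with a short sketch, again assuming hypothetical parameters (64-byte lines, 128 congruence classes) and the conventional modulo mapping of line number to congruence class:

```python
LINE_SIZE = 64        # bytes per cache line (hypothetical)
NUM_CLASSES = 128     # number of congruence classes, a power of two

def classes_used(addresses):
    """Return the set of congruence classes touched by an access stream."""
    return {(addr // LINE_SIZE) % NUM_CLASSES for addr in addresses}

# Consecutive lines spread evenly over all 128 congruence classes...
consecutive = [i * LINE_SIZE for i in range(1024)]
print(len(classes_used(consecutive)))   # → 128

# ...but a 4096-byte stride (a multiple of two) maps every access into
# only 2 of the 128 classes, leaving the other 126 unused.
strided = [i * 4096 for i in range(1024)]
print(len(classes_used(strided)))       # → 2
```

With the strided stream, all replacement activity is concentrated in two congruence classes, which is precisely the under-utilization the invention seeks to alleviate.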
A good survey on the organization, operation and design of cache memories can be found in A. J. Smith, "Cache Memories", ACM Computer Surveys, Vol. 14, 1982, pp. 473-530.
The present invention applies the pseudo-randomization permutation scheme disclosed in the previously referenced, concurrently filed U.S. application Ser. No. 07/114,909 to the problem of deciding where in a set-associative cache to place data referenced by the code.
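The permutation scheme of Ser. No. 07/114,909 is not reproduced here. As a generic illustration only of what randomizing the congruence-class selection can accomplish, the following sketch XOR-folds higher-order address bits into the class index (a common hashing device, not the invention's scheme; all parameters are hypothetical):

```python
LINE_SIZE = 64        # bytes per cache line (hypothetical)
NUM_CLASSES = 128     # number of congruence classes, a power of two

def plain_class(addr):
    """Conventional modulo mapping of a memory line to a class."""
    return (addr // LINE_SIZE) % NUM_CLASSES

def randomized_class(addr):
    """Generic index randomization: fold the tag bits into the index.

    For any fixed tag this is a bijection on the index, so lookup
    remains deterministic, but strided streams now scatter.
    """
    line_no = addr // LINE_SIZE
    index = line_no % NUM_CLASSES
    tag = line_no // NUM_CLASSES
    return index ^ (tag % NUM_CLASSES)

strided = [i * 4096 for i in range(1024)]
print(len({plain_class(a) for a in strided}))       # → 2 classes
print(len({randomized_class(a) for a in strided}))  # → 128 classes
```

Under the plain mapping the power-of-two stride concentrates on 2 classes; the randomized index distributes the same stream over all 128, which is the kind of even utilization the referenced permutation scheme is directed at.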
The present invention differs from that disclosed in the above-referenced copending application in three principal ways. First, the present scheme applies to a cache store that is a single physical device, not multiple physical devices as in the previously referenced application. Second, while one of the main purposes of that application is to randomize accesses from a plurality of sources, such as CPUs, among a plurality of devices, the primary purpose of the present invention is to randomize accesses from one CPU over its own cache space. Finally, the directory look-up to which the overall pseudo-randomization scheme is applied herein involves a key and an identifier (ID) for every cache access. This is specific to cache storage and does not apply to memory or I/O devices.