The development of data processing systems has brought with it the demand for higher speed computers such that these computers can access, process, and output data with greater proficiency. Modern day computer systems frequently comprise a central processing unit (CPU) and a memory hierarchy including a relatively large, albeit slow, main memory module and a smaller, but faster, cache memory. In such systems, the cache memory is physically situated between the central processing unit and the main memory module, as a temporary storage device for current data and instructions being processed by the central processing unit. The use of a relatively fast cache memory device as a temporary storage medium allows for an overall increase in computer system speed.
The use of a cache memory is based upon the principles of temporal locality and spatial locality. More specifically, when a CPU is accessing data and instructions from a particular space within physical memory, it will most probably continue, for a certain period of time, to access data and instructions from that same space (temporal locality) and from contiguous space (spatial locality). Accordingly, the data blocks within the contiguous space of physical memory where the data being utilized by the central processing unit reside are placed in the cache memory, greatly decreasing the time required to fetch frequently referenced data items and instructions within such data blocks.
Accessing data in a memory has been a notorious source of computer latency, the extent of which depends upon the type of memory employed. The inherent latency of memory systems results from the process of indexing a particular data item within a data block within a memory system, and then accessing that same data item when it is required by the system.
A common method of accessing a particular data item within a data block in a cache memory has been through a direct-mapped cache memory system, wherein each particular data item stored in the cache memory is located by an index comprising a predetermined number of bits of its main memory address (usually some set of low order bits). Accordingly, when a particular data item is required for processing, the index is used to fetch the data item from the cache memory.
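As a minimal sketch of the indexing described above, the following Python fragment splits a main memory address into a tag, an index, and a block offset, and uses the index to select a line in a direct-mapped cache. The field widths (4 offset bits, 8 index bits) and the names `split_address` and `DirectMappedCache` are illustrative assumptions, not values taken from any particular system.

```python
OFFSET_BITS = 4   # assumed: 16-byte data blocks
INDEX_BITS = 8    # assumed: 256 cache lines


def split_address(addr):
    """Return (tag, index, offset) for a main memory address."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


class DirectMappedCache:
    def __init__(self):
        # One (tag, block) slot per index; None means the line is empty.
        self.lines = [None] * (1 << INDEX_BITS)

    def fill(self, addr, block):
        tag, index, _ = split_address(addr)
        # Whatever was stored at this index is overwritten.
        self.lines[index] = (tag, block)

    def lookup(self, addr):
        """Return the cached block on a hit, or None on a miss."""
        tag, index, _ = split_address(addr)
        entry = self.lines[index]
        if entry is not None and entry[0] == tag:
            return entry[1]
        return None
```

In this sketch, two addresses that share the same index bits but differ in their tags compete for the same line, so filling one evicts the other.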
An alternative to the direct-mapped system for a computer cache memory is a set-associative cache memory system, which comprises a set of cache data RAMs for data storage and a corresponding set of tag RAMs for storage of tags corresponding to the main memory addresses of the data items stored in the data RAMs.
A particular data item can be stored in any one of the set of data RAMs. Each data RAM is paired with one of the tag RAMs for storage of the tags corresponding to the main memory addresses of the data items stored in the respective data RAM. The location of the particular data item within a data RAM is identified by an index derived from the data item's main memory address, as in a direct-mapped cache.
When the computer system wants to fetch the particular data item, the index is input into each data RAM/tag RAM pair. Each data RAM/tag RAM pair outputs a data item and its respective tag. At the same time, the tag of the main memory address for the particular data item to be fetched is input to comparison logic for comparison with each of the tags output by the tag RAMs. Assuming that the data item to be fetched is in one of the data RAMs, the tag output by the tag RAM paired to the data RAM where the particular data item resides will match the tag input to the comparison logic, and the comparison logic will output the data item from that data RAM.
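The fetch sequence just described can be sketched in Python as follows. This is only a software model under stated assumptions: the class name `SetAssociativeCache`, the default of two ways, and the field widths are hypothetical, and real hardware reads all data RAM/tag RAM pairs in parallel rather than in a loop.

```python
class SetAssociativeCache:
    def __init__(self, ways=2, index_bits=8, offset_bits=4):
        self.ways = ways
        self.index_bits = index_bits
        self.offset_bits = offset_bits
        entries = 1 << index_bits
        # One tag RAM and one data RAM per way, each addressed by the index.
        self.tag_rams = [[None] * entries for _ in range(ways)]
        self.data_rams = [[None] * entries for _ in range(ways)]

    def _split(self, addr):
        index = (addr >> self.offset_bits) & ((1 << self.index_bits) - 1)
        tag = addr >> (self.offset_bits + self.index_bits)
        return tag, index

    def fill(self, addr, block, way):
        """Store a block in the chosen data RAM and its tag in the paired tag RAM."""
        tag, index = self._split(addr)
        self.tag_rams[way][index] = tag
        self.data_rams[way][index] = block

    def lookup(self, addr):
        """Return the block on a hit, or None on a miss."""
        tag, index = self._split(addr)
        # Every data RAM/tag RAM pair is read with the same index.
        candidates = [
            (self.tag_rams[w][index], self.data_rams[w][index])
            for w in range(self.ways)
        ]
        # The comparison logic selects the data item whose stored tag matches.
        for stored_tag, block in candidates:
            if stored_tag is not None and stored_tag == tag:
                return block
        return None
```

Note that the data item cannot be returned until the tag comparison has identified the matching way, which models the selection step the text describes.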
Each of the known cache memory systems has specific benefits and known disadvantages. For example, a direct-mapped system for a computer cache memory is known to be relatively fast in fetching data corresponding to a specific main memory address. Although a direct-mapped system includes comparison logic to determine whether the data item selected is contained in the cache, such comparison logic compares only a single address to a single tag. Thus, the data item is available for use by the CPU prior to completion of the tag comparison, making the direct-mapped system faster than a set-associative system. A direct-mapped system will, however, always overwrite the resident data when new data having the same index is stored, resulting in a lower hit rate for data fetches from cache memory.
A set-associative cache memory system, conversely, has a higher hit rate because each data block, containing data items, stored in the set-associative cache can be placed in any one of a set of data RAMs, and a replacement algorithm can be used to make certain that subsequent data blocks having the same index are placed in locations not recently accessed, or even in a random location. However, the need to wait for the comparison logic to determine which one of the set of data RAMs contains a particular data item makes the set-associative cache memory system relatively slow compared to a direct-mapped system.
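The hit-rate difference can be illustrated with a small simulation. The sketch below is an assumption-laden toy model, not a measurement of any real system: the function `hit_rate`, the 16-byte block size, the least-recently-used replacement policy, and the two-address trace are all chosen for illustration. It compares a direct-mapped cache (one way) with a two-way set-associative cache of the same total capacity on a trace that alternates between two addresses sharing an index.

```python
from collections import OrderedDict

OFFSET_BITS = 4  # assumed: 16-byte data blocks


def hit_rate(trace, num_sets, ways):
    """Fraction of accesses in `trace` that hit; ways=1 models direct mapping."""
    # Each set maps tag -> None, ordered from least to most recently used.
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for addr in trace:
        block = addr >> OFFSET_BITS
        index = block % num_sets
        tag = block // num_sets
        lines = sets[index]
        if tag in lines:
            hits += 1
            lines.move_to_end(tag)         # mark as most recently used
        else:
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used
            lines[tag] = None
    return hits / len(trace)


# Two addresses with the same index, accessed alternately ten times each.
trace = [0x12345, 0x22345] * 10
direct = hit_rate(trace, num_sets=256, ways=1)
two_way = hit_rate(trace, num_sets=128, ways=2)
```

On this particular trace the direct-mapped configuration never hits, because each access evicts the block the next access needs, while the two-way configuration misses only on the first reference to each address (a hit rate of 0.9). Real workloads fall between these extremes.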
Apart from the delay introduced by its comparison logic, a further problem with the set-associative scheme is that the tag RAMs must be located near the cache data RAMs in order to select the correct data from the possibilities supplied by the set of data RAMs. Single chip microprocessors employing on-board RAM are disadvantaged by this proximity requirement, as valuable chip area is needed by the tag RAMs. Such chip area could otherwise be utilized to implement a larger data cache, thereby increasing the amount of data that can be stored for improved cache performance.
With regard to the speed and performance of the computer cache memory, it is desirable to achieve a system configuration which combines the speed of the direct-mapped system with the high hit rate obtainable through the use of a set-associative cache memory system.