1. Field of the Invention
The present invention relates to a cache memory in a computer system, and more specifically to an optimized cache tag structure associated with the amount of cache memory installed.
2. Art Background
Computer systems often utilize cache memory to store a subset of data contained in a larger main memory. The cache memory is implemented with high speed random access memories (RAMs) resulting in faster retrieval of data from the cache memory than from the system memory. However, system memory is generally not implemented entirely with the high speed memory because the high speed cache memory is more expensive than the larger main memory. Therefore, computer designers select a trade-off between the amount and the cost of the high speed cache memory.
Typically, in a cache memory system, a central processing unit (CPU) that requires data from system memory first attempts to retrieve the data from cache memory. A cache tag structure is a mechanism that keeps track of the data that is present in the cache memory. Each piece of data present in the cache memory has an address tag which uniquely identifies the data. The cache tag typically has several fields stored in RAMs: the address tag, a block valid bit and, if sub-blocks are used, sub-block valid bits. When the CPU attempts to access data in the cache memory, the stored cache tags, each of which is a portion of the main memory address of a piece of cached data, are compared with the corresponding portion of the main memory address of the requested data. The comparison is performed by presenting main memory addresses to a RAM storing the cache tags. Normally, a valid bit is used in conjunction with the cache tags. If the cache tag matches the memory address and the valid bit is set, then the tag check operation is termed a "hit" and the piece of data in the location corresponding to the tag is sent to the CPU; otherwise it is a "miss" and the main memory supplies the data.
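The tag check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the eight-entry tag RAM, the names TagEntry and tag_check, and the use of the low-order bits of a block address to select an entry are hypothetical choices and are not taken from any particular design.

```python
from dataclasses import dataclass

NUM_ENTRIES = 8  # assumed size of the tag RAM (illustrative only)


@dataclass
class TagEntry:
    address_tag: int = 0   # portion of the main memory address
    valid: bool = False    # block valid bit


def tag_check(tag_ram, block_address):
    """Return True on a "hit", False on a "miss"."""
    index = block_address % NUM_ENTRIES    # selects one entry of the tag RAM
    tag = block_address // NUM_ENTRIES     # remaining high-order address bits
    entry = tag_ram[index]
    # A hit requires both a matching address tag and a set valid bit.
    return entry.valid and entry.address_tag == tag


tag_ram = [TagEntry() for _ in range(NUM_ENTRIES)]
tag_ram[3] = TagEntry(address_tag=5, valid=True)   # caches block address 5*8 + 3 = 43

hit = tag_check(tag_ram, 43)           # same entry, same tag, valid bit set
miss_tag = tag_check(tag_ram, 11)      # same entry (3), but tag 1 differs from 5
miss_invalid = tag_check(tag_ram, 40)  # entry 0 has its valid bit clear
```

On a miss, the main memory would supply the data; that path is omitted here.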
The entire main memory address, however, need not be stored as cache tags. To minimize storage requirements, it is often desirable to reduce the number of address tag bits. For example, it is redundant to store part of the address known as the "word component selector" as part of the address tag. These bits select both a block from the cache memory and a tag from the cache tag structure. If the word component selector performs a one-to-one mapping function, it is redundant to use these bits in the comparison operation. Therefore, the word component selector bits do not need to be stored as part of the tag.
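As a rough illustration of the savings, the sketch below splits an address into an address tag, the selector bits, and an offset within a block, and counts how many bits remain to be stored as the tag. The 32-bit address, 1024 blocks and 16-byte blocks are assumed values chosen for the example, the function names are hypothetical, and both sizes are assumed to be powers of two.

```python
def tag_bits(address_bits, num_blocks, block_bytes):
    """Number of address bits that must be stored as the address tag."""
    selector_bits = (num_blocks - 1).bit_length()   # bits that select a block and its tag
    offset_bits = (block_bytes - 1).bit_length()    # bits that select a byte within the block
    return address_bits - selector_bits - offset_bits


def split_address(addr, num_blocks, block_bytes):
    """Return (address_tag, selector, offset) for a main memory address."""
    block_address = addr // block_bytes
    return block_address // num_blocks, block_address % num_blocks, addr % block_bytes


stored = tag_bits(32, 1024, 16)                    # 32 - 10 - 4 = 18 bits per tag
tag, selector, offset = split_address(0x12345678, 1024, 16)

# The selector already identifies the tag entry, so the full address can be
# rebuilt from the stored tag plus the position at which it is stored.
rebuilt = (tag * 1024 + selector) * 16 + offset
```

Under these assumptions only 18 of the 32 address bits are stored per tag; the selector bits are implied by where the tag is stored.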
In addition to storing cacheable data in blocks, consecutive pieces of cacheable data, called "sub-blocks", are often grouped into blocks that share the same tag. Sharing one tag among several sub-blocks is a common technique for reducing cache tag storage requirements. A sub-block contains the smallest piece of data that can be brought into the cache. Typically, the number of sub-blocks grouped into a block is fixed, but the number of sub-blocks present within a block at any given time may vary. A field of bits, called "sub-block valid bits", tracks which sub-blocks of the block are valid.
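One possible way to picture a shared tag with sub-block valid bits is a small record holding one address tag and a bit field. The class name BlockTag, its methods, and the choice of four sub-blocks per block are assumptions made for this sketch only.

```python
SUB_BLOCKS_PER_BLOCK = 4  # assumed fixed grouping (illustrative only)


class BlockTag:
    """One tag entry shared by all sub-blocks of a block."""

    def __init__(self, address_tag=0):
        self.address_tag = address_tag
        self.block_valid = False
        self.sub_block_valid = 0   # bit i set means sub-block i is present

    def mark_valid(self, i):
        if not 0 <= i < SUB_BLOCKS_PER_BLOCK:
            raise IndexError("sub-block index out of range")
        self.sub_block_valid |= 1 << i
        self.block_valid = True

    def is_valid(self, i):
        return self.block_valid and bool((self.sub_block_valid >> i) & 1)

    def valid_count(self):
        return bin(self.sub_block_valid).count("1")


entry = BlockTag(address_tag=0x2A)
entry.mark_valid(0)
entry.mark_valid(3)
present = [entry.is_valid(i) for i in range(SUB_BLOCKS_PER_BLOCK)]
```

Here the block always has room for four sub-blocks, while the valid bits record that only sub-blocks 0 and 3 are currently present.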
When retrieval of data by the CPU results in a cache "miss", a replacement operation is performed in which data from the main memory is placed in the cache memory. Normally, in set associative caching, the cache memory is organized so as to map several main memory locations into one cache memory location. During the replacement operation, a new sub-block is retrieved from system memory and is stored in the cache memory. The address tags are also updated during a replacement operation. Whenever a block is replaced, one sub-block is marked valid and all other sub-blocks in the block are marked "not valid". Because the old address tag is lost, the replacement operation displaces all the valid sub-blocks of the previous block at that cache location with only one new sub-block.
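The effect of a replacement on the tag entry can be sketched as follows. The dictionary layout, the function name replace_block, and the four-bit valid field are hypothetical; the sketch only models the tag bookkeeping, not the data transfer from system memory.

```python
def replace_block(entry, new_tag, sub_block):
    """Overwrite a tag entry on a miss; return how many valid sub-blocks were displaced."""
    displaced = bin(entry["sub_block_valid"]).count("1") if entry["block_valid"] else 0
    entry["address_tag"] = new_tag             # the old address tag is lost
    entry["block_valid"] = True
    entry["sub_block_valid"] = 1 << sub_block  # only the newly fetched sub-block is valid
    return displaced


# A block with tag 7 currently holds three valid sub-blocks (0, 1 and 3).
entry = {"address_tag": 7, "block_valid": True, "sub_block_valid": 0b1011}

# A miss on a different block mapping to the same location brings in sub-block 2.
displaced = replace_block(entry, new_tag=9, sub_block=2)
```

In this example three previously valid sub-blocks are displaced in exchange for a single new one.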
Since the replacement operation tends to replace many old sub-blocks with one new sub-block, system performance suffers because replacement "pollutes" the cache. The replacement operation may adversely affect performance because, by invalidating sub-blocks recently used by the processor, it displaces likely-to-be-used data in favor of unlikely-to-be-used data. Cache strategies that use fewer sub-blocks per block are less prone to "pollution" during replacement. Because there is less sharing of tags by sub-blocks, fewer sub-blocks are marked invalid during replacement. This tradeoff between the economy of sub-block shared tags and system performance degradation from cache pollution is usually optimized and fixed for a system design.
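The tradeoff can be made concrete with some simple arithmetic. The figures below are assumptions chosen for illustration (a cache of 1024 sub-blocks, an 18-bit address tag, and one block valid bit plus one valid bit per sub-block in each entry); the function names are hypothetical.

```python
def tag_storage_bits(total_sub_blocks, sub_blocks_per_block, address_tag_bits):
    """Total bits of tag storage for a cache holding total_sub_blocks sub-blocks."""
    num_tags = total_sub_blocks // sub_blocks_per_block
    # Each entry: address tag + one block valid bit + one valid bit per sub-block.
    bits_per_entry = address_tag_bits + 1 + sub_blocks_per_block
    return num_tags * bits_per_entry


def worst_case_displaced(sub_blocks_per_block):
    """Most valid sub-blocks a single replacement can invalidate."""
    return sub_blocks_per_block


for n in (1, 4, 8):
    print(n, tag_storage_bits(1024, n, 18), worst_case_displaced(n))
# prints:
# 1 20480 1
# 4 5888 4
# 8 3456 8
```

Under these assumptions, sharing a tag among eight sub-blocks cuts tag storage to roughly one sixth of the unshared case, at the cost of up to eight sub-blocks being displaced by one replacement.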