This invention relates to cache memory and methods for accessing cache memory. More particularly, this invention relates to a hybrid cache system in which primary cache is accessed with a virtual index, secondary cache is accessed with a physical index and primary cache is maintained as a subset of secondary cache.
Memory access time is a factor which frequently limits host processor throughput. Accordingly, cache memory often is implemented. Cache memory is an independent bank of high-speed memory which enables quick access to stored contents so as to improve computer performance. Such cache memory serves as a buffer between lower-speed main memory and the host processor.
Typically, a hierarchical structure of cache memories is implemented, including, for example, primary cache and secondary cache. Primary cache typically is small and has a fast access time. Secondary cache typically is larger and slower than primary cache, but smaller and faster than main memory. Primary cache may include a primary instruction cache and a primary data cache. Primary instruction cache stores instructions from main memory, while primary data cache stores data from main memory. Secondary cache serves as a backup in the event of a cache miss (e.g., instruction/data not present) in primary cache.
The use of separate primary caches for instructions and data enables the processor to access the primary instruction cache and primary data cache in parallel, thereby further improving host processor throughput. In addition, because the locality of a set of instructions or a set of data is much higher than that of a mixture of instructions and data, separate caches may result in fewer cache misses than a single shared cache.
FIG. 1 shows a conventional cache memory format. A cache 10 is formed by several lines 12. Each line 12 includes a tag 14 and a block 16 of words 18. A cache having 1024 lines with four eight-bit words per line forms a four kilo-byte ("4 KB") cache memory. For an instruction cache, each word corresponds to an instruction, while for a data cache, each word corresponds to a data item. As described, there is a tag 14 for each line 12. Thus, there is one tag 14 for each four word block 16 in the example recited above. To access a word 18 in cache 10, an index is used to select the line and word.
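By way of illustration only, the line and word selection described for FIG. 1 may be sketched as follows. The sketch assumes a direct-mapped, byte-addressed cache using the example dimensions recited above (1024 lines, four eight-bit words per line); the names `split_address`, `NUM_LINES` and `WORDS_PER_LINE` are hypothetical and do not appear in the figure.

```python
# Example dimensions from the text: 1024 lines, four 8-bit words per line.
NUM_LINES = 1024
WORDS_PER_LINE = 4
WORD_BYTES = 1  # an eight-bit word occupies one byte

# Both counts are powers of two, so bit_length() - 1 gives log2.
WORD_BITS = WORDS_PER_LINE.bit_length() - 1   # 2 bits select the word
LINE_BITS = NUM_LINES.bit_length() - 1        # 10 bits select the line

CACHE_BYTES = NUM_LINES * WORDS_PER_LINE * WORD_BYTES  # 4096 bytes = 4 KB


def split_address(addr):
    """Split an address into (tag, line index, word index).

    Assumes a direct-mapped layout: low bits pick the word within the
    block, the next bits pick the line, and the rest form the tag.
    """
    word = addr & (WORDS_PER_LINE - 1)
    line = (addr >> WORD_BITS) & (NUM_LINES - 1)
    tag = addr >> (WORD_BITS + LINE_BITS)
    return tag, line, word
```

Under these assumptions, the index selects one of the 1024 lines and one of the four words in its block, while the remaining high-order bits are compared against the stored tag 14.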
The source of the tag and index is used to characterize a cache as either a physical cache or a virtual cache. For a physical cache, the tag and index are derived from the physical address of the contents sought. For a virtual cache, the tag and index are derived from the virtual address of the contents sought. Computer systems today are known to use physical addresses or virtual addresses. Computer systems using virtual addresses are often referred to as virtual machines and include a mapping function for mapping virtual addresses to the corresponding physical addresses. Virtual machines are known to use either physical cache or virtual cache.
FIG. 2 shows a block diagram which depicts the method of accessing a physical cache by a virtual machine. As shown, the virtual address is input to a translation look-aside buffer (TLB) 20 where it is translated into a physical address. The physical address then is parsed to form an index into the cache 22 to address the appropriate cache location. The tag for such location then is compared with another parse of the translated physical address. Accordingly, the translation and tag check are performed in series. If the tag and parse match, then there is a cache hit (e.g., cache contents are for the desired physical address). If the tag and parse do not match, then there is a cache miss (e.g., the desired physical address contents are not currently present or available in the cache). Because a physical cache on a virtual machine requires translation of the virtual address before the cache can be accessed, an undesirable delay is added to the memory access procedure. Accordingly, there is a need for a cache for a virtual machine which enables optimal memory access times.
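The serial ordering of FIG. 2 may be sketched as follows. This is a minimal model under stated assumptions, not a description of any particular hardware: a 4 KB page size, a direct-mapped cache with 10 index bits and 2 word-select bits, a TLB modeled as a dictionary from virtual page number to physical page number, and hypothetical function names `translate` and `physical_cache_lookup`.

```python
PAGE_BITS = 12    # assumed 4 KB pages
WORD_BITS = 2     # assumed four words per line
INDEX_BITS = 10   # assumed 1024 lines


def translate(tlb, vaddr):
    """Translate a virtual address using a dict that stands in for TLB 20.

    A missing entry raises KeyError, which here models a TLB miss.
    """
    vpn = vaddr >> PAGE_BITS
    page_offset = vaddr & ((1 << PAGE_BITS) - 1)
    return (tlb[vpn] << PAGE_BITS) | page_offset


def physical_cache_lookup(tlb, tags, vaddr):
    """Return True on a cache hit, False on a miss.

    tags[i] holds the tag stored for line i, or None if the line is empty.
    """
    # Step 1: translation must complete before the cache can be indexed.
    paddr = translate(tlb, vaddr)
    # Step 2: one parse of the physical address forms the index.
    index = (paddr >> WORD_BITS) & ((1 << INDEX_BITS) - 1)
    # Step 3: another parse is compared with the stored tag.
    tag = paddr >> (WORD_BITS + INDEX_BITS)
    return tags[index] == tag
```

In this model, step 2 cannot begin until step 1 returns, which corresponds to the added delay noted above.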
It also is known to perform an address translation in parallel with a cache access where the number of cache index bits is smaller than the number of page offset bits in main memory. Limiting the index to the page offset bits, however, restricts the size of the cache. Accordingly, there is a need for a cache memory system having address translation and cache access performed in parallel in which the number of cache index bits may exceed the number of page offset bits.
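The size restriction can be illustrated with a short check. The sketch below assumes that the bits used to address the cache comprise the line-select bits plus the bits selecting a byte within a line, and that all sizes are powers of two; the function names and the example sizes are hypothetical.

```python
def log2_exact(n):
    """Return log2(n) for a positive power of two, else raise ValueError."""
    if n <= 0 or n & (n - 1):
        raise ValueError("size must be a positive power of two")
    return n.bit_length() - 1


def index_fits_in_page_offset(page_bytes, line_bytes, num_lines):
    """True if every bit used to address the cache lies in the page offset.

    Page offset bits are identical in the virtual and physical address,
    so in that case the cache can be indexed while translation proceeds.
    """
    cache_addr_bits = log2_exact(line_bytes) + log2_exact(num_lines)
    return cache_addr_bits <= log2_exact(page_bytes)
```

For example, with an assumed 4096-byte page and 4-byte lines, 1024 lines fit within the 12 page offset bits, whereas 2048 lines would require a 13th bit that is subject to translation.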
With regard to a virtual cache, several problems arise which must be overcome to maintain a coherent cache system (e.g., the stored information is valid if indicated as valid). An example of a problem to be resolved is the resulting invalidation of virtual tags whenever the virtual address to physical address map changes. For a multi-process computer system, this may occur frequently. Typically, the cache is flushed in response to such a map change.
As another example, when a virtual index is used, several cache locations may correspond to the same physical address. Because virtual processes may include multiple virtual addresses which translate to the same physical location, more than one virtual index may result for the different virtual addresses. Thus, the cache may include multiple entries for the same physical location. Accordingly, management of multiple cache entries mapped to the same physical location may be necessary to maintain cache coherency.
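The multiple-entry problem may be illustrated as follows. The sketch assumes 4 KB pages and a cache with 16-byte lines and 1024 lines, so that the index extends two bits above the page offset; the page table contents and the two virtual addresses are hypothetical values chosen for illustration.

```python
PAGE_BITS = 12   # assumed 4 KB pages
LINE_BITS = 4    # assumed 16-byte lines
INDEX_BITS = 10  # assumed 1024 lines; index covers address bits 4..13

# Hypothetical mapping: virtual pages 1 and 2 share physical page 7.
page_table = {1: 7, 2: 7}


def physical_address(vaddr):
    """Map a virtual address to its physical address via page_table."""
    vpn = vaddr >> PAGE_BITS
    return (page_table[vpn] << PAGE_BITS) | (vaddr & ((1 << PAGE_BITS) - 1))


def virtual_index(vaddr):
    """Cache index taken directly from the untranslated virtual address."""
    return (vaddr >> LINE_BITS) & ((1 << INDEX_BITS) - 1)


va_a = 0x1040
va_b = 0x2040

same_location = physical_address(va_a) == physical_address(va_b)
different_lines = virtual_index(va_a) != virtual_index(va_b)
```

Here both virtual addresses translate to the same physical location, yet they select different cache lines, so the cache may hold two entries for that location.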
Additional problems for implementing a virtual cache result from the common practice of using physical addresses on local buses. To access cache from such a bus (as opposed to from the host processor) requires a physical address. For accesses which are unrelated to an access by the host processor, the bus has no information for relating the physical address to a virtual address. A reverse map of physical addresses to virtual addresses could be implemented, although such an approach would be expensive. Accordingly, there is a need for a cache format and method of access which enables the physical address on the bus to be used for accessing cache memory.
The problem with accessing cache memory over the bus is particularly significant in multi-processor systems. For such systems, a snooper device often is used to monitor the bus transactions and aid in maintaining cache coherency. Conventionally, bus transactions are monitored for such systems so as to invalidate or update cache locations, when necessary. Thus, writing to a location in one cache may require invalidating or updating a corresponding location in another cache. Failure to maintain cache coherency may result in a processor accessing stale data in its local cache after the corresponding location in main memory has been overwritten.
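The invalidation behavior described above may be sketched with a simple model. The sketch assumes a write-through policy with invalidation of other caches on every bus write, which is only one of several conventional schemes; the classes `SnoopingCache` and `Bus` are hypothetical and are keyed by physical address, as is the bus.

```python
class SnoopingCache:
    """A cache whose entries are keyed by physical address."""

    def __init__(self):
        self.lines = {}  # physical address -> cached value

    def snoop_write(self, paddr):
        # Another device wrote this location: drop any stale copy.
        self.lines.pop(paddr, None)


class Bus:
    """A shared bus carrying physical addresses between caches and memory."""

    def __init__(self, memory):
        self.memory = memory
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def read(self, reader, paddr):
        # On a miss, fill the reader's cache from main memory.
        if paddr not in reader.lines:
            reader.lines[paddr] = self.memory[paddr]
        return reader.lines[paddr]

    def write(self, writer, paddr, value):
        # Assumed write-through: update memory and the writer's cache,
        # then let every other cache snoop the transaction.
        self.memory[paddr] = value
        writer.lines[paddr] = value
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_write(paddr)
```

If the call to `snoop_write` were omitted, a cache that had previously read the location would continue to return the old value after another processor's write.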
Accordingly, there is a need for a cache memory and method for accessing same, in which cache coherency is maintained and optimal access times are achieved.