1. Field of the Invention
The invention relates to cache memory architecture, particularly with respect to a processor private cache in a digital computer system.
2. Description of the Prior Art
Present day digital computer architectures often include interconnected subsystems comprising a plurality of central processor modules, a main memory subsystem and one or more I/O subsystems. The central processor modules, main memory and I/O subsystems preferably intercommunicate by a time-shared bus system intercoupling the component sections of the computer system. In this architecture, each central processor module may include a private cache into which the processor copies words from main memory, the processor then utilizing the cache in performing its processes. For example, a processor may copy program instructions and data from main memory to its cache and, thereafter, execute the program task from cache. As is appreciated, cache is used in this manner to enhance performance. The cache memory is significantly faster than main memory, and the processor with the cache avoids going back and forth on the bus to main memory for each instruction. The close proximity of the cache memory to the processor results in fast data accessing by the processor. Instead of being burdened by the slow data retrieval normally associated with accessing main memory, the processor can receive a copy of the data held by the faster cache.
Cache memories are generally smaller than main memory and therefore hold only a subset of the main memory data. Accordingly, all of the main memory addresses are mapped into the smaller cache memory.
When the processor requests a word from memory, the cache is addressed to determine if a copy of the data resides therein. If the cache is storing the data, the processor receives a cache HIT indication and the data is transferred from the cache to the processor. If the data is not present in the cache, a cache MISS occurs and main memory is accessed for the data word which is transmitted to the processor across the system buses. A copy of the "missed" data word is also transmitted to the cache memory and stored therein.
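The read path just described can be illustrated with a minimal sketch. It assumes a toy model in which main memory and the cache are Python dictionaries keyed by address; the class name `SimpleCache` and the sample addresses are hypothetical and are not taken from any disclosed implementation.

```python
class SimpleCache:
    """Toy private cache in front of a main-memory dict (illustrative only)."""

    def __init__(self, main_memory):
        self.main_memory = main_memory  # address -> word
        self.lines = {}                 # cached copies: address -> word

    def read(self, address):
        """Return (word, "HIT" or "MISS") for a processor read request."""
        if address in self.lines:
            # A copy resides in the cache: serve the processor from it.
            return self.lines[address], "HIT"
        # MISS: fetch the word from main memory, forward it to the
        # processor, and store a copy in the cache for later requests.
        word = self.main_memory[address]
        self.lines[address] = word
        return word, "MISS"


# Hypothetical memory contents used only for this demonstration.
memory = {0x100: "LOAD", 0x104: "ADD"}
cache = SimpleCache(memory)
first = cache.read(0x100)   # first access misses and fills the cache
second = cache.read(0x100)  # repeat access is served from the cache
```

In this sketch the second request for the same address returns a HIT because the "missed" word was stored in the cache during the first request.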
It is important in such systems to maintain cache consistency. The data used by a processor from its cache memory must be coherent and updated with respect to the corresponding data in main memory. All copies of information at a specific address in all of the memory facilities must be maintained identical. For example, if a first one of the processors executes a write to memory overwriting a main memory location that had been copied by a second one of the processors into its cache, the data in that location of the cache of the second processor becomes obsolete and invalid.
Computer systems with cache memories maintain data integrity by using a cache invalidation process. The process involves each cache system monitoring, or spying upon, the memory operations of the other processors and subsystems in the computer. This is conventionally accomplished by monitoring the memory write operations on the bus. When a memory write operation is detected, each cache memory system must, at some time, execute an internal cache invalidation operation or cycle. The cache invalidation cycle involves testing the contents of the cache for the specific address of the write operation that was detected. If the cache memory system determines that it contains this address, the system marks the address as invalid. When the processor attempts to access data from an invalid cache address, a cache MISS is returned to the processor and the contents of the invalid cache location are updated from main memory.
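The invalidation process described above may be sketched as follows. This is an illustrative model only: the `Bus` and `SnoopingCache` classes, the `writer` parameter, and the sample addresses and values are assumptions introduced for the example, and the sketch treats each invalidation cycle as occurring immediately when the write appears on the bus.

```python
class SnoopingCache:
    """Toy cache that spies on bus writes (illustrative only)."""

    def __init__(self, main_memory):
        self.main_memory = main_memory
        self.data = {}   # address -> cached word
        self.valid = {}  # address -> validity flag

    def read(self, address):
        if self.valid.get(address, False):
            return self.data[address], "HIT"
        # Absent or invalid entry: refill the location from main memory.
        word = self.main_memory[address]
        self.data[address] = word
        self.valid[address] = True
        return word, "MISS"

    def snoop_write(self, address):
        """Invalidation cycle for one write detected on the bus."""
        if address in self.valid:
            self.valid[address] = False


class Bus:
    """Hypothetical shared bus that shows every write to the other caches."""

    def __init__(self, main_memory):
        self.main_memory = main_memory
        self.caches = []

    def write(self, address, word, writer=None):
        self.main_memory[address] = word
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_write(address)


memory = {0x200: 1}
bus = Bus(memory)
cache_a, cache_b = SnoopingCache(memory), SnoopingCache(memory)
bus.caches = [cache_a, cache_b]

before = cache_b.read(0x200)           # second processor caches the word
bus.write(0x200, 99, writer=cache_a)   # first processor overwrites it
after = cache_b.read(0x200)            # stale copy was marked invalid
```

Here the second processor's read after the write returns a MISS and the updated word, rather than the obsolete copy it had cached earlier.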
Computer systems of the type described often perform block write operations that overwrite a block of main memory words where the block may comprise, for example, four locations. The block is designated by a block address. When the spy mechanism of a cache invalidation system detects a block address and a HIT is indicated, the invalidation system must mark all of the cache addresses corresponding to the individual address locations within the block as invalid.
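Block invalidation can be illustrated with a short sketch. It assumes, purely for the example, that a block address multiplied by the block size gives the address of the first word in the block, and that the cache validity bits are held in a dictionary keyed by word address; the function names are hypothetical.

```python
BLOCK_SIZE = 4  # words per block, following the four-location example


def block_word_addresses(block_address, block_size=BLOCK_SIZE):
    """List the individual word addresses covered by a block address."""
    base = block_address * block_size  # assumed block-to-word mapping
    return [base + offset for offset in range(block_size)]


def invalidate_block(valid_bits, block_address, block_size=BLOCK_SIZE):
    """Mark every cached word of the block invalid.

    Returns the word addresses that were present and valid in the cache.
    """
    invalidated = []
    for address in block_word_addresses(block_address, block_size):
        if valid_bits.get(address, False):
            valid_bits[address] = False
            invalidated.append(address)
    return invalidated


# Hypothetical cache state: words 8, 9 and 11 of block 2 are cached,
# together with word 12, which belongs to block 3.
valid_bits = {8: True, 9: True, 11: True, 12: True}
hit_addresses = invalidate_block(valid_bits, block_address=2)
```

In this sketch the loop visits the word addresses of the block one at a time, so the time to complete the invalidation grows with the block size.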
Generally, the cache memory system comprises cache data RAMs for storing the main memory data and a cache controller to manage processor requests and invalidation operations. The cache controller utilizes a system of tag RAMs, comprising a directory, to determine if a particular address is present in the cache. The tag RAMs are utilized both for processor data read cycles and invalidation cycles. Generally, the lower portion of the main memory address is utilized to address the tag RAMs and the upper portion of the main memory address is stored at the addressed tag RAM location. A validity bit is stored with the upper address portion denoting if the cache location is valid or invalid. Thus, by addressing the tag RAMs with the lower portion of main memory address and comparing the upper portion thereof with the stored upper address portion, a cache HIT with respect to a main memory address is uniquely determined.
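The tag RAM lookup described above may be sketched as a direct-mapped directory. The 4-bit index width, the class name `TagDirectory`, and the sample addresses are assumptions chosen for illustration and do not reflect any particular hardware.

```python
INDEX_BITS = 4  # hypothetical width: a 16-entry tag RAM


class TagDirectory:
    """Toy direct-mapped tag RAM with one validity bit per entry."""

    def __init__(self, index_bits=INDEX_BITS):
        self.index_bits = index_bits
        size = 1 << index_bits
        self.tags = [0] * size        # stored upper address portions
        self.valid = [False] * size   # validity bits

    def split(self, address):
        """Split an address into (lower index portion, upper tag portion)."""
        index = address & ((1 << self.index_bits) - 1)
        tag = address >> self.index_bits
        return index, tag

    def fill(self, address):
        """Record that the word at this address is now held in the cache."""
        index, tag = self.split(address)
        self.tags[index] = tag
        self.valid[index] = True

    def is_hit(self, address):
        """HIT only if the entry is valid and the stored tag matches."""
        index, tag = self.split(address)
        return self.valid[index] and self.tags[index] == tag

    def invalidate(self, address):
        """Clear the validity bit if this address is the one cached."""
        index, tag = self.split(address)
        if self.valid[index] and self.tags[index] == tag:
            self.valid[index] = False


directory = TagDirectory()
directory.fill(0x123)                     # index 0x3, tag 0x12
same_address = directory.is_hit(0x123)
same_index_other_tag = directory.is_hit(0x223)  # index 0x3, tag 0x22
directory.invalidate(0x123)
after_invalidate = directory.is_hit(0x123)
```

Addresses 0x123 and 0x223 select the same tag RAM entry in this sketch, and only the comparison of the stored upper portion distinguishes them, which is why the HIT determination is unique to one main memory address.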
In computer systems of the type described, failure of the cache memory system of a processor required the processor to be shut down, with the computing system load carried by the remaining processors on the bus. It was necessary to disable the processor in which the cache failed because the performance of the processor without the assistance of the cache would be too slow to be compatible with operation of the remainder of the system. This resulted in a system performance degradation approximately proportional to the processing load normally carried by the disabled processor. For example, in a system with two processors, a 50% degradation in system performance may result.
A concept recognized in the prior art is that of fault tolerance. A component is characterized as fault tolerant if the component is designed to continue operation although a component fault or failure has occurred. Normally, the fault is a hardware failure. Key elements of a system may be designed to be fault tolerant to achieve a high level of system availability. Typically, a component achieves fault tolerance by utilizing an identical standby redundant component. If the on-line component fails, the standby redundant component is brought on-line to continue the performance of the failed component.
Theoretically, the cache memory system could be rendered fault tolerant by utilizing a back-up redundant off-line cache memory system. As a practical matter, cache memory systems usually do not have redundant back-ups since such systems are very expensive and occupy a significant amount of space on the system printed circuit boards. Additionally, this approach is costly in that the redundant cache system is normally idle. The redundant cache system is merely occupying valuable printed circuit board real estate without contributing to normal system performance. The function of the redundant system is to continue system availability at the normal performance level in the event of failure of the active cache.
In addition to the described disadvantages, the prior art cache architectures are undesirably slow in that when the system is performing one of the operations of servicing the processor or executing invalidations, the other operation cannot be performed. When the cache system is performing one operation, it is busy to the other. The above-described prior art cache system architectures are not readily configurable for simultaneously performing invalidations and servicing the processor. Additionally, when a block address is detected for invalidation, the addresses comprising the block must be sequentially invalidated, thereby requiring an undesirably long time to complete the process.
A fault tolerant cache memory system is disclosed in U.S. Pat. No. 4,905,141, issued Feb. 27, 1990. The system of said U.S. Pat. No. 4,905,141 utilizes multiple cache partitions operating independently and in parallel where any request address can be connected to any or all partitions. A two-level global and local search is performed involving partition look-aside tables and partition directories. If a cache partition fails, the partition is decoupled from service and the cache continues to operate with degraded capacity.
It is expected that the HIT ratio of the system of said U.S. Pat. No. 4,905,141 will degrade when cache partitions fail since the number of directory sets into which address requests can be mapped has diminished, i.e., the set associativity of the degraded cache has been reduced. The system of said U.S. Pat. No. 4,905,141 tends to utilize an excessive amount of cache memory, replicated logic, and complex control resources, including complex address and data switching, thereby suffering from the disadvantages discussed above. The two-level search procedure utilized is undesirably time consuming. Because of the architecture of the cache control logic of said U.S. Pat. No. 4,905,141, the cache memory system thereof cannot simultaneously service processor request addresses and invalidation addresses, thereby suffering from disadvantages of prior art systems described above. Similarly, the cache memory system of said U.S. Pat. No. 4,905,141 cannot simultaneously perform invalidations on multiple addresses, e.g., multiple addresses of a block, again suffering from the above-described disadvantages.