Caching schemes have been employed by computer designers to reduce the time a Central Processing Unit (CPU) spends accessing main memory, and hence, to increase system performance. In many computing systems, main memory consists of a large array of memory devices that are slow relative to the processor. During accesses to main memory, the processor is forced to insert additional wait states to accommodate the slower memory devices. System performance during memory accesses can be enhanced with a cache. Smaller than main memory and significantly faster, the cache provides fast local storage for data and instruction code that the processor uses frequently. In computing systems with caches, memory operations by the processor are first transacted with the cache. The slower main memory is accessed by the processor only if the memory operation cannot be completed with the cache. In general, the processor has a high probability of fulfilling a majority of its memory operations with the cache. Consequently, in computing systems which employ a cache, effective memory access times between a processor and relatively slow main memory can be reduced.
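The effect of the cache on effective access time can be illustrated with a simple weighted average. The following is a minimal sketch, not a model of any particular system: the function name `effective_access_time`, the assumption that a miss costs only the main memory time, and the cycle counts are all hypothetical.

```python
def effective_access_time(hit_rate, cache_time, memory_time):
    """Average access time when a fraction hit_rate of accesses hit the cache.

    Assumption: a miss costs memory_time only (no extra cache lookup penalty).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_time + (1.0 - hit_rate) * memory_time


# Hypothetical timings: 1-cycle cache, 10-cycle main memory.
for rate in (0.0, 0.5, 0.9, 0.99):
    print(rate, effective_access_time(rate, 1, 10))
```

As the hit rate approaches 1, the average approaches the cache access time, which is the sense in which the effective access time is reduced.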
Caches can be highly optimized according to a number of different features. One important feature which affects cache performance and design complexity is the handling of writes by the processor or an alternate bus master. Because two copies of a particular piece of data or instruction code can exist, one in main memory and a duplicate copy in the cache, writes to either main memory or the cache can result in an incoherence between the two storage systems. For example, suppose specific data is stored at a predetermined address in both the cache and main memory. During a processor write to the predetermined address, the processor first checks the contents of the cache for the data. Finding the data in the cache, the processor proceeds to write new data into the cache at the predetermined address. Because data is modified in the cache but not in main memory, the cache and main memory become incoherent. Similarly, in systems with an alternate bus master, Direct Memory Access (DMA) writes to main memory by the alternate bus master modify data in main memory but not in the cache. Again, the cache and main memory become incoherent.
An incoherence between the cache and main memory during processor writes can be handled with two techniques. In a first technique, a `write-through` cache guarantees consistency between the cache and main memory by writing to both the cache and main memory during processor writes. The contents of the cache and main memory are always identical, and so the two storage systems are always coherent. In a second technique, a `write-back` cache handles processor writes by writing only to the cache and setting a `dirty` bit to indicate cache entries which have been altered by the processor. When `dirty` or altered cache entries are later replaced, the modified data is written back into main memory.
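The difference between the two write policies can be sketched as follows. This is an illustrative model only: main memory is represented by a Python dictionary, and the class and method names (`WriteThroughCache`, `WriteBackCache`, `evict`) are assumptions introduced for the example.

```python
class WriteThroughCache:
    def __init__(self, memory):
        self.memory = memory       # dict: address -> value, stands in for main memory
        self.lines = {}            # cached entries: address -> value

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value  # every processor write also updates main memory


class WriteBackCache:
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()         # addresses altered since they were cached

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)       # main memory is left stale for now

    def evict(self, addr):
        """Replace an entry, writing it back first if it is dirty."""
        value = self.lines.pop(addr)
        if addr in self.dirty:
            self.memory[addr] = value
            self.dirty.discard(addr)


mem_wt = {0x100: 1}
wt = WriteThroughCache(mem_wt)
wt.write(0x100, 2)                 # memory is updated immediately

mem_wb = {0x100: 1}
wb = WriteBackCache(mem_wb)
wb.write(0x100, 2)
stale_before_evict = mem_wb[0x100] # memory still holds the old value
wb.evict(0x100)                    # the write-back happens on replacement
```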
Depending on which cache architecture is implemented, incoherency between the cache and main memory during a DMA read operation can be handled with bus watch or `snooping` techniques, by instructions executed by the operating system, or combinations thereof. In a `write-through` cache, no special techniques are required during the DMA read operation. In a `write-back` cache, bus snooping can be employed to check the contents of the cache for altered data, sourcing data from the cache to the requesting bus master when appropriate to maintain coherency. When the cache is sourcing data to the requesting bus master, main memory is prohibited from supplying data to the requesting bus master. Alternatively, the operating system can execute an instruction to WRITE `dirty` data from the cache into main memory prior to the DMA read operation. All `dirty` data is written out to main memory, thereby ensuring consistency between the cache and main memory.
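The two ways of handling a DMA read with a write-back cache can be sketched as below. The model is hypothetical: `snoop_read`, `push_all`, and `dma_read` are names chosen for illustration, and the inhibiting of main memory is represented simply by returning the cached value instead of the memory value.

```python
class WriteBackCache:
    def __init__(self, memory):
        self.memory = memory       # dict standing in for main memory
        self.lines = {}
        self.dirty = set()

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)

    def snoop_read(self, addr):
        """Return (hit, value); hit is True when the cache holds dirty data."""
        if addr in self.dirty:
            return True, self.lines[addr]
        return False, None

    def push_all(self):
        """Write every dirty entry back to memory (the OS-driven alternative)."""
        for addr in sorted(self.dirty):
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()


def dma_read(memory, addr, cache=None):
    """Read by an alternate bus master; the cache is snooped when one is given."""
    if cache is not None:
        hit, value = cache.snoop_read(addr)
        if hit:
            return value           # cache sources the data; memory does not respond
    return memory[addr]


memory = {0x200: 10}
cache = WriteBackCache(memory)
cache.write(0x200, 99)

unsnooped = dma_read(memory, 0x200)       # stale value from main memory
snooped = dma_read(memory, 0x200, cache)  # altered value sourced by the cache
cache.push_all()
after_push = dma_read(memory, 0x200)      # memory is now current without snooping
```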
Similarly, during a DMA write operation, incoherency between the cache and main memory can be handled with bus `snooping` or monitoring, instructions executed by the operating system, or combinations thereof. In both `write-through` and `write-back` caches, bus snooping invalidates cache entries which become `stale` or inconsistent with main memory following the DMA write operation. Alternatively, cache PUSH and INVALIDATE instructions can be executed by the operating system prior to the DMA write operation, to WRITE `dirty` or altered data out to main memory, and to invalidate the contents of the entire cache. Since only a single copy of the data exists, in main memory, following the instructions, the DMA write to main memory will not present the problem of possibly `stale` data in the cache.
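Snoop-driven invalidation during a DMA write can be sketched as follows. This is a simplified illustration under stated assumptions: the names `Cache`, `snoop_write`, and `dma_write` are hypothetical, and the sketch ignores `dirty` state, treating every cached entry at the written address as simply stale.

```python
class Cache:
    def __init__(self, memory):
        self.memory = memory       # dict standing in for main memory
        self.lines = {}

    def read(self, addr):
        if addr not in self.lines:             # miss: fill from main memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def snoop_write(self, addr):
        """Invalidate the entry another bus master is overwriting in memory."""
        self.lines.pop(addr, None)


def dma_write(memory, addr, value, cache=None):
    """Write by an alternate bus master; the cache is snooped when one is given."""
    memory[addr] = value
    if cache is not None:
        cache.snoop_write(addr)


# Without snooping, the cached copy goes stale.
memory_a = {0x300: 5}
cache_a = Cache(memory_a)
cache_a.read(0x300)
dma_write(memory_a, 0x300, 6)
stale_value = cache_a.read(0x300)

# With snooping, the entry is invalidated and the next read refills it.
memory_b = {0x300: 5}
cache_b = Cache(memory_b)
cache_b.read(0x300)
dma_write(memory_b, 0x300, 6, cache_b)
fresh_value = cache_b.read(0x300)
```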
In virtual memory systems, data is often transferred between memory and non-volatile storage devices, such as a disk, during page-out/page-in sequences. During a page-out sequence, data is transferred from memory and stored on disk, while during a page-in sequence, data is transferred from the disk and stored in memory. For example, page-out/page-in sequences can occur during context switches or during extensive data manipulation.
A number of methods exist for ensuring coherency between write-back caches and main memory during page-out/page-in sequences initiated by alternate bus masters. In a first known technique, the bus is not snooped during either the page-out operation or the page-in operation. Instead, the operating system executes PUSH and INVALIDATE instructions prior to the page-out operation. As discussed hereinabove, the PUSH instruction forces the write-back cache to search all cache entries for `dirty` data which the pending page-out operation may access, and to copy these entries back into main memory. The INVALIDATE instruction marks data in the write-back cache which may be accessed by the page transfer as invalid. The DMA page transfer from memory to disk is performed after the execution of the two instructions, followed by a second DMA transfer from disk to memory corresponding to the page-in operation. No snooping is required during the DMA page-out operation because the write-back cache is coherent with main memory following the cache PUSH instruction. Likewise, no snooping is required during the DMA page-in operation because cache entries corresponding to the page transfer have been marked invalid, and hence, will not become `stale` or inconsistent with the new page in main memory.
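The first known technique can be sketched as a sequence of four steps: PUSH, INVALIDATE, unsnooped page-out, unsnooped page-in. The model below is illustrative only; the four-word page at address 0x1000, the lists standing in for disk contents, and all class and function names are assumptions.

```python
PAGE = range(0x1000, 0x1004)   # hypothetical 4-word page


class WriteBackCache:
    def __init__(self, memory):
        self.memory = memory   # dict standing in for main memory
        self.lines = {}
        self.dirty = set()

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)

    def read(self, addr):
        if addr not in self.lines:             # miss: fill from main memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def push(self, page):
        """Search the dirty entries and copy those within the page to memory."""
        for addr in sorted(self.dirty):
            if addr in page:
                self.memory[addr] = self.lines[addr]
                self.dirty.discard(addr)

    def invalidate(self, page):
        """Mark every cached entry within the page as invalid."""
        for addr in [a for a in self.lines if a in page]:
            del self.lines[addr]
            self.dirty.discard(addr)


def swap_page_first_technique(cache, memory, disk_in):
    cache.push(PAGE)
    cache.invalidate(PAGE)
    disk_out = [memory[addr] for addr in PAGE]   # DMA page-out, no snooping
    for addr, value in zip(PAGE, disk_in):       # DMA page-in, no snooping
        memory[addr] = value
    return disk_out


memory = {addr: 0 for addr in PAGE}
cache = WriteBackCache(memory)
cache.write(0x1001, 7)                           # dirty entry inside the page
paged_out = swap_page_first_technique(cache, memory, [10, 11, 12, 13])
value_after = cache.read(0x1001)                 # refilled from the new page
```

Both cache instructions run to completion before either DMA transfer starts, which is the processor time that this technique gives up.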
Although the first known technique for maintaining cache coherency during page-out/page-in sequences is simple to implement, the technique has a number of disadvantages. Most importantly, the processor spends a large amount of time executing the required cache PUSH and INVALIDATE instructions, sequencing through the cache in search of `dirty` cache entries. For the duration of these instructions, the processor cannot run another task or process, and hence, this time is lost. Moreover, the processor must additionally interface with slow main memory in order to write altered cache entries back into main memory.
In a second known technique for ensuring cache coherency during page-out/page-in sequences, the bus is snooped only during the page transfer from memory to disk, with `dirty` data being sourced from the write-back cache to the requesting bus master when appropriate to maintain coherency. Dirty data is left unaltered in the cache following the page-out operation. After the page transfer is completed, the operating system executes a cache INVALIDATE instruction before initiating the pending DMA page transfer of data from disk to memory. The INVALIDATE instruction is executed for substantially the same reason as in the first known technique, namely to prevent data in the write-back cache from becoming `stale` or inconsistent with main memory during the page-in operation.
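The second known technique can be sketched in the same style: the page-out is snooped, the INVALIDATE instruction follows, and the page-in proceeds without snooping. The page layout, the lists standing in for disk contents, and all names are hypothetical.

```python
PAGE = range(0x1000, 0x1004)   # hypothetical 4-word page


class WriteBackCache:
    def __init__(self, memory):
        self.memory = memory   # dict standing in for main memory
        self.lines = {}
        self.dirty = set()

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)

    def read(self, addr):
        if addr not in self.lines:             # miss: fill from main memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def snoop_read(self, addr):
        """Source dirty data to the bus master; the entry is left unaltered."""
        if addr in self.dirty:
            return True, self.lines[addr]
        return False, None

    def invalidate(self, page):
        """Mark every cached entry within the page as invalid."""
        for addr in [a for a in self.lines if a in page]:
            del self.lines[addr]
            self.dirty.discard(addr)


def swap_page_second_technique(cache, memory, disk_in):
    disk_out = []
    for addr in PAGE:                          # snooped DMA page-out
        hit, value = cache.snoop_read(addr)
        disk_out.append(value if hit else memory[addr])
    cache.invalidate(PAGE)                     # OS instruction before page-in
    for addr, value in zip(PAGE, disk_in):     # DMA page-in, no snooping
        memory[addr] = value
    return disk_out


memory = {addr: 0 for addr in PAGE}
cache = WriteBackCache(memory)
cache.write(0x1001, 7)
paged_out = swap_page_second_technique(cache, memory, [10, 11, 12, 13])
value_after = cache.read(0x1001)
```

Compared with the first technique, the PUSH step disappears, but the INVALIDATE pass over the cache remains.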
In a third known technique for ensuring cache coherency during page-out/page-in sequences, the data bus is snooped during both the page transfer from memory to disk and the page transfer from disk to memory. For substantially the same reasons as in the second known technique, `dirty` data is sourced from the write-back cache to the requesting bus master when appropriate to maintain coherency during the page-out operation. The data bus is additionally snooped during the page-in operation, with cached entries invalidated to prevent data from becoming inconsistent with main memory.
Although the second and third techniques for ensuring cache coherency during page-out/page-in sequences provide better performance than the first technique, these last two techniques still have a number of drawbacks. The second technique still utilizes an operating system INVALIDATE instruction which may require an inordinately large amount of time to sequence through all cache entries. The third technique improves upon the second technique by snooping during the page-in sequence in order to obviate the required execution of an INVALIDATE instruction. Despite this fact, each `dirty` data location in the write-back cache must be accessed twice in the third known technique: once during the page-out operation to source `dirty` or altered data to the requesting bus master, and a second time during the page-in operation to invalidate `stale` or inconsistent data. Overall system performance can be increased if the number of accesses to the write-back cache can be minimized.
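The double access in the third known technique can be made visible by counting snoop-driven cache accesses per address. The following is a hypothetical model: the counter, the four-word page, and all names are assumptions made for illustration.

```python
from collections import Counter

PAGE = range(0x1000, 0x1004)   # hypothetical 4-word page


class CountingWriteBackCache:
    def __init__(self):
        self.lines = {}
        self.dirty = set()
        self.accesses = Counter()  # snoop-driven cache accesses per address

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)

    def snoop_read(self, addr):
        """Page-out snoop: source dirty data, leaving the entry unaltered."""
        if addr in self.dirty:
            self.accesses[addr] += 1
            return True, self.lines[addr]
        return False, None

    def snoop_invalidate(self, addr):
        """Page-in snoop: invalidate an entry that is about to become stale."""
        if addr in self.lines:
            self.accesses[addr] += 1
            del self.lines[addr]
            self.dirty.discard(addr)


def swap_page_third_technique(cache, memory, disk_in):
    disk_out = []
    for addr in PAGE:                          # snooped DMA page-out
        hit, value = cache.snoop_read(addr)
        disk_out.append(value if hit else memory[addr])
    for addr, value in zip(PAGE, disk_in):     # snooped DMA page-in
        memory[addr] = value
        cache.snoop_invalidate(addr)
    return disk_out


memory = {addr: 0 for addr in PAGE}
cache = CountingWriteBackCache()
cache.write(0x1001, 7)
cache.write(0x1003, 9)
paged_out = swap_page_third_technique(cache, memory, [10, 11, 12, 13])
print(dict(cache.accesses))
```

In this model each `dirty` address is counted once for the page-out snoop and once for the page-in snoop, while addresses that were never cached are not counted at all.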