1. Field of the Invention
This invention relates to computer systems that employ integrated processors having cache memory subsystems. The invention also relates to bus transfer mechanisms employed within integrated processors.
2. Description of the Relevant Art
Cache-based computer architectures are typically associated with specialized bus transfer mechanisms to support efficient utilization of the cache memory and to maintain data coherency. A cache memory is a high-speed memory unit interposed in the memory hierarchy of a computer system between a slower system memory and a processor to improve effective memory transfer rates and accordingly improve system performance. The name "cache" refers to the fact that the small cache memory unit is essentially hidden and appears transparent to the user, who is aware only of a larger system memory. The cache is usually implemented by semiconductor memory devices having speeds that are comparable to the speed of the processor, while the system memory utilizes a less costly, lower-speed technology. The cache concept anticipates the likely reuse by the processor of selected data in system memory by storing a copy of the selected data in the cache memory.
A cache memory typically includes a plurality of memory sections, wherein each memory section stores a block or a "line" of two or more words of data. For systems based on the particularly popular model 80486 microprocessor, a line consists of four "doublewords" (wherein each doubleword comprises four 8-bit bytes). Each line has associated with it an address tag that uniquely identifies which line of system memory it is a copy of. When a read request originates in the processor for a new word (or a new doubleword or a new byte), whether it be data or instruction, an address tag comparison is made to determine whether a copy of the requested word resides in a line of the cache memory. If present, the data is used directly from the cache. This event is referred to as a cache read "hit". If not present, a line containing the requested word is retrieved from system memory and stored in the cache memory. The requested word is simultaneously supplied to the processor. This event is referred to as a cache read "miss".
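The read hit and read miss behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the circuitry of any actual cache: the class name `ReadCache`, the fully associative dictionary of lines, and the unlimited capacity are assumptions made for brevity. The 16-byte line size follows the four-doubleword line of the 80486 example.

```python
LINE_BYTES = 16  # four 4-byte doublewords per line, as in the 80486 example


class ReadCache:
    """Toy fully associative cache (hypothetical): maps a line tag to a copy of that line."""

    def __init__(self, system_memory):
        self.memory = system_memory   # bytearray standing in for system memory
        self.lines = {}               # tag -> bytearray of LINE_BYTES bytes

    def read_byte(self, address):
        tag, offset = divmod(address, LINE_BYTES)
        if tag in self.lines:                      # address tag comparison succeeds
            return self.lines[tag][offset], "hit"
        base = tag * LINE_BYTES                    # miss: fetch the whole line
        self.lines[tag] = bytearray(self.memory[base:base + LINE_BYTES])
        return self.lines[tag][offset], "miss"


memory = bytearray(range(64))
cache = ReadCache(memory)
first = cache.read_byte(0x12)    # line containing 0x12 is not cached yet
second = cache.read_byte(0x13)   # same line, now cached
```

In this sketch the first access fetches the entire line containing address 0x12, so the later access to the neighboring address 0x13 is served from the cache.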
In addition to using a cache memory to retrieve data, the processor may also write data directly to the cache memory instead of to the system memory. When the processor desires to write data to memory, an address tag comparison is made to determine whether the line into which data is to be written resides in the cache memory. If the line is present in the cache memory, the data is written directly into the line. This event is referred to as a cache write "hit". As will be explained in greater detail below, in many systems a data "dirty bit" for the line is then set. The dirty bit indicates that data stored within the line is dirty (i.e., has been modified), and thus, before the line is deleted from the cache memory or overwritten, the modified data must be written into system memory.
If the line into which data is to be written does not exist in the cache memory, the line is either fetched into the cache memory from system memory to allow the data to be written into the cache, or the data is written directly into the system memory. This event is referred to as a cache write "miss". A line which is overwritten or copied out of the cache memory when new data is stored in the cache memory is referred to as a victim block or a victim line.
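The write hit, write miss, and victim line handling described above can be illustrated with a deliberately tiny model. The sketch below assumes a hypothetical cache with room for a single line, so that every allocating miss produces a victim; the names `OneLineCache` and `write_allocate` are illustrative and do not come from any particular design.

```python
LINE_BYTES = 16


class Line:
    def __init__(self, tag, data):
        self.tag = tag
        self.data = bytearray(data)
        self.dirty = False


class OneLineCache:
    """Hypothetical cache holding a single line, so each allocating miss evicts a victim."""

    def __init__(self, memory, write_allocate=True):
        self.memory = memory
        self.write_allocate = write_allocate
        self.line = None

    def _evict_victim(self):
        victim = self.line
        if victim is not None and victim.dirty:
            base = victim.tag * LINE_BYTES
            self.memory[base:base + LINE_BYTES] = victim.data  # write modified data back
        self.line = None

    def write_byte(self, address, value):
        tag, offset = divmod(address, LINE_BYTES)
        if self.line is not None and self.line.tag == tag:
            self.line.data[offset] = value         # write hit: update the line
            self.line.dirty = True                 # and set its dirty bit
            return "hit"
        if self.write_allocate:                    # miss, option 1: fetch the line, then write
            self._evict_victim()
            base = tag * LINE_BYTES
            self.line = Line(tag, self.memory[base:base + LINE_BYTES])
            self.line.data[offset] = value
            self.line.dirty = True
        else:                                      # miss, option 2: write system memory directly
            self.memory[address] = value
        return "miss"


memory = bytearray(64)
cache = OneLineCache(memory)
r1 = cache.write_byte(0x00, 0xAA)  # miss: line 0 is fetched and dirtied
r2 = cache.write_byte(0x01, 0xBB)  # hit on line 0
r3 = cache.write_byte(0x20, 0xCC)  # miss: line 0 becomes the victim and is written back
```

After the third write, the bytes 0xAA and 0xBB have reached system memory through the victim write-back, while 0xCC still exists only in the dirty cached line.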
Cache memories can be optimized according to a number of different techniques. One aspect that affects system performance and design complexity is the handling of writes initiated by the processor or by an alternate bus master. As explained previously, because two copies of a particular piece of data or instruction code can exist, one in system memory and a duplicate copy in the cache, writes to either the system memory or the cache memory can result in an incoherence between the two storage units. For example, consider the case in which the same data is initially stored at a predetermined address in both the cache memory and the system memory. If the processor subsequently initiates a write cycle to store a new data item at the predetermined address, a cache write "hit" occurs and the processor proceeds to write the new data into the cache memory at the predetermined address. Since the data is modified in the cache memory but not in system memory, the cache memory and system memory become incoherent. Similarly, in systems with an alternate bus master, write cycles to system memory by the alternate bus master modify data in system memory but not in the cache memory. Again, the cache memory and system memory become incoherent.
An incoherence between the cache memory and system memory during processor writes can be prevented or handled by implementing one of several commonly employed techniques. In a first technique, a "write-through" cache guarantees consistency between the cache memory and system memory by writing the same data to both the cache memory and system memory. The contents of the cache memory and system memory are always identical, and thus the two storage systems are always coherent. In a second technique, a "write-back" cache handles processor writes by writing only to the cache memory and setting a "dirty" bit to indicate cache entries which have been altered by the processor. When "dirty" or altered cache entries are later replaced during a "cache replacement" cycle, the modified data is written back into system memory.
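The difference between the two policies can be seen by running the same processor write under each. The following is a minimal sketch under simplifying assumptions: entries are single words rather than lines, and the names `PolicyCache`, `load`, `processor_write`, and `replace` are hypothetical.

```python
class PolicyCache:
    """Hypothetical word-granular cache used only to contrast the two write policies."""

    def __init__(self, memory, policy):
        if policy not in ("write-through", "write-back"):
            raise ValueError(policy)
        self.memory = memory      # dict: address -> value (system memory)
        self.policy = policy
        self.entries = {}         # dict: address -> cached value
        self.dirty = set()        # addresses altered only in the cache

    def load(self, address):
        self.entries[address] = self.memory[address]

    def processor_write(self, address, value):
        """Write hit: the entry is assumed to be cached already."""
        self.entries[address] = value
        if self.policy == "write-through":
            self.memory[address] = value      # both copies are updated together
        else:
            self.dirty.add(address)           # system memory is updated later

    def replace(self, address):
        """Cache replacement cycle: write altered data back before dropping the entry."""
        if address in self.dirty:
            self.memory[address] = self.entries[address]
            self.dirty.discard(address)
        del self.entries[address]


def run(policy):
    memory = {0x100: 1}
    cache = PolicyCache(memory, policy)
    cache.load(0x100)
    cache.processor_write(0x100, 2)
    after_write = memory[0x100]       # system memory contents right after the write
    cache.replace(0x100)
    return after_write, memory[0x100]


write_through_result = run("write-through")
write_back_result = run("write-back")
```

Under write-through, system memory already holds the new value immediately after the write; under write-back, it holds the old value until the replacement cycle writes the dirty entry back.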
An incoherence between the cache memory and system memory during a write operation by an alternate bus master is handled somewhat differently. For a system that employs write-back caching, one of a variety of bus monitoring or "snooping" techniques may be implemented to determine whether certain lines of data within the cache memory should be invalidated or written back to system memory when the alternate bus master attempts to write data to system memory. One such technique implemented within 80486-based systems is referred to as the "MESI" protocol. For systems that employ the MESI protocol, when an alternate bus master attempts to write data to system memory, a cache controller determines whether a corresponding line of data is contained within the cache memory. If a corresponding line is not contained in the cache memory, no additional action is taken by the cache controller, and the write cycle initiated by the alternate bus master is allowed to complete. If, on the other hand, a corresponding line of data is contained in the cache memory, the cache controller determines whether that line of data is dirty or clean. If the line is clean, the line is marked invalid by the cache controller and the transfer of data from the alternate bus master into system memory is allowed to complete. The line of data must be marked invalid since the modified (and thus the most up-to-date) data is now contained only within the system memory (following completion of the write cycle by the alternate bus master). If the line of data is instead dirty, a snoop write-back cycle is initiated by the cache controller, which causes the alternate bus master to "back off" and release mastership of the system bus. The cache controller then causes the entire line of dirty data within the cache memory to be written back into system memory. The snoop write-back cycle may be accomplished by executing a burst write cycle to system memory.
As is well known to those of skill in the art, during the data phase of a burst cycle, a new word (or doubleword) may be written to the system memory for each of several successive clock cycles without intervening address phases. The fastest burst cycle (no wait states) requires two clock cycles for the first word (one clock for the address, one clock for the corresponding word), with subsequent words written to sequential addresses on every subsequent clock cycle. When the cache controller finishes the dirty line write back, the line is marked clean (unmodified).
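The clock count for a zero-wait-state burst follows directly from the description above: two clocks for the first word, then one clock for each further word. The short sketch below computes that count. The comparison function, which charges every word its own address clock and data clock, is an assumption added for illustration and is not taken from any bus specification.

```python
def burst_write_clocks(words):
    """Clocks for a zero-wait-state burst write of `words` sequential words."""
    if words < 1:
        raise ValueError("a burst transfers at least one word")
    return 2 + (words - 1)   # address + first word, then one clock per extra word


def separate_write_clocks(words):
    """Assumed comparison: each word pays its own address clock plus data clock."""
    return 2 * words


line_doublewords = 4          # one cache line in the 80486 example
burst = burst_write_clocks(line_doublewords)
separate = separate_write_clocks(line_doublewords)
```

For a four-doubleword line this gives five clocks for the burst, against eight clocks under the assumed non-burst model.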
After the snoop write-back cycle completes, the alternate bus master re-obtains mastership of the system bus, and the write cycle by the alternate bus master is again executed. At this point, the new data is allowed to be written into the system memory. The cache controller, observing the write by the alternate bus master to the now-clean memory location in system memory, invalidates the line in the cache as previously described. It is noted that the snoop write-back cycle ensures that data coherency is maintained even if the writing of data from the alternate bus master does not involve an entire cache line.
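The sequence of events for a snooped write by an alternate bus master can be summarized as a small decision procedure. The sketch below is a simplified model of the behavior described above, not an implementation of any actual cache controller: it tracks only whether each cached line is "clean" or "dirty", and the function name and event strings are hypothetical.

```python
def snoop_master_write(line_states, tag):
    """Return the ordered bus events for an alternate bus master write to line `tag`.

    line_states maps a cached line tag to "clean" or "dirty"; a missing tag
    means the line is not present in the cache.
    """
    events = []
    if line_states.get(tag) == "dirty":
        events.append("master backs off")
        events.append("burst write-back of line")
        line_states[tag] = "clean"             # line is clean once written back
        events.append("master retries write")
    if line_states.get(tag) == "clean":
        del line_states[tag]                   # invalidate: memory will hold newer data
        events.append("line invalidated")
    events.append("master write completes")
    return events


states = {0x10: "clean", 0x20: "dirty"}
miss_events = snoop_master_write(states, 0x30)    # line not cached
clean_events = snoop_master_write(states, 0x10)   # cached and clean
dirty_events = snoop_master_write(states, 0x20)   # cached and dirty
```

Only the dirty case forces the alternate bus master to back off and retry, which is the source of the extra bus traffic.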
An incoherence between the cache memory and the system memory during a read operation by an alternate bus master is treated similarly. When an alternate bus master attempts to read data from system memory, the cache controller determines whether a corresponding line of data is contained within the cache memory. If a corresponding line is contained in the cache memory, and if the corresponding line is dirty, a snoop write-back cycle is initiated by the cache controller, which causes the alternate bus master to back off and release mastership of the system bus. The cache controller then causes the entire line of dirty data within the cache memory to be written back into system memory. After the write-back cycle completes, the alternate bus master re-obtains mastership of the system bus, and the read cycle by the alternate bus master is re-initiated. At this point, the data within the system memory is allowed to be read.
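The read case can be sketched in the same spirit, this time carrying data so that the purpose of the write-back is visible. The model below is illustrative only: lines are reduced to a single value, the names `CachedLine` and `alternate_master_read` are hypothetical, and marking the line clean after the write-back is an assumption carried over from the write case.

```python
class CachedLine:
    def __init__(self, data, dirty):
        self.data = data
        self.dirty = dirty


def alternate_master_read(cache, memory, tag):
    """Return (data, events) for an alternate bus master read of line `tag`."""
    events = []
    line = cache.get(tag)
    if line is not None and line.dirty:
        events.append("master backs off")
        memory[tag] = line.data        # snoop write-back of the dirty line
        line.dirty = False             # assumed: line is clean afterwards
        events.append("line written back")
        events.append("master retries read")
    events.append("read completes")
    return memory[tag], events


memory = {0x40: "stale"}
cache = {0x40: CachedLine("fresh", dirty=True)}
data, events = alternate_master_read(cache, memory, 0x40)
```

Without the write-back step the alternate bus master would have read the stale copy held in system memory.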
When the snoop write-back cycles as described above are executed to maintain data coherency during read and write operations of an alternate bus master, the bandwidth of the CPU local bus is degraded since the alternate bus master must wait for the write-back cycle to complete before performing its desired data transfer. It would therefore be desirable to provide a system wherein cache coherency is maintained while avoiding the necessity of snoop write-back operations.
In recent years, integrated processors have been developed to replace previously discrete microprocessors and associated peripheral devices within computer systems. An integrated processor is an integrated circuit that performs the functions of both a microprocessor and various peripheral devices such as, for example, a memory controller, a DMA controller, a timer, and a bus interface unit. The introduction of integrated processors has allowed for decreases in the overall cost, size, and weight of computer systems, and has in many cases accommodated improved performance characteristics of the computer systems. Integrated processors that implement a model 80486-compatible instruction set typically employ the MESI protocol, consistent with their discrete-component counterparts. This ensures backwards compatibility with peripherals such as memory controllers and I/O devices that were designed for use within 80486-based systems. Unfortunately, employment of the protocol described above within integrated processors limits the overall performance of the computer system since snoop write-back cycles must be accommodated, thus limiting the bandwidth of the CPU local bus. It is desirable to provide an integrated processor which is backwards compatible with 80486-type external peripherals while attaining high overall system performance.