The present invention relates in general to cache memory systems that are coupled to processors and more particularly to a cache memory system adapted to be coupled to a processor through a private bus.
FIG. 1 is a simplified block diagram of a computer 20 including a processor 22 and a memory system 24, in accordance with the prior art. The processor 22 is coupled to the memory system 24 through a system bus 26 that conveys data and addresses between system components. The computer 20 additionally includes a user input interface 34, such as a keyboard, mouse and the like, and a user output interface 36, such as a monitor, both coupled to the processor 22 through the system bus 26.
The processor 22 typically executes instructions read from the memory system 24 to operate on input data from the user input interface 34 and display results using the user output interface 36. The processor 22 also stores and retrieves data in the memory system 24.
The memory system 24 includes several different types of memory units. A read-only memory (“ROM”) 39 storing instructions that form an operating system is often part of the memory system 24. Magnetic disc or other mass data storage systems 40 for nonvolatile storage of information that may be altered are also often part of the memory system 24. Mass data storage systems 40 are well adapted for storage and retrieval of large amounts of data, but are too slow to permit their effective usage in many applications. Dynamic random access memories (“DRAM”) 42 allow much more rapid storage and retrieval of data and are frequently used as “system memory” 38 in which data and instructions are temporarily stored. However, DRAMs used as system memory 38 generally do not have access times that allow the processor 22 to operate at full speed. For example, a DRAM 42 may have a data access time on the order of 100 nanoseconds, while the processor 22 may be able to operate with a clock speed of several hundred megahertz. As a result, the processor 22 has to wait for many clock cycles before a request for data retrieval can be fulfilled by the DRAM 42.
For these reasons, and also because the data that the processor 22 needs most frequently often is a limited subset of the data stored in the DRAMs 42, a limited amount of high speed memory, known as a cache memory 44, is typically also included in the system memory 38. The cache memory 44 is more expensive and consumes more power than the DRAMs 42, but the cache memory 44 is also markedly faster. Typical cache memories 44 use static random access memories (“SRAM”) having data access times on the order of 10 nanoseconds or less. As a result of including the cache memory 44, the entire computer 20 operates much more rapidly than is possible without the cache memory 44. Cache memories 44 of different types and using different information exchange and storage protocols have been developed to try to optimize performance of the computer 20 for different applications.
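The size of this wait can be illustrated with a short calculation. The sketch below is only an illustration: the 100 nanosecond access time comes from the example above, while the 300 megahertz clock is an assumed value standing in for "several hundred megahertz," and the function name `wait_cycles` is hypothetical.

```python
import math


def wait_cycles(access_time_ns, clock_mhz):
    """Whole processor clock cycles spanned by one memory access.

    One clock period is 1000 / clock_mhz nanoseconds, so the number of
    cycles is access_time_ns * clock_mhz / 1000, rounded up.
    """
    return math.ceil(access_time_ns * clock_mhz / 1000)


# Assumed 300 MHz clock; access times are the orders of magnitude
# quoted in the text for DRAM and for SRAM cache.
dram_cycles = wait_cycles(100, 300)
sram_cycles = wait_cycles(10, 300)
print(dram_cycles, sram_cycles)
```

Under these assumed figures, a single DRAM access spans roughly ten times as many processor clock cycles as an access to a memory with a 10 nanosecond access time.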
One often-encountered problem occurs when the processor 22 accesses the cache memory 44 through the system bus 26. No other portion of the computer 20 may then use the system bus 26 to transfer data. As a result, the computer 20 is unable to carry out many other kinds of operations while the system bus 26 is transferring data between the cache memory 44 and the processor 22.
A first solution to this problem is to include a cache memory (not illustrated) in the processor 22 itself. This form of cache memory is also known as “L1” or level one cache memory. However, having a fixed size of L1 cache memory in the processor 22 does not allow the size of the L1 cache memory to be optimized for a particular type of computer 20.
A second solution to this problem is to include a cache memory (not illustrated) between the processor 22 and the system bus 26. This form of cache memory is known as a “look through” cache memory.
With any form of cache memory 44, data stored in the cache memory 44 also corresponds to data stored in the DRAMs 42. When the contents of the cache memory 44 or the DRAMs 42 are updated, corresponding data in the other of the cache memory 44 or the DRAMs 42 will differ from the updated data, but these data still need to correspond to each other. As a result, writing data to either the cache memory 44 or the DRAMs 42 necessitates either updating corresponding data stored in the other of the cache memory 44 or the DRAMs 42, or keeping track of invalid (out of date or stale) data stored in the other of the cache memory 44 or the DRAMs 42. Attempting to read data from system memory 38 that is not stored in the cache memory 44 is known as a “read miss,” while attempting to read data from the system memory 38 that is stored in the cache memory 44 is known as a “read hit.” In a read hit, data is read from the cache memory, thus allowing the microprocessor 22 to read data significantly faster than in a read miss, in which the data must be read from the DRAM 42. Attempting to overwrite updated information in the cache memory 44 before the corresponding data in the DRAM 42 can be updated is known as a “write miss,” and correctly writing new data to the cache memory 44 is known as a “write hit.”
One method for tracking data stored in the cache memory 44 is to use a tag memory 46. The tag memory 46 uses the low order address bits for a memory address to access high order address bits of the cache memory 44 that are stored in the tag memory 46. The stored address bits from the tag memory 46 are also compared to the high order address bits of the memory address. In the event of a match, a cache hit is indicated, and the read data is thus read from the cache memory 44. The tag memory 46 may also store data characterizing each storage location in the cache memory 44. One protocol for characterizing data stored in the cache memory 44 and DRAMs 42 (“snooping” the memories) is known as “MESI,” which is an acronym formed from Modified, Exclusive, Shared or Invalid. This protocol requires only two additional bits to be stored together with the high address bits in the tag memory 46. MESI allows ready determination of whether the data stored in the cache memory 44 have been modified, are exclusively stored in the cache memory 44, have been shared with the DRAMs 42 or are no longer valid data.
In order for the data from the tag memory 46 to be checked to determine when the data stored in the cache memory 44 is current, the data stored in the tag memory 46 must be transferred to the processor 22 in a procedure known as “snooping.” This snooping procedure requires that the system bus 26 be occupied during the time that the data are being accessed and transferred from the tag memory 46 to the processor 22. While data are being transferred on the system bus 26, the system bus 26 is not available for other operations, again reducing data bandwidth, i.e., inhibiting other operation of the computer 20 for one or more clock cycles. As a result, the computer 20 cannot operate as rapidly as might otherwise be possible.
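The tag comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions and not a description of any particular tag memory: the index width `INDEX_BITS`, the numeric encoding of the four MESI states, the class and method names, and the rule that an Invalid entry never produces a hit are all assumptions introduced here. Byte offsets within a cache line are ignored for simplicity.

```python
from enum import IntEnum


class Mesi(IntEnum):
    # Four states fit in two bits; this numeric encoding is an assumption.
    INVALID = 0
    SHARED = 1
    EXCLUSIVE = 2
    MODIFIED = 3


INDEX_BITS = 4  # hypothetical width: 16 tag entries


class TagMemory:
    def __init__(self):
        # Each entry holds (stored high order address bits, MESI state).
        self.entries = [(0, Mesi.INVALID)] * (1 << INDEX_BITS)

    @staticmethod
    def split(address):
        """Split an address into (low order index, high order tag bits)."""
        index = address & ((1 << INDEX_BITS) - 1)
        tag = address >> INDEX_BITS
        return index, tag

    def fill(self, address, state=Mesi.EXCLUSIVE):
        """Record that the line for this address now resides in the cache."""
        index, tag = self.split(address)
        self.entries[index] = (tag, state)

    def lookup(self, address):
        """Return (hit, state) for the entry selected by the low order bits."""
        index, tag = self.split(address)
        stored_tag, state = self.entries[index]
        # Hit when the stored high order bits match the address; treating
        # an INVALID entry as a miss is an assumption of this sketch.
        hit = stored_tag == tag and state != Mesi.INVALID
        return hit, state


tags = TagMemory()
tags.fill(0x123)
print(tags.lookup(0x123))  # same index, same high order bits
print(tags.lookup(0x223))  # same index, different high order bits
```

The second lookup selects the same tag entry as the first because the low order bits are identical, but the stored high order bits differ, so no hit is indicated.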
Therefore, there is a need for methods and systems whereby tag memory contents may be accessed by the processor without interfering with operation of at least some other portions of the computer.
In one aspect, the present invention includes a microprocessor having a system bus for exchanging data with a system memory, and a private bus for allowing the microprocessor to access a cache memory without using at least part of the system bus. The microprocessor reads data from, and writes data to, the cache memory through the private bus. Cache memory operations thus do not require use of the system bus, allowing other portions of the computer system to continue to function through the system bus.
According to another aspect of the invention, the address bus portion of the system bus is used to address the tag memory during the time that a burst transfer of data is occurring from either the system memory or the cache memory. It is possible to use the address bus in this manner because the address bus is normally idle during a burst data transfer. When addressed during a burst data transfer, the tag memory transfers tag data to the microprocessor through a dedicated tag data bus. The microprocessor is thus able to carry out tag snoops while cache data transfers are occurring. As a result, data transfer capability between the cache memory system and the microprocessor is not compromised by tag snoops.
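The overlap of tag snoops with a burst data transfer can be pictured with a simple cycle-by-cycle model. The sketch below is a simplified illustration only: the function name `schedule_burst`, the four-beat burst length, the assumption that the address bus carries the burst address in the first cycle only, and the assumption that tag data returns in the same cycle in which the tag memory is addressed are all hypothetical and are not timing details taken from this description.

```python
def schedule_burst(burst_address, burst_length, snoop_addresses):
    """Return one (address_bus, data_bus, tag_bus) tuple per clock cycle.

    The address bus carries the burst address in the first cycle only.
    In later cycles it would otherwise be idle, so pending snoop
    addresses are placed on it and answered on the tag data bus.
    """
    pending = list(snoop_addresses)
    timeline = []
    for beat in range(burst_length):
        if beat == 0:
            address_bus = ("burst", burst_address)
            tag_bus = None
        elif pending:
            snoop = pending.pop(0)
            address_bus = ("snoop", snoop)
            tag_bus = ("tag", snoop)  # simplification: same-cycle tag data
        else:
            address_bus = None
            tag_bus = None
        # The data bus is busy with burst data in every cycle.
        data_bus = ("data", burst_address + beat)
        timeline.append((address_bus, data_bus, tag_bus))
    return timeline


for cycle in schedule_burst(0x100, 4, [0x200, 0x300]):
    print(cycle)
```

In this model the data bus carries burst data in every cycle, while both snoops complete on the address bus and the tag data bus without delaying the burst.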