1. Field of the Invention
This invention relates to electronic circuits within a computer system. More specifically, this invention relates to systems and methods for invalidating cache memory within a computer system.
2. Description of the Related Art
Modern computer systems use a memory hierarchy to maximize their performance. This type of hierarchy exists so that the computer system can simultaneously store large amounts of data and provide rapid access to that data for the user. In current computer systems, the memory hierarchy is organized into three stages. The first memory stage is designed to permanently store large amounts of data, and is normally provided by a magnetic disk or CD-ROM. The second memory stage is usually called “main memory” and is normally provided by 16, 32, 64 or more megabytes of dynamic random access memory (DRAM). The second stage of memory holds data and program information while the computer system is turned on. However, the main memory area is normally erased once power is removed from the computer.
The third stage of memory is a cache memory system. The cache memory system normally resides between a host processor and the main memory. In some cases, such as with an Intel® Pentium® Pro processor, portions of the cache memory system are incorporated into the host processor. Most cache memory systems are made up of static RAM which has a very fast access time. Static RAMs with 15, 20 or 25 nanosecond access times or faster are currently available. In use, blocks of the main memory are copied into the static RAM of the cache memory. When the host processor requires data or program instructions from main memory, it normally first checks the cache memory for that data. If the requested data is found within the cache memory, then the processor retrieves the data without having to access the slower main memory. Because most processors access data segments that reside near each other sequentially, one block of cache memory may hold the required data for several processor requests.
One challenge in designing a cache memory system is the maintenance of cache coherency between the main memory and the cache memory. For example, when a block of data is updated in the cache memory, the same address in the main memory must also be eventually updated. One mechanism for providing cache coherency involves storing “valid” and “modified” flags with every address in the cache memory. These flags are usually single bits that help the computer system identify addresses in the cache memory which have been updated, but not written back to the main memory.
As one example, the processor may write a new piece of data into an address in the cache memory and then mark that address as both valid and modified by setting the valid and modified bits to a one. A memory controller within the computer system can then determine that this particular address in cache memory holds valid data from the processor that has been modified, but not written back to the main memory. Once the memory controller writes the data from the cache memory to the main memory, the modified bit is set to a zero to indicate that the data block at this location in cache has been written back to the main memory.
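The flag handling described above can be sketched in software as follows. This is only an illustrative model, not the circuit itself: the names `CacheEntry`, `processor_write` and `write_back`, and the use of a dictionary for main memory, are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    """One cache address with its two status bits (hypothetical model)."""
    data: int = 0
    valid: bool = False
    modified: bool = False


def processor_write(entry: CacheEntry, value: int) -> None:
    # A processor write marks the entry as both valid and modified.
    entry.data = value
    entry.valid = True
    entry.modified = True


def write_back(entry: CacheEntry, main_memory: dict, address: int) -> None:
    # The memory controller copies valid, modified data to main memory
    # and then clears the modified bit; the valid bit stays set.
    if entry.valid and entry.modified:
        main_memory[address] = entry.data
        entry.modified = False


main_memory = {}
entry = CacheEntry()
processor_write(entry, 0xAB)
needs_write_back = entry.valid and entry.modified  # state before write-back
write_back(entry, main_memory, address=0x100)
```

After the write-back, the entry remains valid, but its modified bit is zero, which indicates that the cache and main memory hold the same data at that location.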
Cache memory systems are normally organized into blocks, called cache lines, with each line typically containing 16, 32, 64 or 128 bytes of consecutive memory locations. A cache controller manages the cache memory and intercepts the read and write signals going to the main memory from the processor. For example, if the processor attempts a memory read operation, the cache controller first checks the desired memory address against a list of current addresses within the cache memory system to determine if the desired address has already been stored in the cache. If the memory address is found within the cache, the requested data is immediately sent from the fast cache memory to the processor, normally with zero wait states. If the cache memory does not contain data at the desired address, the cache controller reads the requested memory block from the main memory into a cache line in the cache memory system and passes the requested data to the CPU.
Although the CPU must wait for the cache controller to fill a cache line from the main memory, addresses located next to the requested address will be loaded into the cache system and thereby be available to the processor should it desire any of these addresses. As discussed above, many processors gather data from segments that are very near the previously read segment.
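The read path described above, including the benefit of loading neighboring addresses, might be modeled roughly as shown below. The class `SimpleCache`, the 16-byte line size and the hit and miss counters are assumptions made for illustration; the model has unlimited capacity and ignores replacement and wait states.

```python
LINE_SIZE = 16  # bytes per cache line; one of the typical sizes named above


class SimpleCache:
    """Hypothetical cache controller model with unlimited capacity."""

    def __init__(self, main_memory: bytes):
        self.main_memory = main_memory
        self.lines = {}   # line base address -> bytes held in that line
        self.hits = 0
        self.misses = 0

    def read(self, address: int) -> int:
        base = address - (address % LINE_SIZE)
        if base in self.lines:
            # Address already cached: serve it from the fast memory.
            self.hits += 1
        else:
            # Miss: fill a whole line, so neighboring addresses come along.
            self.misses += 1
            self.lines[base] = self.main_memory[base:base + LINE_SIZE]
        return self.lines[base][address - base]


memory = bytes(range(64))
cache = SimpleCache(memory)
values = [cache.read(a) for a in (20, 21, 22, 40)]
```

In this run, the read of address 20 misses and fills the line covering addresses 16 through 31, so the reads of addresses 21 and 22 hit; address 40 lies in a different line and misses.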
Various cache schemes have been designed to efficiently transfer data from the main memory to the processor. One such scheme, the set-associative cache, is known to manage memory requests from the processor efficiently. In this system, the cache controller can store a given memory block in any one of several cache lines. Each of these cache lines is known as a set. Thus, a cache system in which a memory block can be stored in either of two cache lines is called a two-way set-associative cache.
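A two-way placement of this kind could be sketched as follows, using the term “set” in the sense given above. The constants, the tag and index split, and the fallback replacement choice are all assumptions for illustration and are not taken from any particular cache design.

```python
NUM_SETS = 2      # two-way: a block may be stored in either of two lines
NUM_INDEXES = 4   # hypothetical number of line positions within each set
LINE_SIZE = 16    # bytes per cache line

# tags[s][i] holds the tag stored in set s at index i, or None if empty.
tags = [[None] * NUM_INDEXES for _ in range(NUM_SETS)]


def split_address(address):
    """Split a byte address into (tag, index)."""
    line_number = address // LINE_SIZE
    return line_number // NUM_INDEXES, line_number % NUM_INDEXES


def lookup(address):
    """Return the set holding the address, or None on a miss."""
    tag, index = split_address(address)
    for s in range(NUM_SETS):
        if tags[s][index] == tag:
            return s
    return None


def place(address):
    """Store the block in any free set at its index and return that set."""
    found = lookup(address)
    if found is not None:
        return found
    tag, index = split_address(address)
    for s in range(NUM_SETS):
        if tags[s][index] is None:
            tags[s][index] = tag
            return s
    tags[0][index] = tag  # both occupied: placeholder replacement choice
    return 0


set_a = place(0x000)  # tag 0, index 0
set_b = place(0x040)  # tag 1, index 0: same index, so it uses the other set
```

Both addresses map to the same index, yet both remain cached at once because each occupies a different set.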
As discussed above, the host processor marks specific addresses as valid or modified within the cache memory system in order to maintain cache coherency. In addition, it is sometimes necessary for the host processor to “flush” the cache by invalidating every address within the cache system. Some host processors, such as the Intel® Pentium® Pro processor, provide a FLUSH signal interface to the cache controller for just this purpose. However, prior to invalidating each address within the cache, the processor must first determine if any data needs to be written back to the main memory.
Data that has been modified, but not written back, is first copied to the main memory during a FLUSH cycle. As discussed above, the Intel® Pentium® and Intel® Pentium® Pro processors include a FLUSH signal on their host interface. If the FLUSH signal is asserted, the Pentium® or Pentium® Pro processor writes back all of the internal cache lines that have modified data to the main memory. In addition, each address in the cache memory system is invalidated. This operation puts all internal cache lines into the invalid state. This process is used, for example, when the system is shutting down to ensure that all data within the cache system has been written to the main memory. However, current systems do not perform these flush cycles efficiently, which slows overall computer performance.
Various systems have been designed in the prior art for invalidating the cache memory during a flush cycle. For example, some systems perform a flush cycle by reading the first address in the cache, determining whether that data has been modified, writing any modified data back to main memory, and then invalidating that address. The process then moves sequentially to the next address within the cache memory and repeats. When this type of flush process begins in computers with a set-associative cache memory system, each address in every set is analyzed individually to determine whether the data in that particular address has been modified, but not written back to the main memory. Because each set in the cache system is analyzed sequentially, these systems perform the FLUSH cycle very inefficiently.
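The sequential approach described above might be modeled as in the sketch below, which counts one check per address per set. The dictionary layout of each entry and the helper names are assumptions introduced for the example; actual hardware timing is not modeled.

```python
def make_cache(num_sets, num_indexes):
    """Build an empty cache; cache[s][i] is the entry in set s at index i."""
    return [[{"address": None, "data": None, "valid": False, "modified": False}
             for _ in range(num_indexes)]
            for _ in range(num_sets)]


def sequential_flush(cache, main_memory):
    """Visit every entry of every set one at a time; return the check count."""
    checks = 0
    for set_entries in cache:
        for entry in set_entries:
            checks += 1  # each address in each set is examined individually
            if entry["valid"] and entry["modified"]:
                main_memory[entry["address"]] = entry["data"]
            entry["valid"] = False
            entry["modified"] = False
    return checks


cache = make_cache(num_sets=2, num_indexes=4)
cache[0][1].update(address=0x10, data="x", valid=True, modified=True)
cache[1][3].update(address=0x70, data="y", valid=True, modified=False)
main_memory = {}
checks = sequential_flush(cache, main_memory)
```

With two sets of four addresses each, the flush performs eight separate checks even though only one entry actually requires a write-back.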
Due to the aforementioned limitations in prior systems, there is a need in the technology for a method and system for efficiently invalidating multiple addresses within a memory cache. What is also needed in the technology is a mechanism responsive to a signal from a computer processor for efficiently invalidating every address within the computer system’s cache memory.
One embodiment of the invention is a set-associative cache memory system in communication with a computer processor and main memory. The cache memory system includes cache tag memory and cache data memory. The cache memory system includes: a counter for outputting an address to be invalidated in the cache memory; an update circuit in communication with the cache tag memory for simultaneously determining whether any set in the set-associative cache memory corresponding to the address contains data that has not been written back to the main memory; and a write-back circuit responsive to the update circuit for writing data from the cache data memory to the main memory.
Another embodiment of the invention is a tag update system in a computer for invalidating memory locations in a set-associative cache memory, wherein the computer includes a cache memory, a main memory and a processor. The tag update system includes: a counter for selecting a first address to invalidate; a first flag associated with the first address and indicative of when the first address contains data that has been changed but not written back to the main memory of the computer; a second flag associated with the first address and indicative of whether the first address is valid; and a circuit in communication with the first flag and the second flag for simultaneously determining whether each set associated with the first address refers to an address containing valid and modified data.
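As a rough software analogy for the circuit described above, the two flags for every set at one address can be held as bit masks and combined in a single operation. The function name `dirty_set_mask` and the bit layout (bit s corresponds to set s) are assumptions for illustration; the embodiment itself is a hardware circuit.

```python
def dirty_set_mask(valid_bits: int, modified_bits: int) -> int:
    """Return a mask whose bit s is 1 when set s is both valid and modified.

    Bit s of each argument is the corresponding flag for set s at the
    address selected by the counter (hypothetical layout).
    """
    return valid_bits & modified_bits


valid_bits = 0b1011     # sets 0, 1 and 3 hold valid data
modified_bits = 0b0110  # sets 1 and 2 are flagged as modified
mask = dirty_set_mask(valid_bits, modified_bits)
sets_to_write_back = [s for s in range(4) if (mask >> s) & 1]
```

Here only set 1 is both valid and modified, so it is the only set that requires a write-back before the address is invalidated.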
Still another embodiment of the invention is a cache controller for a computer system, wherein the computer system has a processor, a set-associative cache memory and a main memory. The cache controller includes: a counter for outputting a first address to be invalidated in the cache memory; an update circuit in communication with the cache memory for simultaneously determining whether any set in the set-associative cache memory corresponding to the first address contains data that has not been written back to the main memory; and a write-back circuit responsive to the update circuit for writing data from the cache data memory to the main memory.
Yet another embodiment of the invention is within a computer system having a set-associative cache memory and a main memory. This embodiment is a process for invalidating addresses in the cache memory, including: a) outputting an address to be invalidated in the cache; b) interrogating each set corresponding to the address simultaneously to determine whether any set contains data that needs to be written to the main memory prior to invalidating the address; c) selecting a first set of data from a set that needs to be written back to the main memory; d) writing the selected set of data to the main memory; and e) invalidating the address in the cache memory.
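Step c) above, selecting a first set among those that need a write-back, could be sketched as a priority selection over a bit mask. The lowest-numbered-first ordering and the function name `first_dirty_set` are assumptions for illustration; the text does not specify the selection order.

```python
def first_dirty_set(mask: int):
    """Return the lowest-numbered set whose bit is 1 in mask, or None.

    Bit s of mask is 1 when set s holds data that must be written back.
    Choosing the lowest-numbered set first is an assumed policy.
    """
    if mask == 0:
        return None
    lowest_bit = mask & -mask          # isolate the least significant 1 bit
    return lowest_bit.bit_length() - 1


selected = first_dirty_set(0b0110)     # sets 1 and 2 need a write-back
remaining = 0b0110 & ~(1 << selected)  # clear the selected set, then repeat
next_selected = first_dirty_set(remaining)
```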
A further embodiment of the invention is a process in a computer having a processor, main memory and set-associative cache memory. The process performs a flush operation to invalidate each address in the cache memory, including: a) generating an address to invalidate in the cache memory; b) determining whether the data stored at the address in the cache memory has not been written back to the main memory by simultaneously determining whether each set corresponding to the address has not been written back to the main memory; c) writing to the main memory the data from each set that has not been written back; d) invalidating the address once the data is written back to the main memory; e) generating a next address to invalidate; and f) repeating b) through e) until all addresses in the cache memory have been invalidated.
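Steps a) through f) above can be approximated in software as shown in the following sketch. It is a simplified model under stated assumptions: the flags for all sets at one address are packed into a bit mask, a single bitwise AND stands in for the simultaneous determination, and the function name, data layout and line-address formula are hypothetical.

```python
def flush_cache(valid, modified, tags, data, main_memory):
    """Flush a set-associative cache model; return the number of tag reads.

    valid[i] and modified[i] are bit masks over the sets at index i.
    tags[i][s] and data[i][s] describe the line in set s at index i.
    """
    num_indexes = len(valid)
    tag_reads = 0
    for index in range(num_indexes):            # a), e), f): counter steps
        tag_reads += 1                          # one read covers every set
        dirty = valid[index] & modified[index]  # b): all sets at once
        s = 0
        while dirty:                            # c): write back dirty sets
            if dirty & 1:
                line_address = tags[index][s] * num_indexes + index
                main_memory[line_address] = data[index][s]
            dirty >>= 1
            s += 1
        valid[index] = 0                        # d): invalidate the address
        modified[index] = 0
    return tag_reads


valid = [0b11, 0b01, 0b00, 0b10]
modified = [0b10, 0b01, 0b00, 0b00]
tags = [[0, 1], [2, 0], [0, 0], [0, 3]]
data = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]
main_memory = {}
tag_reads = flush_cache(valid, modified, tags, data, main_memory)
```

In this model, a cache with four addresses and two sets needs four flag examinations, one per address, rather than one per address per set, and only the two lines that are both valid and modified are written back.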
An additional embodiment of the invention is a cache controller for a computer system, wherein the computer system has a processor, a set-associative cache memory and a main memory. This embodiment includes: means for outputting a first address to be invalidated in the cache memory; means in communication with the cache memory for simultaneously determining whether any set in the set-associative cache memory corresponding to the first address contains data that has not been written back to the main memory; and means responsive to the update circuit for writing data from the cache data memory to said main memory.