The invention relates to computers and processor systems. More particularly, the invention relates to recovery from address channel errors in a multiprocessor computing system having cache memory.
In a computer system, the interface between a processor and memory is critically important to the performance of the system. Because fast memory is very expensive, memory in the amount needed to support a processor is generally much slower than the processor. In order to bridge the gap between fast processor cycle times and slow memory access times, cache memory was developed. A cache is a small amount of very fast, zero wait state memory that is used to store a copy of frequently accessed data and instructions from main memory. The microprocessor can operate out of this very fast memory and thereby reduce the number of wait states that must be interposed during memory accesses. When the processor requests data from memory and the data resides in the cache, then a cache read "hit" takes place, and the data from the memory access can be returned to the processor from the cache without incurring wait states. If the data is not in the cache, then a cache read "miss" takes place, and the memory request is forwarded to the system and the data is retrieved from main memory, as would normally be done if the cache did not exist. On a cache miss, the data that is retrieved from the main memory is provided to the processor and is also written into the cache due to the statistical likelihood that this data will be requested again by the processor.
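By way of illustration only, the read hit and read miss behavior described above may be sketched as follows. The class name `ReadCache`, the use of a dictionary to stand in for main memory, and the absence of any capacity limit or replacement policy are assumptions made for brevity and are not part of the described system.

```python
class ReadCache:
    """Toy cache in front of a slow main memory (a dict of address -> value)."""

    def __init__(self, main_memory):
        self.main_memory = main_memory
        self.lines = {}   # address -> cached copy of the data
        self.hits = 0
        self.misses = 0

    def read(self, address):
        # Read hit: the data is returned from the cache.
        if address in self.lines:
            self.hits += 1
            return self.lines[address]
        # Read miss: fetch from main memory and also fill the cache,
        # since the same data is likely to be requested again.
        self.misses += 1
        value = self.main_memory[address]
        self.lines[address] = value
        return value


memory = {0x10: "a", 0x20: "b"}
cache = ReadCache(memory)
cache.read(0x10)   # miss, line is filled from main memory
cache.read(0x10)   # hit, served from the cache
```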
The individual data elements stored in a cache memory are referred to as "lines." Each line of a cache is meant to correspond to one addressable unit of data in the main memory. A cache line thus comprises data and is associated with a main memory address in some way. Schemes for associating a main memory address with a line of cache data include direct mapping, full association and set association, all of which are well known in the art.
The presence of caches should be transparent to the overall system, and various protocols are implemented to achieve such transparency, including write-through and write-back protocols. In a write-through action, data to be stored is written to a cache line and to the main memory at the same time. In a write-back action, data to be stored is written to the cache and only written to the main memory later when the line in the cache needs to be displaced for a more recent line of data or when another processor requires the cached line. Because lines may be written to a cache exclusively in a write-back protocol, precautions must be taken to manage the status of data in a write-back cache, as described in greater detail below.
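The difference between the two protocols may be illustrated with the following minimal sketch. The class `WritePolicyCache`, its `write_back` flag and its `evict` method are hypothetical names introduced for this example; displacement is triggered explicitly here rather than by a replacement policy.

```python
class WritePolicyCache:
    """Toy cache that follows either a write-through or a write-back policy."""

    def __init__(self, main_memory, write_back):
        self.main_memory = main_memory   # dict of address -> value
        self.write_back = write_back
        self.lines = {}                  # address -> cached value
        self.dirty = set()               # addresses not yet written to memory

    def write(self, address, value):
        self.lines[address] = value
        if self.write_back:
            # Write-back: main memory is updated only later.
            self.dirty.add(address)
        else:
            # Write-through: cache and main memory are updated together.
            self.main_memory[address] = value

    def evict(self, address):
        # A displaced dirty line must be written to main memory first.
        if address in self.dirty:
            self.main_memory[address] = self.lines[address]
            self.dirty.discard(address)
        self.lines.pop(address, None)
```

In this sketch, a write-back cache leaves main memory stale until `evict` is called, which is why the status of each line must be tracked.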
Cache management is generally performed by a device referred to as a cache controller. A principal cache management objective is the preservation of cache coherency. In computer systems where independent bus masters can access memory, there is a possibility that a bus master, such as another processor, network interface, disk interface, or video graphics card, might alter the contents of a main memory location that is duplicated in the cache. When this occurs, the cache is said to hold stale or invalid data. In order to maintain cache coherency, it is necessary for the cache controller to monitor the system bus when the processor does not own the system bus to see if another bus master accesses main memory. This method of monitoring the bus is referred to as "snooping."
The cache controller must monitor the system bus during memory reads by a bus master in a write-back cache design because of the possibility that a previous processor write may have altered a copy of data in the cache that has not been updated in main memory. This is referred to as read snooping. On a "read snoop hit," where the cache contains data not yet updated in main memory, the cache controller generally provides the respective data to main memory, and the requesting bus master generally reads this data en route from the cache controller to main memory, this operation being referred to as "snarfing." The cache controller must also monitor the system bus during memory writes because another bus master may write to or alter a memory location that resides in the cache. This is referred to as write snooping. On a "write snoop hit," the cache entry is either marked invalid in the cache directory by the cache controller, signifying that this entry is no longer correct, or the cache is updated along with main memory. Therefore, when another bus master reads or writes to main memory in a write-back cache design, or writes to main memory in a write-through cache design, the cache controller must latch the system address and perform a cache look-up to see if the main memory location being accessed also resides in the cache. If a copy of the data from this location does reside in the cache, then the cache controller takes the appropriate action depending on whether a read or write snoop hit has occurred. This prevents incoherent data from being stored in main memory and the cache, thereby preserving cache coherency.
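The read and write snooping actions described above may be sketched, under simplifying assumptions, as follows. The class `SnoopingCache` and its method names are hypothetical. On a write snoop hit, this sketch takes the invalidation option; updating the cached copy along with main memory is the other option described above.

```python
class SnoopingCache:
    """Toy write-back cache whose controller snoops other bus masters."""

    def __init__(self, main_memory):
        self.main_memory = main_memory   # dict of address -> value
        self.lines = {}                  # address -> cached value
        self.dirty = set()               # lines newer than main memory

    def processor_write(self, address, value):
        self.lines[address] = value
        self.dirty.add(address)

    def snoop_read(self, address):
        """Another bus master reads `address`.

        On a read snoop hit to a line not yet updated in main memory, the
        data is provided to main memory and returned so that the requester
        can pick it up en route ("snarfing"). Otherwise None is returned.
        """
        if address in self.dirty:
            value = self.lines[address]
            self.main_memory[address] = value
            self.dirty.discard(address)
            return value
        return None

    def snoop_write(self, address):
        """Another bus master writes `address`; a hit invalidates our copy."""
        hit = address in self.lines
        if hit:
            del self.lines[address]
            self.dirty.discard(address)
        return hit
```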
Another consideration in the preservation of cache coherency is the handling of processor writes to memory. When the processor writes to main memory, the memory location must be checked to determine if a copy of the data from this location also resides in the cache. If a processor write hit occurs in a write-back cache design, then the cache location is updated with the new data and main memory may be updated with the new data at a later time or when the need arises. In a write-through cache, the main memory location is generally updated in conjunction with the cache location on a processor write hit. If a processor write miss occurs, the cache controller may ignore the write miss in a write-through cache design because the cache is unaffected in this design. Alternatively, the cache controller may perform a "write-allocate" whereby the cache controller allocates a new line in a cache in addition to passing the data to the main memory. In a write-back cache design, the cache controller generally allocates a new line in the cache when a processor write miss occurs. This generally involves reading the remaining entries to fill the line from main memory before or jointly with providing the write data to the cache. Main memory is updated at a later time should the need arise.
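As a hedged illustration of line allocation on a processor write miss in a write-back design, the following sketch fills the remaining entries of the line from main memory and then applies the write to the cache only. The function name `write_allocate`, the four-word line size `LINE_WORDS`, and the default of 0 for unmapped memory words are assumptions made for this example.

```python
LINE_WORDS = 4   # assumed number of words per cache line


def write_allocate(cache_lines, main_memory, address, value):
    """Handle a processor write to `address` in a write-back cache.

    cache_lines maps a line base address to a list of LINE_WORDS words.
    Returns the base address of the line that now holds the write.
    """
    base = address - (address % LINE_WORDS)
    if base not in cache_lines:
        # Write miss: allocate a line and fill it from main memory.
        cache_lines[base] = [
            main_memory.get(base + offset, 0) for offset in range(LINE_WORDS)
        ]
    # Apply the write in the cache; main memory is updated later.
    cache_lines[base][address - base] = value
    return base
```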
Caches may be designed independently of the microprocessor, in which case the cache is placed on the local bus of the microprocessor and interfaced between the processor and the system bus during the design of the computer system. However, as the density of transistors on a processor chip has increased, processors may be designed with one or more internal caches in order to further decrease memory access times. The internal cache used in these processors is generally small, an exemplary size being 8 k (8192 bytes). In computer systems that utilize processors with one or more internal caches, an external cache is often added to the system to further improve memory access time. The external cache is generally much larger than the internal cache(s), and, when used in conjunction with the internal cache(s), provides a greater overall hit rate than the internal cache(s) would provide alone.
In systems that incorporate multiple levels of caches, when the processor requests data from memory, the internal or first level cache is first checked to see if a copy of the data resides there. If so, then a first level cache hit occurs, and the first level cache provides the appropriate data to the processor. If a first level cache miss occurs, the second level cache is checked. If a second level cache hit occurs, then the data is provided from the second level cache to the processor. If a second level cache miss occurs, then the data is retrieved from main memory. This process continues through higher levels of caches, if present. Write operations are similar, with mixing and matching of the operations discussed above being possible.
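The level-by-level lookup described above may be sketched as follows. The function `hierarchical_read` is a hypothetical name, and the choice to copy the data into the faster levels on a hit at a slower level is an assumption of this sketch rather than something stated above.

```python
def hierarchical_read(levels, main_memory, address):
    """Read `address` through a list of cache levels, first level first.

    Each level is a dict of address -> value. Returns (value, source),
    where source is "L1", "L2", ... or "memory".
    """
    for index, level in enumerate(levels):
        if address in level:
            value = level[address]
            # Assumed fill policy: copy the line into the faster levels.
            for faster in levels[:index]:
                faster[address] = value
            return value, f"L{index + 1}"
    # Every level missed: retrieve the data from main memory.
    value = main_memory[address]
    for level in levels:
        level[address] = value
    return value, "memory"
```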
In many instances where multilevel cache hierarchies exist with multiple processors, a property referred to as multilevel inclusion is desired in the hierarchy. Multilevel inclusion provides that a second level (e.g., external) cache is guaranteed to have a copy of what is inside a first level (e.g., internal) cache. In this case, the second level cache holds a superset of the first level cache. Multilevel inclusion obviates the need for all levels of caches to snoop the system bus and thus enables the caches to perform more efficiently. Multilevel inclusion is most popular in multiprocessor systems, where the higher level caches can shield the lower level caches from cache coherency problems and thereby prevent unnecessary snoops that would otherwise occur in the lower level caches if multilevel inclusion were not implemented.
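The filtering effect of multilevel inclusion may be illustrated with the short sketch below. Both function names are hypothetical, and the assumption that the first level cache is probed whenever the second level cache hits is a simplification; an actual design may avoid even that probe.

```python
def inclusion_holds(l1_addresses, l2_addresses):
    """True when every line cached in L1 is also cached in L2."""
    return set(l1_addresses) <= set(l2_addresses)


def levels_probed_on_snoop(l2_addresses, address):
    """List the cache levels that must look up a snooped address.

    With inclusion, an L2 miss proves an L1 miss, so L1 is not probed.
    """
    if address in l2_addresses:
        return ["L2", "L1"]
    return ["L2"]
```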
In a multiprocessor system where each processor utilizes a multilevel cache system with inclusion, there may be, for example, a Level 1 (L1) write-through cache associated with each processor and a larger, slower Level 2 (L2) write-back cache, which is still much faster than the main memory. The L2 and L1 caches utilize the MESI (pronounced "messy") protocol for managing the state of each cache line as follows: For each cache line, there is an M, E, S, or I state that indicates the current state of the cache line in the system. According to this well-known protocol, the Exclusive (E) bit indicates that the line only exists in this cache, the Shared (S) bit indicates that the line can be shared by multiple users at one time, the Invalid (I) bit indicates that the line is not available in the cache, and the Modified (M) bit indicates that the line has been changed or modified since it was first written to the cache. This management system improves system performance because unmodified lines need not be written back to the system's main memory.
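For illustration, the four line states and the write-back rule noted above may be represented as follows. The enumeration `MESI` and the helper `needs_write_back` are names introduced for this sketch only.

```python
from enum import Enum


class MESI(Enum):
    MODIFIED = "M"    # changed since it was first written to the cache
    EXCLUSIVE = "E"   # exists only in this cache
    SHARED = "S"      # may be held by multiple users at one time
    INVALID = "I"     # not available in the cache


def needs_write_back(state):
    """Only a Modified line must be written back to main memory."""
    return state is MESI.MODIFIED
```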
The L1 cache does not require the Exclusive (E) bit in systems where it is the L2 cache's responsibility to manage line MESI state changes. Thus, the L1 cache may be said to implement the MSI protocol. In these systems, a line marked Exclusive (E) in L2 would be marked Shared (S) in L1. If another processor wants to share a copy of this line, the L2 cache would indicate via its snoop response that the line is Shared (S) and change the state of the L2 copy of the line to Shared (S). Because the L1 line state did not need to be changed, the L1 cache did not need to be involved in the line state change, thus improving performance.
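The relationship between the L2 and L1 line states may be sketched as follows. Both function names are hypothetical. Only the Exclusive (E) to Shared (S) mapping is taken from the description above; passing the M, S and I states through unchanged is an assumption of this sketch.

```python
def l1_state_for(l2_state):
    """Map an L2 MESI state letter to the MSI state letter held in L1."""
    # Exclusive in L2 appears as Shared in L1; other states are assumed
    # to carry over unchanged.
    return "S" if l2_state == "E" else l2_state


def share_request_on_exclusive(l2_state, l1_state):
    """Another processor asks to share a line that L2 holds Exclusive.

    Returns (snoop_response, new_l2_state, new_l1_state). The L1 state is
    returned untouched, showing that L1 is not involved in the change.
    """
    if l2_state != "E":
        raise ValueError("sketch only covers a line held Exclusive in L2")
    return "S", "S", l1_state
```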
In a multiprocessor environment, snoop latency may be fixed, which means that when a processor makes a storage request on the system bus, all other processors, or bus devices, must respond within a fixed period of time. In the event the storage request is a line read, other processors or devices which have a copy of the line are allowed to respond only with Shared (S) or Modified (M). A processor is not allowed to keep exclusive ownership of the line in this case. If the snoop response is Modified (M), the processor owning the current state of the line must provide the current copy to the requester, and change the state of its copy of the line to Shared (S) or Invalid (I), depending on the snoopy bus protocol. In systems where the L1 cache cannot be snooped, or the L1 cache snoop response cannot meet the fixed response time requirement of the snoopy bus, the L2 cache must mark a line as Modified (M) prior to any processor store to that line.
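The snoop responses permitted on a line read may be sketched as follows. The function `respond_to_line_read` and its parameter `modified_goes_to` are hypothetical. Returning no response from a cache that holds the line Invalid is an assumption, since the description above addresses only devices that have a copy of the line.

```python
def respond_to_line_read(state, modified_goes_to="S"):
    """Snoop response of a cache holding a line in `state` to a line read.

    Returns (snoop_response, new_state, supplies_data). The parameter
    modified_goes_to selects Shared or Invalid as the state after a
    Modified line is supplied, depending on the snoopy bus protocol.
    """
    if modified_goes_to not in ("S", "I"):
        raise ValueError("a Modified line may only move to S or I")
    if state == "M":
        # The owner provides the current copy to the requester.
        return "M", modified_goes_to, True
    if state in ("E", "S"):
        # Exclusive ownership may not be kept; the line becomes Shared.
        return "S", "S", False
    # Assumed: a cache without a valid copy gives no response.
    return None, "I", False
```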
Directory based cache coherency protocols are an alternative to snoopy protocols. In a directory based coherency scheme, the system typically contains a single directory having one entry for every address in main memory. Each directory entry identifies the ownership and data state information of each line of main memory. That is, the directory contents are tags. The data states tracked in a directory coherency system may be similar to the data states tracked in a snoopy system (e.g., MESI-based states or something similar).
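A coherency directory of the kind described above may be sketched as follows. The names `DirectoryEntry` and `Directory` are hypothetical. For brevity the sketch stores entries sparsely and treats an absent entry as Invalid with no owners, whereas the description above contemplates one entry for every address in main memory.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryEntry:
    state: str = "I"                           # MESI-like state letter
    owners: set = field(default_factory=set)   # caches holding the line


class Directory:
    """Tracks ownership and data state for lines of main memory."""

    def __init__(self):
        self.entries = {}   # line address -> DirectoryEntry

    def record(self, address, state, owner):
        entry = self.entries.setdefault(address, DirectoryEntry())
        if state in ("M", "E"):
            # Modified and Exclusive imply a single owner.
            entry.owners = {owner}
        else:
            entry.owners.add(owner)
        entry.state = state

    def lookup(self, address):
        return self.entries.get(address, DirectoryEntry())
```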
An objective of the present invention is to exploit cache coherency information in a multiprocessor computing system in reaction to an address error.
One of the most serious types of errors in a computer system is an address error on a computer bus. A typical computer bus contains a separate address bus, as well as a data bus and control lines, and as address buses are designed in greater widths, the specter of address errors becomes more threatening. An address error is serious because a bus agent that reports the error must be assumed to have no idea what the true memory address target was. The result might be a memory controller providing the wrong data, or worse yet, writing the right data to the wrong memory location. When an address error occurs in a multiprocessor system, one or more processors may have inconsistent views of memory. In order to avoid the continued processing of corrupted data, most computers respond to an address error by generating a fatal error, which in turn causes an immediate failure of the operating system. This is disadvantageous because it does not permit graceful shutdown of applications.
Non-graceful shutdowns cause significant increases in recovery time for applications such as databases. In particular, non-graceful shutdowns may occur before the system has an opportunity to flush open buffers and close open files. As a result, storage files that reside on I/O (input/output) devices may be corrupt and inconsistent. Returning a computer file system to a known good state may require a great deal of time and effort. Typically, archival backups and/or update logs of the storage system are needed. Further, some data that has been entered since the last backup needs to be recreated, if that is even possible.
An advantage of the present invention is a higher likelihood of either avoiding the need for bringing the system down as a result of an address error or providing a window of opportunity in which to conduct a more orderly shutdown of critical operations, in response to an address error.
This invention is based upon the recognition that the property of inclusion, offered by an inclusive cache, coherency directory or other coherency filter provides unique opportunities for error recovery. In traditional bus-based MESI coherency systems, only the owner of a cache line knows that he owns it. Thus, if the owner dies or his connection to the system fails, the correct state of memory is totally unknown. The data structures used by inclusive systems to track inclusion (address tags for caches, and directory entries for directories) contain redundant information about the ownership of lines and, in some cases, up-to-date copies of modified data. The present invention is sometimes capable of providing enough data about the state of memory to allow applications to recover from address errors, such as parity errors. The information provided may be sufficient to allow a complete recovery, but more often, the information provided will allow the system to avoid corrupt data and run long enough to permit graceful shutdown of mission critical applications.
According to a method of the present invention, an address error is detected on a local channel, such as a local bus. The coherency states of one or more lines of cache memory associated with the local channel are then read, and actions are taken in response. Reading of coherency states ranges from a complete and active interrogation of all cache lines, to a selective and passive interrogation, such as in responding to snoop requests. If the data state consistency is unknown, such as when the MESI state is Modified (M) or Exclusive (E), then the corresponding data in main memory is poisoned. Poisoning may be accomplished by writing a detectable but unrecoverable error pattern in the main memory. Alternatively, the same effect may be accomplished by signaling a hard error on the system bus. If the data state consistency of an interrogated cache line is Shared (S) or Invalid (I), the line may be ignored or the line marked invalid. If the state of the cached line is valid and consistent, such as the "Modified uncached" (Mu) state in a MuMESI protocol, then the line may be written to main memory or provided to a snoop requester.
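By way of a non-limiting sketch, the per-line actions of the method may be expressed as follows for the case of a complete, active interrogation. The function `recover_from_address_error`, the `POISON` marker standing in for a detectable but unrecoverable error pattern, the action labels, and the representation of tracked lines as a dictionary are all assumptions made for this example.

```python
POISON = object()   # stand-in for a detectable, unrecoverable error pattern


def recover_from_address_error(tracked_lines, main_memory):
    """Interrogate tracked lines after an address error on a local channel.

    tracked_lines maps address -> (state, data), where data is the tracked
    copy of the line or None if no copy is held. Returns a dict mapping
    each address to the action taken.
    """
    actions = {}
    for address, (state, data) in tracked_lines.items():
        if state in ("M", "E"):
            # Consistency unknown: poison the corresponding main memory.
            main_memory[address] = POISON
            actions[address] = "poisoned"
        elif state == "Mu":
            # Valid and consistent copy: write it to main memory.
            main_memory[address] = data
            actions[address] = "written back"
        elif state in ("S", "I"):
            # Main memory is already current: mark the line invalid.
            actions[address] = "invalidated"
        else:
            raise ValueError(f"unknown coherency state: {state!r}")
    return actions
```

In this sketch, a later read of a poisoned location would return `POISON`, allowing the consumer to detect the loss rather than silently process stale data.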