1. Field of the Invention
The described invention relates to the field of cache coherency. In particular, the described invention relates to cache coherency of a multiprocessor computer system which has a highly pipelined bus.
2. Description of Related Art
Since the beginning of electronic computing, main memory access has been much slower than processor cycle times. Access time is the time between when a read is initially requested and when the desired data word arrives. Processor cycle time refers to the minimum time between successive instruction executions. The gap between memory access time and processor cycle times continues to widen with advances in semiconductor technology. Efficient mechanisms to bridge this gap are central to achieving high performance in future computer systems.
The conventional approach to bridging the gap between memory access time and processor cycle time has been to introduce a high-speed memory buffer, commonly known as a cache, between the processor and main memory. The idea of a cache memory dates back several decades and was implemented in early computer systems such as the IBM System/360 Model 85. Today, caches are ubiquitous in virtually every class of general purpose computer system. Very often, data stored within one cache memory is shared among the various processors or agents which form the computer system. The main purpose of a cache memory, of course, is to provide fast access time while reducing bus and memory traffic. A cache achieves this goal by taking advantage of the principles of spatial and temporal locality, that is, the tendency of a program to access addresses near those it has recently accessed and to reuse recently accessed data.
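The effect of spatial and temporal locality on a cache can be illustrated with a minimal sketch. The toy direct-mapped cache below is an assumption made purely for illustration: the class name, the sizes (4 lines of 4 words each), and the access patterns are hypothetical and are not taken from the described invention.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks only hits and misses (illustrative)."""

    def __init__(self, num_lines=4, words_per_line=4):
        self.num_lines = num_lines
        self.words_per_line = words_per_line
        self.tags = [None] * num_lines  # block number currently held by each line
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block = address // self.words_per_line  # which memory block holds the word
        index = block % self.num_lines          # which cache line that block maps to
        if self.tags[index] == block:
            self.hits += 1
        else:
            self.misses += 1
            self.tags[index] = block            # a miss fetches the whole block


def run(addresses):
    """Replay an address trace on a fresh cache and return (hits, misses)."""
    cache = DirectMappedCache()
    for address in addresses:
        cache.read(address)
    return cache.hits, cache.misses


# Spatial locality: neighbouring words share a block, so one miss serves four reads.
sequential = run(range(16))
# Temporal locality: the same word is reused, so only the first read misses.
repeated = run([0] * 16)
# No locality: each address maps to the same line but a different block.
scattered = run(range(0, 256, 16))

print(sequential, repeated, scattered)
```

In this sketch the sequential and repeated traces are served mostly from the cache, while the scattered trace misses on every access, which is why a cache reduces bus and memory traffic only when the access pattern exhibits locality.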
As semiconductor technology has continued to improve, the gap between memory access time and central processing unit (CPU) cycle time has widened to the extent that a need has arisen for a memory hierarchy which includes two or more intermediate cache levels. For example, two-level cache memory hierarchies often provide an adequate bridge between access time and CPU cycle time such that memory latency is dramatically reduced. In these types of computer systems the first-level, primary cache (i.e., L1) provides fast, local access to data, while the second-level cache (i.e., L2) provides greater data retention, thereby reducing bus and memory traffic.
Main memory is typically the final level of the hierarchy. Main memory satisfies the demands of caches and vector units and often serves as the interface for one or more peripheral devices. Most often, main memory consists of core memory or a dedicated data storage device such as a hard disk drive unit.
One of the problems that arises in computer systems that include a plurality of caching agents and a shared data cache memory hierarchy is the problem of cache coherency. Cache coherency refers to the problem wherein, due to the use of multiple or multi-level cache memories, data may be stored in more than one location in memory. By way of example, if a microprocessor is the only device in a computer system which operates on data stored in memory, and the cache is situated between the CPU and memory, there is little risk of the CPU using stale data. However, if other agents in the system share storage locations in the memory hierarchy, this creates an opportunity for copies of data to be inconsistent, or for other agents to read stale copies.
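The stale-copy problem can be illustrated with a minimal sketch of two writeback caches that share one memory and implement no coherency protocol at all. The class, the address 0x100, and the values are hypothetical and chosen only for illustration. The sketch shows the problem itself, not the cache protocol of the described invention.

```python
class WriteBackCache:
    """Toy writeback cache with no coherency mechanism (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory  # shared main memory, modelled as a dict
        self.lines = {}       # address -> locally cached value
        self.dirty = set()    # addresses modified locally but not yet written back

    def read(self, address):
        if address not in self.lines:  # miss: fetch the value from main memory
            self.lines[address] = self.memory[address]
        return self.lines[address]     # hit: return the local copy

    def write(self, address, value):
        self.lines[address] = value    # writeback policy: update the cache only
        self.dirty.add(address)

    def flush(self):
        for address in self.dirty:     # copy modified lines back to main memory
            self.memory[address] = self.lines[address]
        self.dirty.clear()


def demo():
    memory = {0x100: 1}
    cache_a = WriteBackCache(memory)
    cache_b = WriteBackCache(memory)

    cache_a.read(0x100)      # agent A caches the value 1
    cache_b.read(0x100)      # agent B caches the value 1
    cache_a.write(0x100, 2)  # agent A modifies only its own copy

    return {
        "agent_a": cache_a.read(0x100),  # sees its own new value
        "agent_b": cache_b.read(0x100),  # still sees the old, stale value
        "memory": memory[0x100],         # main memory is also out of date
    }


result = demo()
print(result)
```

After agent A's write, three copies of the same location exist and two of them are stale. Flushing cache A would update main memory, but agent B would still hold its old copy, so a coherency mechanism must also invalidate or update the copies held by other agents.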
Cache coherency is especially problematic in computer systems which employ multiple processors as well as other caching agents (e.g., input/output (I/O) devices). By way of example, a program running on multiple processors requires that copies of the same data be located in several cache memories. Thus, the overall performance of the computer system depends upon the ability to share data in a coherent manner.
As will be seen, the described invention provides a cache protocol for a computer system supporting high performance memory hierarchy with complete support for cache coherency. The cache protocol of the described invention supports multiple caching agents (e.g., microprocessors) executing concurrently, as well as writeback caching and multiple levels of cache memory. The cache protocol supports a highly-pipelined bus architecture which interconnects the various caching agents.