The present invention is related to multiprocessor computer systems, and more particularly to cache coherency protocols for multiprocessor computer systems.
A computer system can be broken into three basic blocks: a central processing unit (CPU), memory, and input/output (I/O) units. These blocks are interconnected by means of a bus. An input device such as a keyboard, mouse, disk drive, analog-to-digital converter, etc., is used to input instructions and data to the computer system via the I/O unit. These instructions and data can be stored in memory. The CPU retrieves the data stored in the memory and processes the data as directed by the stored instructions. The results can be stored back into memory or outputted via the I/O unit to an output device such as a printer, cathode-ray tube (CRT) display, digital-to-analog converter, LCD, etc.
In some computer systems multiple processors are utilized. This use of multiple processors allows various tasks or functions to be handled by more than one CPU so that the computing power of the overall system is enhanced. Theoretically, a computer system with "n" processors can perform "n" times as many tasks and therefore can be "n" times faster than a computer with a single processor. However, in order to use multiple processors, the most recent version of data needs to be locatable, and each processor needs to be provided with the most recent data when the data is needed to perform a task. This is referred to as "data coherency."
The multiple processors often contain a small amount of dedicated memory referred to as a cache or a cache memory. Caches are used to increase the speed of operation. In a processor having a cache, when information is called from main memory and used by the processor, the same information and its main memory address are also stored in the cache memory. The cache memory usually is a static random access memory (SRAM). As each new read or write command is issued, the system looks to the cache memory to see if the information exists. A comparison is made between a desired memory address and the memory addresses in the cache memory. If one of the addresses in the cache memory matches the desired address, then there is a "hit" (i.e., the information is available in the cache). The information is then accessed from the cache rather than from main memory so that access to main memory is not required. Thus, the command is processed much more rapidly. If the information is not available in the cache, the new data is copied from the main memory and stored in the cache for future use.
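The lookup described above can be sketched as follows. This is a minimal illustration only, assuming a fully associative cache modeled as a dictionary keyed by main-memory address; the names `SimpleCache`, `read`, `hits`, and `misses` are hypothetical and do not come from the specification.

```python
class SimpleCache:
    """Toy cache that stores data together with its main-memory address."""

    def __init__(self, main_memory):
        self.main_memory = main_memory  # dict: address -> value (assumed model)
        self.lines = {}                 # cached copies: address -> value
        self.hits = 0
        self.misses = 0

    def read(self, address):
        # Compare the desired address with the addresses held in the cache.
        if address in self.lines:
            self.hits += 1              # "hit": no main-memory access needed
            return self.lines[address]
        # Miss: copy the data from main memory and keep it for future use.
        self.misses += 1
        value = self.main_memory[address]
        self.lines[address] = value
        return value


memory = {0x10: 7, 0x20: 9}
cache = SimpleCache(memory)
first = cache.read(0x10)   # miss, fetched from main memory
second = cache.read(0x10)  # hit, served from the cache
```

The second read of the same address is served from the cache, which is the case the paragraph describes as being processed more rapidly.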
In any system employing a cache memory, and particularly in a system employing multiple cache memories, data from a given memory location can reside simultaneously in main memory and in one or more cache memories. However, the data in main memory and in cache memory may not always be the same. This may occur when a processor updates the data contained in its associated cache memory without updating the main memory and other cache memories, or when another bus master changes data in main memory without updating its copy in the processor cache memories. A bus master is any device that can issue read or write commands to main memory.
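The first situation above, in which a processor updates its own cache without updating main memory or the other caches, can be reproduced with a small sketch. It assumes two independent write-back caches with no coherency mechanism at all; the class `WriteBackCache` and the variable names are hypothetical.

```python
class WriteBackCache:
    """Toy cache with no coherency support (assumed for illustration)."""

    def __init__(self, main_memory):
        self.main_memory = main_memory  # shared dict: address -> value
        self.lines = {}                 # private copies: address -> value

    def read(self, address):
        if address not in self.lines:
            self.lines[address] = self.main_memory[address]
        return self.lines[address]

    def write(self, address, value):
        # Only the local copy changes; main memory and other caches do not.
        self.lines[address] = value


main_memory = {0x40: 1}
cache_a = WriteBackCache(main_memory)
cache_b = WriteBackCache(main_memory)

cache_a.read(0x40)          # both caches load the same location
cache_b.read(0x40)
cache_a.write(0x40, 2)      # processor A updates only its own cache

current_in_a = cache_a.read(0x40)
stale_in_b = cache_b.read(0x40)
stale_in_memory = main_memory[0x40]
```

After the write, cache A holds the current value while cache B and main memory still hold the old one, which is the inconsistency a coherency protocol is meant to prevent.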
Cache coherency is the term given to the problem of assuring that the contents of a cache memory and those of main memory for all caches in a multiple cache system are either identical or under tight enough control that stale and current data are not confused with each other. Cache coherency is also sometimes referred to as cache consistency or cache currency.
A cache coherency protocol is a method by which caches, processors, main memory, and alternate bus masters communicate with each other. The cache coherency protocol assures that the computer system maintains agreement between data stored in main memory and data stored in the caches on the same bus. In other words, the cache coherency protocol is used by a computer system to track data moving between the processors, main memory, and the various cache memories.
One such cache coherency protocol is referred to as the "MESI" protocol. According to the MESI protocol, four states are assigned to the data elements within the cache: modified, exclusive, shared, or invalid. The MESI protocol ensures that all references to a main-memory location retrieve the most recent value. However, main memory bandwidth is often limited in a multiprocessor system and may create a bottleneck; and the MESI protocol does not minimize the number of main-memory accesses. Thus, with a traditional MESI protocol, opportunities are missed to use a copy of data in one of the caches to avoid a main-memory access or to reduce the number of main-memory accesses.
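For illustration, the four MESI states can be modeled as a small state machine for a single cache line. The transitions below follow the commonly taught snooping form of MESI and are an assumption; this section names the four states but does not spell out the transitions. The event names (`local_read`, `local_write`, `snoop_read`, `snoop_write`) and the function `next_state` are hypothetical.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def next_state(state, event, others_have_copy=False):
    """Return the next state of one cache line (simplified textbook MESI)."""
    if event == "local_read":
        if state is State.INVALID:
            # Read miss: Shared if another cache holds the line, else Exclusive.
            return State.SHARED if others_have_copy else State.EXCLUSIVE
        return state                      # read hit keeps the current state
    if event == "local_write":
        return State.MODIFIED             # other copies must be invalidated
    if event == "snoop_read":
        # Another cache reads the line: a valid local copy becomes Shared.
        return State.INVALID if state is State.INVALID else State.SHARED
    if event == "snoop_write":
        return State.INVALID              # another cache takes ownership
    raise ValueError(f"unknown event: {event}")


line = State.INVALID
line = next_state(line, "local_read")     # read miss, no other copy
after_read = line
line = next_state(line, "local_write")    # silent upgrade to Modified
after_write = line
line = next_state(line, "snoop_read")     # another processor reads the line
after_snoop = line
```

The sketch tracks state only; it does not model the data transfers or the write-back to main memory that accompany some of these transitions.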
For these and other reasons, there is a need for the present invention.
Some embodiments of the invention include a multiprocessor computer system comprising a plurality of cache memories to store a plurality of cache lines and state information for each one of the cache lines. The state information comprises data representing a first state selected from the group consisting of a Shared-Update state, a Shared-Respond state and an Exclusive-Respond state. The multiprocessor computer system further comprises a plurality of processors with at least one cache memory associated with each one of the plurality of processors. The multiprocessor computer system further comprises a system memory shared by the plurality of processors, and at least one bus interconnecting the system memory with the plurality of cache memories and the plurality of processors.
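The arrangement summarized above can be pictured as a set of nested records. The sketch below is structural only: it names the three states listed in this paragraph but assigns them no behavior, because their semantics are not defined in this section. All class and field names are hypothetical, and the bus is represented only implicitly by the shared reference to system memory.

```python
from dataclasses import dataclass, field
from enum import Enum


class LineState(Enum):
    # State names taken from the summary; their behavior is not modeled here.
    SHARED_UPDATE = "Shared-Update"
    SHARED_RESPOND = "Shared-Respond"
    EXCLUSIVE_RESPOND = "Exclusive-Respond"


@dataclass
class CacheLine:
    data: int
    state: LineState                      # state information kept per line


@dataclass
class CacheMemory:
    lines: dict = field(default_factory=dict)   # address -> CacheLine


@dataclass
class Processor:
    caches: list                          # at least one CacheMemory each


@dataclass
class MultiprocessorSystem:
    system_memory: dict                   # shared by all processors
    processors: list


system = MultiprocessorSystem(
    system_memory={0x80: 5},
    processors=[Processor(caches=[CacheMemory()]) for _ in range(2)],
)
# Place one line in the first processor's cache with an example state.
system.processors[0].caches[0].lines[0x80] = CacheLine(
    data=5, state=LineState.SHARED_RESPOND
)
```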
Still other embodiments, aspects and advantages of the invention will become apparent by reference to the drawings and by reading the following detailed description.