This invention relates to computer systems that include multiple processors, and is more particularly concerned with management of cache memories in such systems.
Computer systems which employ multiple processors are well known. One example of a commercially available multi-processor computer system is the pSeries line of systems available from the International Business Machines Corporation, which is the assignee hereof.
FIG. 1 is a simplified block diagram showing the architecture of a typical multi-processor computer system. In the example shown in FIG. 1, reference numeral 10 generally indicates the computer system, which includes a plurality (e.g., m) of multi-chip modules (MCMs) 12 (also indicated as MCM0 through MCMm-1 in the drawing). Each multi-chip module 12 includes a plurality (e.g., n) of processors 14, of which only the processors indicated as P0 through Pn-1 of MCM0 are shown. The computer system 10 also includes a main memory 16 constituted by a plurality of memory modules 18. One or more of the memory modules 18 and/or portions thereof may be associated with each of the processors 14.
An interconnection network 20 provides connections among the multi-chip modules 12 and the memory modules 18. (To simplify the drawing, the connections among the processors 14 of MCM0 are not shown.)
Each processor 14 also has a cache memory 22 associated therewith, included on the corresponding multi-chip module 12. As is familiar to those who are skilled in the art, cache memories are relatively small, high-speed buffers located near the processors that they serve, and are used to store copies of data that has recently been used by a processor and/or is likely to be used by the processor in the near future. Use of cache memory can frequently reduce the data access time that would otherwise be required for access to main memory.
When cache memories are used in multi-processor computer systems, the well-known problem of "cache coherence" arises. In essence, cache coherence deals with the situation in which data held in one cache is changed, thereby requiring steps to be taken to assure that inconsistent copies of the data are not present in other caches. A number of approaches to securing cache coherence are surveyed in Tomasevic et al., "Hardware Approaches to Cache Coherence in Shared-Memory Multiprocessors, Part 1", IEEE Micro, Volume 14, Number 5, October 1994, pages 52-59. Many known solutions can be categorized as either directory-based or broadcast (also known as "snoopy") solutions. In general, in directory-based solutions, a directory is associated with each segment of main memory to keep track of corresponding copies of data in the various cache memories. A disadvantage of this type of solution is that there can be significant waiting time while the directory for the relevant memory segment is accessed. In addition, the total size of the directories may be quite large.
In broadcast approaches, messages relating to memory operations are broadcast on a network that all cache controllers monitor ("snoop" on); the cache controllers then take appropriate action to assure cache coherence. Responses are then sent to confirm that the appropriate actions have been taken by each cache controller. However, particularly in systems having large numbers of processors and caches, there can be significant delays involved in the broadcast/response process.
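The broadcast/response pattern described above can be illustrated with a minimal sketch. It assumes a write-invalidate policy as the "appropriate action", which is only one possibility; the class name SnoopingCache and the function broadcast_write are hypothetical and do not come from any particular system.

```python
class SnoopingCache:
    """One cache controller on a shared broadcast network (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.lines = {}  # address -> cached value

    def snoop(self, address):
        """React to a broadcast write notice: drop any stale copy, then acknowledge."""
        self.lines.pop(address, None)
        return True  # response confirming that the action was taken


def broadcast_write(writer, caches, address, value):
    """Write locally, notify every other cache, and count the responses."""
    writer.lines[address] = value
    acks = [cache.snoop(address) for cache in caches if cache is not writer]
    return len(acks)


# Four caches; two of them hold a copy that becomes stale after the write.
caches = [SnoopingCache(f"P{i}") for i in range(4)]
caches[1].lines[0x40] = "old"
caches[2].lines[0x40] = "old"
ack_count = broadcast_write(caches[0], caches, 0x40, "new")
```

In this sketch the writer must collect one response from every other cache, so the number of messages grows with the number of caches, which is the source of the delay noted above.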
It would be desirable to provide a technique which avoids the disadvantages of prior art approaches to maintaining cache coherence.
According to a first aspect of the invention, a method of managing cache memories is provided in a computer system that includes a first plurality of processors and a like plurality of cache memories, wherein each cache memory is associated with one of the processors. The method according to this aspect of the invention includes defining second and third pluralities of the processors as mutually exclusive subsets of the first plurality of processors, storing a respective first directory in association with each of the processors of the second plurality of processors, wherein the first directory associated with each processor of the second plurality of processors indicates contents of the cache memories associated with the other processors of the second plurality of processors, and storing a respective second directory in association with each of the processors of the third plurality of processors, wherein the second directory associated with each processor of the third plurality of processors indicates contents of the cache memories associated with the other processors of the third plurality of processors.
The first directories and the second directories may indicate all contents of the respective cache memories or may only indicate modifications of the data stored in the cache memories relative to the data stored in main memory.
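As an illustration of the first aspect, the following minimal sketch builds one directory per processor, where each directory lists only the cache contents of the other processors in the same subset. The function name build_directories, the dictionary-based data layout, and the example addresses are assumptions made for illustration; the sketch records all cache contents rather than only modifications.

```python
def build_directories(groups, caches):
    """Build a per-processor directory of the cache contents of its group peers.

    groups: list of lists of processor ids (mutually exclusive subsets)
    caches: dict mapping processor id -> dict of address -> value
    Returns a dict mapping processor id -> dict of address -> set of peer ids.
    """
    directories = {}
    for group in groups:
        for processor in group:
            directory = {}
            for peer in group:
                if peer == processor:
                    continue  # a directory covers only the *other* processors
                for address in caches[peer]:
                    directory.setdefault(address, set()).add(peer)
            directories[processor] = directory
    return directories


# Two mutually exclusive subsets of four processors (hypothetical example).
groups = [[0, 1], [2, 3]]
caches = {0: {0x10: "a"}, 1: {0x20: "b"}, 2: {0x10: "c"}, 3: {}}
directories = build_directories(groups, caches)
```

Because each directory covers only one subset, processor 0 learns nothing about address 0x10 held by processor 2, which belongs to the other subset.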
According to a second aspect of the invention, a method of managing cache memories in a computer system is provided, wherein the computer system includes a first plurality of processors and a like plurality of cache memories, with each cache memory associated with a respective one of the processors. The method according to this aspect of the invention includes defining a second plurality of the processors as a subset of the first plurality of processors, wherein the second plurality of processors includes a first processor, storing a respective first directory in association with each of the processors of the second plurality of processors, the first directory associated with each processor of the second plurality of processors indicating contents of the cache memories associated with the other processors of the second plurality of processors, modifying data stored in the cache memory associated with the first processor, sending a message to each of the processors of the second plurality of processors other than the first processor to inform the other processors of the modification of the data in the cache memory associated with the first processor, and modifying each of the first directories associated with the other processors of the second plurality of processors to reflect the modification of the data in the cache memory associated with the first processor.
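The sequence of the second aspect (modify, send a message, update the peer directories) can be sketched as follows. The class Processor, the method names, and the helper make_group are hypothetical. The sketch assumes directories that track only modifications, and it does not model what a peer does with any stale copy in its own cache, since that is not addressed in the passage above.

```python
class Processor:
    """A processor with a cache and a directory of its peers' modified lines."""

    def __init__(self, pid):
        self.pid = pid
        self.cache = {}      # address -> value
        self.directory = {}  # address -> set of peer ids holding modified data
        self.peers = []      # other processors of the same group

    def write(self, address, value):
        """Modify the local cache, then inform every other group member."""
        self.cache[address] = value
        for peer in self.peers:
            peer.on_modified(self.pid, address)

    def on_modified(self, sender, address):
        """Update this processor's directory to reflect the sender's change."""
        self.directory.setdefault(address, set()).add(sender)


def make_group(size):
    """Create a group of processors that all know each other as peers."""
    members = [Processor(pid) for pid in range(size)]
    for member in members:
        member.peers = [other for other in members if other is not member]
    return members


group = make_group(3)
group[0].write(0x80, 7)
```

After the write, the directories of processors 1 and 2 both record that processor 0 holds modified data at address 0x80, while the directory of processor 0 is unchanged.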
In accordance with a third aspect of the invention, a method of managing cache memories in a computer system is provided. The computer system includes a plurality of processors and a like plurality of cache memories, with each cache memory associated with a respective one of the processors. The plurality of processors includes a first processor that belongs to a first group of the processors and belongs to a second group of the processors, with no other processor belonging to both the first group and the second group. The method in accordance with this aspect of the invention includes storing a directory in association with each processor of the second group, wherein the directory associated with each processor of the second group indicates contents of the cache memories associated with the other processors of the second group, broadcasting a message in regard to a read operation from the first processor to all the other processors of the first group, and accessing the directory stored in association with the first processor to determine if data relevant to the read operation is stored in a cache memory associated with any processor of the second group other than the first processor.
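One arrangement that satisfies the membership condition of the third aspect is a grid in which each row is a broadcast group and each column is a directory group, so that any row and any column share exactly one processor. This grid layout is an assumption chosen for illustration and is not stated in the passage above; the function names are likewise hypothetical.

```python
def row_group(processor, cols):
    """First group: all processors in the same row (reached by broadcast)."""
    row, _ = processor
    return {(row, c) for c in range(cols)}


def column_group(processor, rows):
    """Second group: all processors in the same column (covered by the directory)."""
    _, col = processor
    return {(r, col) for r in range(rows)}


def read_plan(processor, rows, cols):
    """List who receives the broadcast and whose caches the local directory covers."""
    first = row_group(processor, cols)
    second = column_group(processor, rows)
    # Only the requesting processor belongs to both groups.
    assert first & second == {processor}
    return {
        "broadcast_to": sorted(first - {processor}),
        "directory_covers": sorted(second - {processor}),
    }


# A 2 x 3 grid of processors identified by (row, column).
plan = read_plan((0, 1), rows=2, cols=3)
```

For processor (0, 1), the read message is broadcast to (0, 0) and (0, 2), while the locally stored directory is consulted for the cache of (1, 1).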
In accordance with a fourth aspect of the invention, a method of accessing an item of data is provided in a computer system that includes a plurality of processors and a like plurality of cache memories, with each cache memory associated with a respective one of the processors. The method according to this aspect of the invention includes interrogating the cache memory associated with a first processor of the plurality of processors, wherein the first processor belongs to a first group of the processors and belongs to a second group of the processors that is different from the first group of the processors. The method further includes broadcasting a message from the first processor to each other processor of the first group of processors, and interrogating a first directory associated with the first processor and indicative of contents of the cache memories associated with other processors of the second group of processors.
The method further includes responding to the broadcast message by interrogating cache memories associated with each other processor of the first group of processors, and further responding to the broadcast message by interrogating a respective second directory associated with each other processor of the first group of processors, wherein each second directory is indicative of contents of cache memories associated with other processors of a respective group of processors to which the respective other processor of the first group of processors belongs. The method also includes accessing a main memory of the computer system.
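The access steps of the fourth aspect can be sketched as a single lookup function. The sequential ordering shown here is a simplification assumed for illustration, since the passage does not state whether the broadcast and the directory interrogation proceed one after the other or concurrently. The function locate and the data layout are hypothetical.

```python
def locate(address, requester, caches, directories, broadcast_group, main_memory):
    """Return (where the data was found, holder id or None) for one access."""
    # Interrogate the requester's own cache.
    if address in caches[requester]:
        return ("local cache", requester)
    # Interrogate the requester's directory, which covers its second group.
    holders = directories[requester].get(address)
    if holders:
        return ("directory", min(holders))
    # Broadcast to the first group; each peer checks its cache, then its directory.
    for peer in broadcast_group[requester]:
        if address in caches[peer]:
            return ("peer cache", peer)
        holders = directories[peer].get(address)
        if holders:
            return ("peer directory", min(holders))
    # Fall back to main memory.
    if address in main_memory:
        return ("main memory", None)
    raise KeyError(address)


# Four processors: broadcast groups {0, 1} and {2, 3}; directory groups {0, 2} and {1, 3}.
caches = {0: {}, 1: {}, 2: {0xA: 1}, 3: {0xB: 2}}
directories = {0: {0xA: {2}}, 1: {0xB: {3}}, 2: {}, 3: {}}
broadcast_group = {0: [1], 1: [0], 2: [3], 3: [2]}
main_memory = {0xA: 0, 0xB: 0, 0xC: 5}

found_a = locate(0xA, 0, caches, directories, broadcast_group, main_memory)
found_b = locate(0xB, 0, caches, directories, broadcast_group, main_memory)
found_c = locate(0xC, 0, caches, directories, broadcast_group, main_memory)
```

In this example, processor 0 finds address 0xA through its own directory, finds address 0xB through the directory of its broadcast peer (processor 1), and resorts to main memory for address 0xC, all without broadcasting to every processor in the system.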
The technique of the present invention provides greater efficiency than conventional broadcast approaches to maintaining cache coherence, while also avoiding use of large directories and delays that accompany directory-based approaches to cache coherence.
Other objects, features and advantages of the present invention will become more fully apparent from the following detailed description of exemplary embodiments, the appended claims and the accompanying drawings.