In such systems, each processor has its own cache memory to store computation data, such as memory addresses pointing towards instructions to be executed. Processors operating in parallel share some of these data. Thus, while executing a computer program, multiple processors may access the same data for reading or writing, and possibly modify them.
“Cache coherence” algorithms are implemented to ensure that the data used by the processors are kept up to date, and to prevent two processors from processing two different versions of the same data. The MESI algorithm is an example of such an algorithm.
The implementation of cache coherence algorithms requires a great deal of communication between processors to enable them to know the location of data at any instant. This involves identifying the cache memory in which the data are located, as well as determining their status.
The status of data in a cache memory depends on the protocol used. In the example of the MESI protocol given above, data have the “modified” status (M for “modified”) if they are present in only one cache and have been changed compared to the data present in the non-cache memory (from which the initial data come). In this case, a processor that wants to access the data must wait until they are brought into conformity with the version in the memory. Staying with the same example, data have the “exclusive” status (E for “exclusive”) if they are present in only one cache memory and correspond to the non-cache memory version. Data have the “shared” status (S for “shared”) if they are present in multiple cache memories. Finally, data have the “invalid” status (I for “invalid”) if they are out of date. They must then be ignored by the processors and not used.
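By way of illustration only, the four MESI statuses defined above may be sketched as a simple classification. The names `Status` and `mesi_status`, and the three input parameters, are hypothetical and chosen for this example; an actual protocol derives the status from bus or directory transactions rather than from such flags.

```python
from enum import Enum


class Status(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def mesi_status(caches_holding: int, matches_memory: bool,
                out_of_date: bool = False) -> Status:
    """Classify a cached copy using the definitions given in the text.

    caches_holding: number of cache memories holding the data (assumed >= 1).
    matches_memory: True if the copy corresponds to the non-cache memory version.
    out_of_date:    True if the copy must be ignored by the processors.
    """
    if caches_holding < 1:
        raise ValueError("the data must be present in at least one cache")
    if out_of_date:
        return Status.INVALID
    if caches_holding > 1:
        return Status.SHARED
    # Present in exactly one cache: exclusive if unchanged, modified otherwise.
    return Status.EXCLUSIVE if matches_memory else Status.MODIFIED
```

For instance, under these assumptions, `mesi_status(1, False)` yields the “modified” status and `mesi_status(3, True)` yields the “shared” status.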
Other protocols exist with more or fewer defined statuses. For example, the MSI protocol has only the three statuses M, S and I as defined above, while the MOESI protocol adds an “owned” status (O for “owned”), meaning that the data are the latest version but that the non-cache memory from which they came is out of date.
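As a non-limiting sketch, the sets of statuses of the three protocols mentioned above may be compared as follows. The names `PROTOCOL_STATUSES` and `memory_may_be_stale` are hypothetical; the helper merely restates the definitions of the M and O statuses given in the text.

```python
# Statuses defined by each protocol named in the text, one letter per status.
PROTOCOL_STATUSES = {
    "MSI": frozenset("MSI"),
    "MESI": frozenset("MESI"),
    "MOESI": frozenset("MOESI"),
}


def memory_may_be_stale(status: str) -> bool:
    """Return True for statuses in which the cached copy is newer than
    the non-cache memory version (M and O, per the definitions above)."""
    return status in {"M", "O"}


# Status added by MOESI with respect to MESI, and by MESI with respect to MSI.
added_by_moesi = PROTOCOL_STATUSES["MOESI"] - PROTOCOL_STATUSES["MESI"]
added_by_mesi = PROTOCOL_STATUSES["MESI"] - PROTOCOL_STATUSES["MSI"]
```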
Most cache coherence protocols implement lists, or directories, reflecting the history of requests made for each data item. This type of protocol is referred to as “directory-based”.
For each cache memory line, each processor maintains a list indicating the processors in which the data recorded in that line are stored, as well as the status of those data. The information contained in this list may be more or less complete.
The use of this list makes it possible to keep a history of the requests made by the processors for the data. In particular, it makes it possible to filter cache queries, for example by avoiding querying the cache of a processor that has not handled the data. Also, if particular data do not appear in the lists of the processors, it may be inferred that the data are not in use by any processor and that they are therefore stored in the main memory (not cached) and are up to date.
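The filtering role of such a list may be illustrated by the following minimal sketch. The class `Directory`, its method names, the addresses and the processor identifiers are all hypothetical. The sketch also assumes, for simplicity, that a copy recorded with the “invalid” status does not need to be queried.

```python
from collections import defaultdict


class Directory:
    """Hypothetical directory: for each address, which processors hold
    the data in their cache and with which status."""

    def __init__(self):
        # address -> {processor_id: status letter}
        self._entries = defaultdict(dict)

    def record(self, address, processor_id, status):
        """Record that a processor holds the data with the given status."""
        self._entries[address][processor_id] = status

    def caches_to_query(self, address):
        """Return only the processors whose cache may hold a usable copy,
        so that the other caches are not queried."""
        holders = self._entries.get(address, {})
        return sorted(p for p, s in holders.items() if s != "I")

    def is_only_in_main_memory(self, address):
        """If no processor is listed, the data are inferred to be in the
        main memory only, and up to date."""
        return not self.caches_to_query(address)


directory = Directory()
directory.record(0x40, 1, "S")
directory.record(0x40, 2, "I")
```

With these entries, only processor 1 would be queried for address `0x40`, and an address that was never recorded is inferred to reside in the main memory only.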
FIG. 1 shows a multiprocessor system schematically.
The system comprises four modules 100, 110, 120, 130. Each module comprises a plurality of processors. Module 100 comprises two processors 101 and 102. Module 110 comprises two processors 111 and 112. Module 120 comprises two processors 121 and 122. Module 130 comprises two processors 131 and 132. Each processor has a respective cache memory. These cache memories are not represented.
The number of modules in the system and the number of processors in each module are merely illustrative. Modules may contain different numbers of processors.
In order to manage the communications between the processors, in particular to manage cache coherence, each module 100, 110, 120, 130 has a proxy module 103, 113, 123, 133 respectively. For the sake of clarity, the interconnections between the proxy modules are not represented.
Thus, each processor has a single interface to communicate with other processors. From the point of view of each processor, it is as if it always communicated with only one other processor. In particular, the proxy modules hold directories on behalf of their respective processor modules.
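Purely as an illustration, the arrangement of FIG. 1 and the single-interface principle may be sketched as follows, using the reference numerals of the figure as identifiers. The names `MODULES`, `PROXY_OF` and `route` are hypothetical, and the sketch assumes that traffic between two processors of the same module also passes through their proxy module, which the description does not specify.

```python
# Reference numerals of FIG. 1: module -> its processors and its proxy module.
MODULES = {
    100: {"processors": (101, 102), "proxy": 103},
    110: {"processors": (111, 112), "proxy": 113},
    120: {"processors": (121, 122), "proxy": 123},
    130: {"processors": (131, 132), "proxy": 133},
}

# Each processor sees exactly one interface: the proxy of its own module.
PROXY_OF = {
    processor: module["proxy"]
    for module in MODULES.values()
    for processor in module["processors"]
}


def route(src, dst):
    """Return the successive hops of a message from processor src to dst."""
    src_proxy, dst_proxy = PROXY_OF[src], PROXY_OF[dst]
    if src_proxy == dst_proxy:
        # Assumption: intra-module traffic also goes through the proxy.
        return [src, src_proxy, dst]
    return [src, src_proxy, dst_proxy, dst]
```

Under these assumptions, a message from processor 101 to processor 131 passes through proxy modules 103 and 133 in turn.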
The use of proxy modules is desirable when the number of processors in the system is large.
Furthermore, the system comprises a main memory module 140. This module stores the data persistently. The data manipulated by the processors and temporarily stored in the cache memories originate from this memory. Module 140 also maintains a directory listing the data and the processors that have requested access to them. For the sake of clarity in the figure, module 140 is shown separately from the modules 100, 110, 120 and 130. However, the main memory of module 140 may be distributed among the modules 100, 110, 120 and 130. Each of these modules then houses a portion of the main memory. Module 140 is thus a common virtual memory space that is physically distributed in the system.
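One possible way of distributing the common memory space among the modules may be sketched as follows. The description does not specify any distribution scheme; the line-interleaved mapping below, the 64-byte line size and the names `HOME_MODULES`, `LINE_SIZE` and `home_module` are assumptions made solely for this example.

```python
# Modules of FIG. 1 that each house a portion of the main memory.
HOME_MODULES = (100, 110, 120, 130)

# Hypothetical cache line size in bytes (not given in the description).
LINE_SIZE = 64


def home_module(address: int) -> int:
    """Return the module assumed to house the memory line containing
    the given address, using simple line-by-line interleaving."""
    line_index = address // LINE_SIZE
    return HOME_MODULES[line_index % len(HOME_MODULES)]
```

With this assumed mapping, consecutive memory lines are housed by modules 100, 110, 120 and 130 in turn, so that a proxy module can determine from the address alone which module to contact.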
The performance of multiprocessor systems depends in part on the ability of the proxy modules to quickly locate a data item in a cache memory or in the main memory of the system.