1. Field of the Invention
The present invention concerns the creation of large-scale symmetric multiprocessor systems by assembling smaller basic multiprocessors, each generally comprising from one to four elementary microprocessors (μP), each associated with a cache memory, a main memory (MEM) and an input/output circuit (I/O), suitably linked to one another through an appropriate bus network, the multiprocessor system being managed by a common operating system (OS). In particular, the invention concerns coherence controllers integrated into the multiprocessor systems and designed to guarantee their memory coherence, particularly between main and cache memories. A memory access procedure is considered to be “coherent” if the value returned by a read instruction is always the value written by the last store instruction. In practice, incoherencies in cache memories are encountered in input/output procedures and also in situations where immediate writing into the memory of a multiprocessor is authorized without waiting to verify that all the caches capable of having a copy of the memory have been modified.
2. Description of the Related Art
There are known multiprocessors produced in accordance with the schematic diagram illustrated in FIG. 1, given as a nonlimiting example. They are primarily constituted by four basic multiprocessors 10–13 (MP0, MP1, MP2 and MP3), with two microprocessors 40 and 40′, respectively linked to a coherence controller 14 SW (Switch) by two-point high-speed links 20–23 controlled by four local port control units 30–33 (PU0, PU1, PU2 and PU3). The controller 14 knows the distribution of the memory and of the copies of memory lines or blocks among the main memory MEM 44 and the cache memories 42, 42′ of the processors. In addition to one or more routing tables and a collision window table (not represented), it includes a cache filter directory 34 SF (also called a Snoop Filter) that keeps track of the copies of memory portions (lines or blocks) present in the caches of the multiprocessors. Hereinafter, and by convention, the terms “lines” and “blocks” will be used interchangeably, unless otherwise indicated. Furthermore, the term “memory” used alone refers to the main memory or memories associated with the multiprocessors.
The cache filter directory 34, controlled by the control unit ILU 15, is capable of transmitting coherent access requests to a memory block (for purposes of a subsequent operation such as a Read, Write, Erase, etc.) either to the main memory in question or to the microprocessor(s) having a copy of the desired block in their caches, after verifying the memory status of the block in question in order to maintain the memory coherence of the system. To do this, the cache filter directory 34 stores, for each block listed, its address 35 together with a 4-bit presence vector 36 (where 4 represents the number “n” of basic multiprocessors 10–13) and an Exclusive memory status bit Ex 37.
In practice, the bit MP0 of the presence vector 36 is set to 1 when the corresponding basic multiprocessor MP0 (the multiprocessor 10) actually includes in one of its cache memories a copy of a line or a block of the memory 44.
The Exclusive status bit Ex 37 belongs to the coherence protocol known as the MESI protocol, which generally describes the following four memory states:
Modified: in which the block (or line) in the cache has been modified with respect to the content of the memory (the data in the cache is valid but the corresponding storage position is invalid).
Exclusive: in which the block in the cache contains the only identical copy of the data of the memory at the same addresses.
Shared: in which the block in the cache contains data identical to that of the memory at the same addresses (at least one other cache can have the same data).
Invalid: in which the data in the block are invalid and cannot be used.
In practice, for the multiprocessors illustrated in FIG. 1 and FIG. 2, a partial MESI protocol is used, in which the “Modified” and “Exclusive” states are not distinguished:
if only one bit MPi=1 and if the bit Ex=1, then the memory status of the block (or the line) is Modified or Exclusive;
if one or more bits MPi=1 and if the bit Ex=0, then the memory state of the block is Shared;
if all the bits MPi=0, then the memory state is Invalid.
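By way of illustration only, the three rules of the partial MESI protocol can be sketched as a small decoding routine over one directory entry. The names `DirectoryEntry`, `block_state` and `N_MP` are hypothetical and do not appear in the figures. Bit i of the presence vector is assumed to stand for MPi. The combination Ex=1 with several presence bits set is not covered by the rules above and is treated here as an error, which is likewise an assumption.

```python
from dataclasses import dataclass

N_MP = 4  # number "n" of basic multiprocessors (MP0..MP3 in FIG. 1)


@dataclass
class DirectoryEntry:
    """Hypothetical model of one cache filter directory entry."""
    address: int    # block address (reference 35)
    presence: int   # n-bit presence vector (reference 36); bit i stands for MPi
    ex: bool        # Exclusive status bit Ex (reference 37)


def block_state(entry: DirectoryEntry) -> str:
    """Decode the partial MESI state from the presence vector and the Ex bit."""
    holders = [i for i in range(N_MP) if (entry.presence >> i) & 1]
    if not holders:
        # All bits MPi = 0.
        return "Invalid"
    if not entry.ex:
        # One or more bits MPi = 1 and Ex = 0.
        return "Shared"
    if len(holders) == 1:
        # Only one bit MPi = 1 and Ex = 1; the two states are not distinguished.
        return "Modified/Exclusive"
    # Assumption: Ex = 1 with several holders is outside the listed rules.
    raise ValueError("Ex=1 with several presence bits set is not a listed state")
```

For example, an entry with presence vector 0b0001 and Ex=1 decodes to Modified/Exclusive for MP0, whereas the same presence vector with Ex=0 decodes to Shared.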
The cache filter directory 34 integrates a search and monitoring protocol equipped with a so-called “snooping” logic. Thus, during a memory access request by a processor, the cache filter directory 34 performs a test of the cache memories it handles. During this verification, the traffic passes through ports 24–27 of the two-point high-speed links 20–23 without interfering with the accesses between the processor 40 and its cache memory 42. The cache filter directory is therefore capable of handling all coherent memory access requests.
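As a non-authoritative sketch, the filtering role of the directory can be pictured as follows: on a coherent access, the presence vector determines whether the request is forwarded to the caches holding a copy or to the main memory. The function name `route_request`, the exclusion of the requesting multiprocessor from the targets, and the fallback to main memory when no other cache holds the block are simplifying assumptions and are not taken from the figures.

```python
def route_request(presence: int, requester: int, n: int = 4):
    """Pick where a coherent access request is forwarded (hypothetical sketch).

    presence  -- n-bit presence vector of the requested block (bit i = MPi)
    requester -- index of the basic multiprocessor issuing the request
    """
    # Basic multiprocessors, other than the requester, whose bit is set.
    targets = [i for i in range(n) if i != requester and (presence >> i) & 1]
    if targets:
        # Only these multiprocessors need to be consulted.
        return ("caches", targets)
    # Assumption: with no other holder, the request goes to the main memory.
    return ("memory", [])
```

With a presence vector of 0b0110 and a request issued by MP0, only MP1 and MP2 are consulted; the remaining multiprocessor sees no snoop traffic for that block.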
The known multiprocessor architecture briefly described above is not, however, suited to large-scale symmetric multiprocessor servers comprising more than 16 processors.
In essence, the number of basic multiprocessors that can be connected to a coherence controller (in practice embodied by an integrated circuit of the ASIC type) is limited by:
the number of inputs/outputs of the controller, which with current manufacturing techniques accepts only a limited number of two-point links (keeping in mind that these two-point links are necessary, because of their high-speed capacity, in order to avoid latency or delay problems during the processing of memory access requests);
the size of the coherence controller that contains the cache filter directory (the size of the cache filter directory must be larger than the sum of the sizes of the directories of the caches integrated into the basic multiprocessors);
the bandwidth for access to the cache filter directory, or maximum speed in Mbps, obtained in practice by two-point links, which constitutes a bottleneck for a large-scale multiprocessor server, since the cache filter directory must be consulted for all the coherent accesses of the basic multiprocessors.