A multi-processor computing system is a computing system having multiple processors that each execute their own respective software program code. Multi-processor computing systems can be implemented in various ways, for example with multiple discrete computers interconnected over a wide area network or, as another example, with a single computer whose processor chip includes multiple processing cores that independently execute their own respective software code. For simplicity, the present application may use the term “processor” when referring to a component that is technically a “processing core”.
Multi-processor computing systems are often implemented with a “shared” cache. A shared cache is capable of receiving information (such as a cache line) from multiple processors within the computing system and/or is capable of providing information to multiple processors within the computing system. FIG. 1 shows a component of a multi-processor computing system having each of the following on a single semiconductor chip and/or integrated within a single electronic component package 100 (hereinafter, “socket”): 1) multiple processors 101_1 through 101_X; 2) cache “slices” 102_1 through 102_Y (notably, Y may equal X); 3) respective caching agents 103_1 through 103_Y for each of the cache slices; and 4) a network 104 between the processors and the cache slices. Each of processors 101_1 through 101_X also has its own associated interface 107_1 through 107_X to network 104.
The socket may also include a gateway/router function 105 between the socket's internal network 104 and another network that is internal to the socket and/or a network that is external to the socket 100 (neither the additional internal network nor the external network is shown in FIG. 1). Notably, a multi-processor computing system may include additional sockets, e.g., designed identically or similarly to socket 100, that are interconnected by an external network to scale the processing power of the multi-processor system. The multi-processor computing system may also include other standard computing system components such as a system memory component 109 with an associated memory controller, and an I/O control hub component (not shown). The multi-processor computing system may also include a hard disk drive or solid state drive. The computing system may also have a display, such as a flat panel display, coupled to a graphics controller which in turn is coupled to the system memory 109.
Each of processors 101_1 through 101_X may include its own respective, local cache. When a processor looks for an item of information in its local cache and a “miss” occurs (or, if the processors 101_1 through 101_X simply do not include their own respective local cache), one of the cache slices 102_1 through 102_Y is snooped for the desired information. The particular cache slice that is snooped may, for example, be determined from the address of the information (e.g., the address of the desired cache line).
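As an illustration only, the selection of a cache slice from an address might be sketched as follows. The cache line size, the number of slices, the folding function and the name `select_slice` are all assumptions made for the sketch; the text above does not specify how the slice is derived from the address.

```python
CACHE_LINE_BYTES = 64  # assumed cache line size; not stated in the text
NUM_SLICES = 8         # hypothetical value standing in for Y


def select_slice(address: int, num_slices: int = NUM_SLICES) -> int:
    """Map an address to a 0-based cache slice index (illustrative only)."""
    # Drop the byte offset so every address within one cache line
    # resolves to the same slice.
    line_number = address // CACHE_LINE_BYTES
    # XOR-fold the line number so that upper address bits also influence
    # the result. This particular fold is an assumption, not a documented
    # hardware function.
    folded = line_number ^ (line_number >> 7) ^ (line_number >> 17)
    return folded % num_slices
```

Because the mapping depends only on the address, every processor that requests a given cache line is directed to the same slice, which is what allows the slices to act together as one shared cache.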
For instance, if a cache miss occurs at processor 101_1, a request is constructed for the desired cache line, and a hash is performed on the address by the processor's network interface 110_1 to determine which cache slice is the appropriate cache slice for the particular address. The request is then directed over network 104 to the caching agent for the appropriate cache slice (e.g., caching agent 103_1 if cache slice 102_1 is the targeted slice). As part of being formally accepted by the caching agent 103_1, the request is entered into a buffer (a queue may be regarded as a buffer). The caching agent eventually services the request from the buffer and snoops the targeted cache slice; if the desired cache line for the request is found, it is sent over network 104 to processor 101_1. If the desired cache line is not found, a request for the cache line is sent to system memory 109 (the request may be directed over network 104 prior to being directed to system memory 109). The cache slices 102_1 through 102_Y are sometimes collectively referred to as the “last level cache” (LLC) because a failed snoop into the LLC causes the desired information to be sought next outside socket 100 rather than within socket 100.
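The request handling described above can be modeled in software roughly as follows. This is a simplified sketch rather than a description of any actual hardware: the class `CachingAgent`, its method names, and the use of dictionaries for the cache slice and for system memory are hypothetical, and the network transfers are omitted.

```python
from collections import deque


class CachingAgent:
    """Toy model of one caching agent and the cache slice it fronts."""

    def __init__(self, system_memory):
        self.slice_lines = {}               # line address -> data held in this slice
        self.requests = deque()             # buffer (queue) of accepted requests
        self.system_memory = system_memory  # stand-in for system memory 109

    def accept(self, requester, line_address):
        # A request is formally accepted by entering it into the buffer.
        self.requests.append((requester, line_address))

    def service_next(self):
        # Service the oldest buffered request by snooping the slice.
        requester, line_address = self.requests.popleft()
        if line_address in self.slice_lines:
            # Hit: the line would be returned to the requesting processor.
            return requester, self.slice_lines[line_address], "slice"
        # Miss: the line is sought from system memory instead.
        return requester, self.system_memory[line_address], "memory"


memory = {0x40: "line A", 0x80: "line B"}
agent = CachingAgent(memory)
agent.slice_lines[0x40] = "line A"   # only line A is cached in the slice
agent.accept("101_1", 0x40)
agent.accept("101_1", 0x80)
first = agent.service_next()         # served from the slice
second = agent.service_next()        # falls through to system memory
```

The sketch deliberately leaves out whether a line fetched from system memory is subsequently installed in the slice, since the text above does not address that step.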