Processors in Common System Interface (“CSI”) systems include an integrated memory controller to reduce memory access latency. Server systems may now be built using two to hundreds of such processors. In such systems, only one processor has traditionally executed BIOS code to initialize memory and other portions of the system. However, as the number of processors and memory controllers has increased, initializing the system and booting an operating system in this way commonly takes an extended period. As a result, initialization has been parallelized across the multiple processors.
In some CSI-based multi-processor systems, all the processors execute the BIOS code in parallel and perform a Built-in Self Test (“BIST”), memory initialization, and the like to reduce system boot time. These activities are coordinated by a System Boot Strap Processor (“SBSP”), which collects the data from each of the other processors, which may be referred to as Application Processors (“AP”), and boots the operating system. Parallel initialization increases the computation required to initialize memory and CSI links. As a result, the processor cache in each processor socket is configured as a temporary data store, which may be referred to as “Cache-As-RAM” or “CAR,” to assist with BIST execution, memory initialization, and CSI initialization.
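The parallel flow described above can be modeled, very loosely, as follows. This is a minimal sketch rather than firmware: threads stand in for processor sockets, a local dictionary stands in for each socket's CAR, and the names `init_socket` and `boot` as well as the memory sizes are hypothetical. How each summary actually reaches the SBSP is deliberately left abstract here.

```python
from concurrent.futures import ThreadPoolExecutor


def init_socket(socket_id, memory_mib):
    """Work done by one processor socket (hypothetical model).

    `car` stands in for the socket's Cache-As-RAM: it is local to this
    call, so no other socket can read it.
    """
    car = {}
    car["bist_ok"] = True           # placeholder for a real built-in self test
    car["memory_mib"] = memory_mib  # placeholder for real memory initialization
    # Only a small summary leaves the socket; the scratch area itself does not.
    return {
        "socket": socket_id,
        "bist_ok": car["bist_ok"],
        "memory_mib": car["memory_mib"],
    }


def boot(sockets):
    """SBSP role: run every socket's initialization in parallel, then collect."""
    with ThreadPoolExecutor(max_workers=max(1, len(sockets))) as pool:
        # map() returns results in the same order as the input sockets.
        results = list(pool.map(lambda item: init_socket(*item), sockets.items()))
    usable = [r for r in results if r["bist_ok"]]
    return {
        "sockets": [r["socket"] for r in usable],
        "total_memory_mib": sum(r["memory_mib"] for r in usable),
    }
```

Calling `boot({0: 4096, 1: 4096, 2: 8192})` returns a system-wide summary built only from the per-socket results, which mirrors the coordinating role the SBSP plays.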
However, the temporary data store in one processor cannot be accessed by other processors. In particular, the SBSP cannot directly access the AP temporary stores to create global Source Address Decoder (“SAD”) entries that represent the whole system. Previous efforts to allow the SBSP to create the global SAD include using a Configuration Space Register (“CSR”), a chipset or non-core register accessible by both the SBSP and the AP processors, as a mailbox to exchange data. However, such a mechanism involves handshaking between processors and polling for data. As the amount of data to be exchanged between the processors increases, this mechanism increases system boot latency.
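To illustrate why a mailbox register scales poorly, the following sketch models a one-word mailbox with a valid flag. It is an assumption-laden simulation, not the register layout of any real chipset: `MailboxCSR`, `ap_sender`, `sbsp_receiver`, and `transfer` are hypothetical names, and the two processors are modeled as Python generators that take turns.

```python
from dataclasses import dataclass


@dataclass
class MailboxCSR:
    """Hypothetical one-word mailbox register visible to both SBSP and AP."""
    data: int = 0
    valid: bool = False  # set by the sender, cleared by the receiver


def ap_sender(mailbox, words):
    """AP side: publish one word at a time, waiting until the last one is consumed."""
    for word in words:
        while mailbox.valid:       # previous word not yet read by the SBSP
            yield "ap_poll"
        mailbox.data = word
        mailbox.valid = True
        yield "ap_write"


def sbsp_receiver(mailbox, count, out):
    """SBSP side: poll for a valid word, copy it out, then release the mailbox."""
    for _ in range(count):
        while not mailbox.valid:   # nothing to read yet
            yield "sbsp_poll"
        out.append(mailbox.data)
        mailbox.valid = False
        yield "sbsp_read"


def transfer(words):
    """Alternate the two sides one step at a time and count each kind of step."""
    mailbox = MailboxCSR()
    received = []
    stats = {"ap_poll": 0, "ap_write": 0, "sbsp_poll": 0, "sbsp_read": 0}
    agents = [sbsp_receiver(mailbox, len(words), received), ap_sender(mailbox, words)]
    while agents:
        for agent in list(agents):
            try:
                stats[next(agent)] += 1
            except StopIteration:
                agents.remove(agent)
    return received, stats
```

In this model every word costs one write handshake on the AP side and one read handshake on the SBSP side, plus any polling in between, so the number of register accesses grows at least linearly with the amount of data exchanged.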