FIG. 1 is a simplified block diagram of a multiprocessor system 500, according to some embodiments. The multiprocessor system 500 includes N central processing units (CPUs) 150A, 150B, . . . , 150N (collectively, “CPUs 150”), which are coupled to N−1 specialized busses, known as quick path interconnect (QPI) busses 160A, 160B, . . . , 160N−1 (collectively, “QPI busses 160”). The QPI busses 160, specifically designed for the CPUs, speed up communication between the CPUs 150. The CPUs may also be coupled to one or more volatile memories (not shown).
Also featured in the multiprocessor system 500 are up to N peripheral controller hubs (PCHs) 180A, . . . , 180N (collectively, “PCHs 180”) coupled to the CPUs 150 via up to N specialized busses, known as direct media interface (DMI) busses 170A, 170B, . . . , 170N. The PCHs 180 interface between the CPUs 150 and one or more peripheral devices of the multiprocessor system 500. The PCHs 180 may include display, input/output (I/O) control, a real-time clock, and other functions and may connect to an integrated display as well as other peripheral devices, such as a keyboard, a mouse, a non-volatile storage device, and so on.
For communication between endpoints within a processor, whether in the multiprocessor system 500 or in a single-processor system, a message channel is used. The message channel is the transmission medium for these communications, and may be thought of as a type of “tunnel” or “subway” between endpoints inside the processor. There may be many message channel endpoints, and a message may be sent from any endpoint to any other endpoint, with the endpoints being functional entities within the processor. Portable machine code, or pcode, is used to communicate between the entities, and the pcode has its own endpoint for sending messages to other endpoints. (No endpoint sends an autonomous message to the pcode, as the only message that is received by the pcode endpoint is a response to a message that the pcode originated.) Power management request (PMReq) messages go to other entities using the QPI bus, which is similar to the message channel, except that the QPI bus is an external bus/interface. The message channel, by contrast, is strictly internal to the processor.
In CPU-based systems, such as a single-processor system or the multiprocessor system 500 of FIG. 1, a message channel is used by many disparate pcode flows and functions. These functions may be used to read and write uncore control registers, issue PMReqs, and send messages to other platform entities (e.g., other CPUs 150, PCHs 180). The pcode uses the message channel quite frequently, from hundreds of times per millisecond to thousands of times per millisecond.
Some newer multiprocessor systems are designed in such a way that the message channel may become blocked at various times, such as during a frequency transition. Previous multiprocessor systems did not have this issue, as their message channel interfaces were always fully functional. The pcode in those previous systems could therefore use the message channel in a “blocking” manner, by sending the transaction onto the message channel and waiting in a tight loop for the completion of the transaction.
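By way of illustration only, the “blocking” usage described above may be sketched as a toy simulation. The names MessageChannel, blocking_send, blocked_ticks, and latency_ticks are hypothetical and do not correspond to any actual pcode interface; time is modeled as abstract ticks rather than real microseconds.

```python
class MessageChannel:
    """Toy model of a message channel that may be temporarily blocked (hypothetical)."""

    def __init__(self, blocked_ticks=0, latency_ticks=1):
        # Ticks during which the channel makes no progress
        # (e.g., an assumed stand-in for a frequency transition).
        self.blocked_ticks = blocked_ticks
        # Ticks a transaction needs to complete once the channel is free.
        self.latency_ticks = latency_ticks
        self._remaining = None

    def send(self, payload):
        """Place one transaction on the channel."""
        self._payload = payload
        self._remaining = self.latency_ticks

    def tick(self):
        """Advance simulated time by one tick."""
        if self.blocked_ticks > 0:
            self.blocked_ticks -= 1      # channel is blocked: no progress
        elif self._remaining:
            self._remaining -= 1         # channel is free: transaction advances

    def done(self):
        return self._remaining == 0


def blocking_send(channel, payload):
    """Send a transaction and spin in a tight loop until it completes.

    Returns the number of ticks the caller was stalled.
    """
    channel.send(payload)
    waited = 0
    while not channel.done():
        channel.tick()
        waited += 1
    return waited
```

In this sketch, a caller of blocking_send on an always-available channel stalls only for the transaction latency, whereas a caller on a channel that is blocked for 50 ticks stalls for all 50 ticks plus the latency, during which it can do no other work.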
For newer multiprocessor systems, the use of “blocking” transactions on the message channel is deemed unacceptable because a blocking transaction can potentially lock up the pcode for several tens of microseconds. Blocking transactions thus lead to higher latency for other (non-message-channel-related) functions and degrade the performance of the CPU. In addition, there is a risk of deadlock: the message channel may be blocked by some function that is waiting for something from the pcode via a sideband interface, while the pcode is itself blocked waiting for a message channel transaction to complete.
Additionally, PMReq messages require arbitration for use of a single buffer in a PMReq engine (PME). PMReq messages go over the message channel to the PME, and then over the QPI bus 160 to another CPU 150 (or over the DMI bus 170 to the PCH 180). To preserve PMReq protocol correctness, the PME will wait for a completion (CMP) from the other CPU/PCH, and will keep the PMReq buffer locked until the completion is actually received. In this case, if a blocking message channel transaction is used, the pcode will be locked up for the entire round-trip duration of the PMReq/CMP exchange. There may be delays on the other CPU (due to a frequency change, etc.), which further prolong the duration of the lock-up.
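By way of illustration only, the single-buffer behavior of the PME described above may be sketched as follows. The class PMReqEngine and its methods try_submit and on_completion are hypothetical names introduced for this sketch; the actual arbitration mechanism is not specified here.

```python
class PMReqEngine:
    """Toy model of a PME holding a single PMReq buffer (hypothetical)."""

    def __init__(self):
        self.buffer = None  # None means the single PMReq buffer is free

    def try_submit(self, request):
        """Try to claim the buffer for a PMReq; return False if it is locked."""
        if self.buffer is not None:
            return False          # buffer still locked by an outstanding PMReq
        self.buffer = request     # buffer stays locked until a CMP arrives
        return True

    def on_completion(self):
        """Handle a CMP from the other CPU/PCH and release the buffer."""
        released = self.buffer
        self.buffer = None
        return released


pme = PMReqEngine()
first_accepted = pme.try_submit("PMReq A")    # claims the buffer
second_accepted = pme.try_submit("PMReq B")   # rejected: CMP for A not yet received
completed = pme.on_completion()               # CMP for A releases the buffer
retry_accepted = pme.try_submit("PMReq B")    # now succeeds
```

In this sketch, the second request cannot proceed until the completion for the first is received, which is why a requester that waits in a blocking manner is stalled for the full round trip of the PMReq/CMP exchange.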
Thus, there is a continuing need for a solution that overcomes the shortcomings of the prior art.