Embodiments of the present invention relate to a computer system having a multi-node computer architecture. In particular, the present invention relates to a method and apparatus for managing the sending of inter-node messages in a multi-node architecture.
Computer systems may contain multiple processors that may work together to perform a task. For example, a computer system may contain four processors that may share system resources (e.g., input devices or memory devices) and may perform parallel processing. The processors may send messages to each other and may send messages to, and receive messages from, the system resources (e.g., a memory or output device). For example, such messages may include requests for information that is stored at a memory location in a memory device.
In some systems, the messages are sent from a processor to another component of the system over an interconnect. For various reasons, a message that is sent by a processor may not be received by the destination component. For example, the destination component may not have the capacity to accept the message at that time. Where the sending of a message fails, the processor that sent the message may receive a failure message or may simply not receive an acknowledgment message within a defined waiting period. The message may be in a starvation situation if the sending of the message continues to fail indefinitely due to the conditions of the system (e.g., the amount of other traffic being directed at the receiving component). In some systems, the processor that is sending the message may alleviate such a starvation condition by taking control of the interconnect until the message is successfully sent.
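By way of illustration only, the retry and starvation-relief behavior described above may be sketched as follows. The sketch is a simplified model under stated assumptions, not a description of any particular system: the names Interconnect, try_send, send_with_starvation_relief, and max_retries are hypothetical, the destination's capacity is modeled as a simple slot count, and starvation is assumed to be detected by a fixed number of consecutive failed attempts, which the description above does not specify.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Interconnect:
    """Toy model (hypothetical): the destination accepts a message only
    when other traffic leaves it at least one free slot."""
    capacity: int                    # slots available at the destination
    background_load: int             # slots consumed by other traffic
    locked_by: Optional[str] = None  # sender holding exclusive control, if any

    def try_send(self, sender: str, message: str) -> bool:
        # While a sender holds exclusive control, other traffic is held off,
        # so that sender sees the destination's full capacity.
        load = 0 if self.locked_by == sender else self.background_load
        return load < self.capacity


def send_with_starvation_relief(
    link: Interconnect, sender: str, message: str, max_retries: int = 3
) -> Tuple[bool, int, bool]:
    """Return (delivered, attempts, took_control).

    Assumption: max_retries consecutive failures are treated as starvation.
    """
    for attempt in range(1, max_retries + 1):
        if link.try_send(sender, message):
            return True, attempt, False
    # Starvation suspected: take control of the interconnect for one more attempt.
    link.locked_by = sender
    try:
        delivered = link.try_send(sender, message)
        return delivered, max_retries + 1, True
    finally:
        # Always release control so other senders can proceed.
        link.locked_by = None


if __name__ == "__main__":
    congested = Interconnect(capacity=2, background_load=2)
    print(send_with_starvation_relief(congested, "P0", "read 0x40"))
```

In this model, a sender facing a fully loaded destination fails each ordinary attempt and succeeds only after taking control, whereas a sender facing a lightly loaded destination succeeds on the first attempt without taking control. Holding the interconnect exclusively blocks all other traffic for the duration, which is the cost of this form of starvation relief.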