The present invention relates generally to methods and apparatuses for controlling the flow of data between two nodes (or two points) in a computer network, and more particularly to a method and apparatus for controlling the flow of data between two nodes (or two points) in a system area network.
For the purposes of this application, the term “node” will be used to describe either an origination point of a message or the termination point of a message. The term “point” will be used to refer to an intermediate point in a transmission between two nodes. The present invention covers communications between a first node and a second node, between a node and a switch that is part of a link, between a first switch and a second switch that comprise a link, and between a switch and a node.
An existing flow control protocol, known as Stop and Wait ARQ, transmits a data packet and then waits for an acknowledgment (ACK) from the termination node before transmitting the next packet. As data packets flow through the network from node to node, latency becomes a problem. Latency results from the large number of links in the fabric because each packet requires an acknowledgment of successful receipt from the receiving node before the next packet can be sent from the transmitting node. Consequently, there is an inherent delay resulting from the transit time for the acknowledgment to reach the transmitting node from the receiver.
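The latency cost described above can be illustrated with a short calculation. The following is a minimal sketch, not part of the protocol definition; it assumes an error-free link, a fixed one-way delay, and an acknowledgment whose own transmission time is negligible. The function and parameter names are hypothetical.

```python
def stop_and_wait_time(num_packets, transmit_time, one_way_delay):
    """Total time to deliver num_packets under Stop and Wait ARQ.

    Assumptions (illustrative only): no errors, fixed one-way delay,
    and negligible transmission time for the ACK itself.
    """
    # Each packet costs its own transmit time plus a full round trip,
    # because the next packet cannot start until the ACK returns.
    per_packet = transmit_time + 2 * one_way_delay
    return num_packets * per_packet


def back_to_back_time(num_packets, transmit_time, one_way_delay):
    """Total time when packets are sent without waiting for each ACK."""
    # Only the final packet's round trip is left exposed.
    return num_packets * transmit_time + 2 * one_way_delay


if __name__ == "__main__":
    print(stop_and_wait_time(10, 1.0, 5.0))
    print(back_to_back_time(10, 1.0, 5.0))
```

Under these assumptions the per-packet round trip dominates whenever the one-way delay is large relative to the transmit time, which is the inherent delay noted above.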
One solution, which is known as Go Back n ARQ, uses sequentially numbered packets, in which a sequence number is sent in the header of the frame containing the packet. In this case, several successive packets are sent up to the limit of the receive buffer, but without waiting for the return of the acknowledgment. According to this protocol, the receiving node only accepts the packets in the correct order and sends request numbers (RN) back to the transmitting node along with the flow control information, such as the state of the receive buffer. The effect of a given request number is to acknowledge all packets prior to the requested packet and to request transmission of the packet associated with the request number. The go back number n is a parameter that determines how many successive packets can be sent from the transmitter in the absence of a request for a new packet. Specifically, the transmitting node is not allowed to send packet i+n before i has been acknowledged (i.e., before i+1 has been requested). Thus, if i is the most recently received request from the receiving node, there is a window of n packets that the transmitter is allowed to send before receiving the next acknowledgment. In this protocol, if there is an error, the entire window must be resent as the receiver will only permit reception of the packets in order. Thus, even if the error lies near the end of the window, the entire window must be retransmitted. This protocol is most suitable for large-scale networks having high probabilities of error. In this protocol, the window size n is based on the size of the receive buffer. Thus, the transmitter does not send more data than the receiver can buffer. Consequently, at start up, the two nodes must transmit information to each other regarding the size of their buffers, defaulting to the smaller of the two buffers during operation.
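The window rule and the cost of an error under Go Back n ARQ can be sketched as a small simulation. This is an illustrative simplification under stated assumptions, not a complete implementation: the sender transmits a full window and only then reads the returned request number, whereas an actual sender would slide the window as each request number arrives. The function name and the `corrupted` parameter are hypothetical.

```python
def go_back_n(num_packets, n, corrupted):
    """Simulate Go Back n ARQ and return the sequence numbers as sent.

    corrupted is a set of (sequence_number, attempt) pairs that arrive
    in error, with attempt counted from 0 for each sequence number.
    Simplification: the request number (RN) is read once per window.
    """
    sent_log = []
    attempts = {}   # how many times each sequence number has been sent
    base = 0        # oldest unacknowledged packet, i.e. the last RN
    expected = 0    # receiver's next in-order sequence number
    while base < num_packets:
        # The sender may send packets base .. base+n-1, never base+n.
        for seq in range(base, min(base + n, num_packets)):
            attempt = attempts.get(seq, 0)
            attempts[seq] = attempt + 1
            sent_log.append(seq)
            arrived_ok = (seq, attempt) not in corrupted
            if arrived_ok and seq == expected:
                expected += 1   # accepted in order
            # Otherwise the receiver discards the packet: it is either
            # in error or out of order.
        base = expected         # RN acknowledges everything before it
    return sent_log


if __name__ == "__main__":
    # Packet 1 is corrupted on its first transmission only.
    print(go_back_n(6, 3, {(1, 0)}))
```

In the example run, packet 2 is received intact the first time but is discarded because it follows the corrupted packet 1, so it must be sent again. This is the retransmission overhead discussed above.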
In an architecture that permits large data packets, unnecessarily retransmitting excess packets can become a significant efficiency concern. For example, retransmitting an entire window of data packets, each on the order of 4 Gigabytes, would be relatively inefficient.
Other known flow control protocols require retransmission of only the packet received in error. This requires the receiver to maintain a buffer of the correctly received packets and to reorder them upon successful receipt of the retransmitted packet. While keeping the bandwidth requirements to a minimum, this protocol significantly complicates the receiver design as compared to that required by Go Back n ARQ.
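The additional receiver state that such a protocol requires can be sketched as follows. This is a minimal illustration, assuming unbounded sequence numbers and an unbounded holding buffer; the class and method names are hypothetical and do not correspond to any particular known protocol implementation.

```python
class ReorderingReceiver:
    """Receiver that holds correctly received packets until gaps fill."""

    def __init__(self):
        self.expected = 0    # next sequence number to deliver in order
        self.held = {}       # correctly received packets awaiting delivery
        self.delivered = []  # payloads released in sequence order

    def receive(self, seq, payload, ok=True):
        """Process one packet.

        Returns the sequence number to retransmit if the packet arrived
        in error, otherwise None.
        """
        if not ok:
            return seq                 # request only this packet again
        if seq >= self.expected:
            self.held[seq] = payload   # keep it, even if out of order
        # Release every packet that is now contiguous with the
        # already-delivered data.
        while self.expected in self.held:
            self.delivered.append(self.held.pop(self.expected))
            self.expected += 1
        return None


if __name__ == "__main__":
    rx = ReorderingReceiver()
    rx.receive(0, "a")
    print(rx.receive(1, "b", ok=False))  # packet 1 arrives in error
    rx.receive(2, "c")                   # held until packet 1 is resent
    rx.receive(1, "b")
    print(rx.delivered)
```

The holding buffer and the reordering loop are the extra complexity relative to a Go Back n receiver, which needs only the single `expected` counter.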
The present invention is therefore directed to the problem of developing a method and apparatus for controlling the flow of data between nodes in a system area network that improves the efficiency of the communication without overly complicating the processing at the receiving end.
The present invention provides a method for transmitting data packets from a first endpoint to a second endpoint, either directly or via a fabric. The method of the present invention includes the steps of transmitting the data from a first node in a plurality of packets, and transmitting the data independently of a state of a receive buffer in the second node.
The present invention also provides an apparatus for communicating data between two endpoints coupled together either directly or via a fabric. The apparatus includes a first switch disposed in a first endpoint, a second switch and a buffer. The second switch can be disposed either in the fabric or in the second endpoint. The first switch transmits the data in a plurality of packets from the first endpoint to the second switch independently of a state of a receive buffer in the second switch. The apparatus also includes a buffer located in the first endpoint, which buffer is coupled to the first switch and stores each packet until receiving either an acknowledgment that each packet was successfully received or an error indication that a received version of each packet included at least one error.
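The role of the buffer in the first endpoint can be sketched as follows. This is a hedged illustration only: the class and method names are hypothetical, and the behavior on an error indication (retransmitting the stored copy and retaining it until acknowledged) is an assumption, since the summary above states only that the packet is stored until an acknowledgment or an error indication is received.

```python
class TransmitBuffer:
    """Holds a copy of each sent packet until its fate is known."""

    def __init__(self):
        self.pending = {}  # sequence number -> payload awaiting a response

    def send(self, seq, payload):
        # The packet is sent without consulting the state of the
        # receiver's buffer; a copy is kept locally.
        self.pending[seq] = payload
        return (seq, payload)

    def on_ack(self, seq):
        # Successful receipt: the stored copy is no longer needed.
        self.pending.pop(seq, None)

    def on_error(self, seq):
        # Assumption: an error indication triggers retransmission of the
        # stored copy, which stays buffered until it is acknowledged.
        if seq not in self.pending:
            return None
        return (seq, self.pending[seq])


if __name__ == "__main__":
    tx = TransmitBuffer()
    tx.send(0, "a")
    tx.send(1, "b")
    tx.on_ack(0)
    print(tx.on_error(1))      # resend only the packet received in error
    print(sorted(tx.pending))
```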