The present invention relates generally to methods and apparatuses for controlling the flow of data between two nodes in a computer network, and more particularly to a method and apparatus for controlling the flow of data between two nodes in a system area network.
As used herein, nodes refer to endpoints of a communication path, i.e., the origination and termination of messages. For example, an origination node could be a user computer issuing a request for a data file, and a termination node could be a server, coupled to the user's computer via a public or private network, to which the request is directed. All points along the communication path are termed intermediate points, or simply points, for purposes of this application. A link includes two points coupled together.
An existing flow control protocol, known as Stop and Wait ARQ, transmits a data packet and then waits for an acknowledgment (ACK) before transmitting the next packet. As data packets flow through a network from point to point, latency becomes a problem. Latency results from the large number of links in the fabric because each link requires an acknowledgment of successful receipt from the receiving node for each data packet before the next data packet is sent from the transmitting node. Consequently, there is an inherent delay resulting from the transit time for the acknowledgment to reach the transmitting node from the receiving node.
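The cost of waiting for each acknowledgment can be illustrated with a simplified timing model. The sketch below assumes a single error-free link, a fixed per-packet transmit time, and a fixed round-trip time for the acknowledgment; the function names and parameters are hypothetical and chosen only for illustration.

```python
def stop_and_wait_time(num_packets, transmit_time, round_trip_time):
    """Total time to deliver num_packets over one link.

    Simplified model (assumption): no errors, and every packet costs its
    transmit time plus one full round trip before the next packet may start.
    """
    return num_packets * (transmit_time + round_trip_time)


def stop_and_wait_utilization(transmit_time, round_trip_time):
    """Fraction of the time the link actually carries data in this model."""
    return transmit_time / (transmit_time + round_trip_time)


if __name__ == "__main__":
    # Hypothetical numbers: 10 packets, 1 time unit to send, 4 units round trip.
    print(stop_and_wait_time(10, 1.0, 4.0))
    print(stop_and_wait_utilization(1.0, 4.0))
```

Under these assumptions the link sits idle for most of each cycle, and the idle share grows with the round-trip time of every link the packet crosses.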
One solution, which is known as Go Back n ARQ, uses sequentially numbered packets, in which a sequence number is sent in the header of the frame containing the packet. In this case, several successive packets are sent without waiting for the return of the acknowledgment. According to this protocol, the receiver only accepts the packets in the correct order and sends request numbers (RN) back to the transmitting node. The effect of a given request number is to acknowledge all packets prior to the requested packet and to request transmission of the packet associated with the request number. The go back number n is a parameter that determines how many successive packets can be sent from the transmitter in the absence of a request for a new packet. Specifically, the transmitting node is not allowed to send packet i+n before i has been acknowledged (i.e., before i+1 has been requested). Thus, if i is the most recently received request from the receiving node, there is a window of n packets that the transmitter is allowed to send before receiving the next acknowledgment. In this protocol, if there is an error, the entire window must be resent as the receiver will only permit reception of the packets in order. Thus, even if the error lies near the end of the window, the entire window must be retransmitted. This protocol is most suitable for large-scale networks having high probabilities of error.
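The in-order acceptance rule and the resulting retransmissions can be sketched with a small simulation. It is a simplified, window-at-a-time model: the sender is assumed to emit a full window and only then learn the receiver's request number, and each sequence number in `corrupt_once` is assumed to be corrupted on its first transmission only. The function name and parameters are hypothetical.

```python
def go_back_n(num_packets, n, corrupt_once):
    """Return the sequence numbers in the order they are put on the wire.

    num_packets  -- packets to deliver, numbered 0 .. num_packets - 1
    n            -- go back number (window size)
    corrupt_once -- sequence numbers corrupted on their first transmission
    """
    pending_errors = set(corrupt_once)
    sent = []
    base = 0  # oldest unacknowledged packet, i.e. the current request number
    while base < num_packets:
        expected = base  # the receiver accepts only this sequence number next
        for seq in range(base, min(base + n, num_packets)):
            sent.append(seq)
            corrupted = seq in pending_errors
            pending_errors.discard(seq)  # the error affects one transmission only
            if not corrupted and seq == expected:
                expected += 1  # accepted in order
            # otherwise the packet is discarded by the receiver
        base = expected  # request number returned to the sender
    return sent


if __name__ == "__main__":
    # Packet 2 is corrupted once; packet 3 arrives intact but out of order,
    # so it is discarded and must be sent again.
    print(go_back_n(6, 4, {2}))
```

In this model, packet 3 is transmitted twice even though its first copy arrived without error, which is the inefficiency the protocol accepts in exchange for a receiver that needs no reordering buffer.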
In an architecture that permits large data packets, unnecessarily retransmitting excess packets can become a significant efficiency concern. For example, retransmitting an entire window of data packets, each on the order of 4 Gigabytes, would be relatively inefficient.
Other known flow control protocols require retransmission of only the packet received in error. This requires the receiver to maintain a buffer of the correctly received packets and to reorder them upon successful receipt of the retransmitted packet. While keeping the bandwidth requirements to a minimum, this protocol significantly complicates the receiver design as compared to that required by Go Back n ARQ.
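The additional receiver state that such a protocol requires can be sketched as follows. This is a minimal illustration, assuming packets carry sequence numbers starting at 0 and that the receiver can tell whether an arrival is error free; the class and method names are hypothetical.

```python
class SelectiveRepeatReceiver:
    """Receiver that buffers out-of-order packets and reorders them."""

    def __init__(self):
        self.next_expected = 0   # next sequence number to deliver in order
        self.out_of_order = {}   # seq -> payload, held until the gap is filled
        self.delivered = []      # payloads released in sequence order

    def receive(self, seq, payload, ok=True):
        """Process one arrival.

        Returns the sequence number to retransmit if the packet was in
        error, otherwise None.
        """
        if not ok:
            return seq  # request retransmission of this packet only
        if seq >= self.next_expected:
            self.out_of_order[seq] = payload
        # Release every packet that is now contiguous with what was delivered.
        while self.next_expected in self.out_of_order:
            self.delivered.append(self.out_of_order.pop(self.next_expected))
            self.next_expected += 1
        return None


if __name__ == "__main__":
    r = SelectiveRepeatReceiver()
    r.receive(0, "a")
    print(r.receive(1, "b", ok=False))  # packet 1 must be resent
    r.receive(2, "c")
    r.receive(3, "d")
    print(r.delivered)                  # 2 and 3 are held back
    r.receive(1, "b")
    print(r.delivered)                  # gap filled, all four released
```

The buffer and the release loop are the extra complexity referred to above; a Go Back n receiver needs only the single `next_expected` counter.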
The present invention is therefore directed to the problem of developing a method and apparatus for controlling the flow of data between nodes in a system area network that improves the efficiency of the communication without overly complicating the processing at the receiving end.
The present invention provides a method for transmitting data between switches in a fabric having a plurality of links. The method includes the steps of transmitting the data in a plurality of packets from point to point, and retaining each packet in a buffer at a transmitting point until receiving either an acknowledgment indicating that each packet was successfully received at the next point or an error indication that a received version of each packet included at least one error, while simultaneously transmitting additional packets to the next point. The method further includes the step of indicating successful receipt of all packets between a last acknowledged packet and a particular packet by sending a single acknowledgment.
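The retention step and the single acknowledgment covering several packets can be sketched from the transmitting point's side. The sketch is illustrative only: the class and method names are hypothetical, sequence numbers are assumed to start at 0, and the handling of an error indication (handing the retained copy back for retransmission while keeping it buffered) is an assumption, since the summary does not specify it.

```python
from collections import OrderedDict


class TransmittingPoint:
    """Keeps a copy of every packet sent until it is acknowledged."""

    def __init__(self):
        self.retained = OrderedDict()  # seq -> packet awaiting ACK or error
        self.next_seq = 0

    def send(self, packet):
        """Buffer the packet and return the sequence number assigned to it.

        Sending does not block on earlier acknowledgments in this sketch.
        """
        seq = self.next_seq
        self.retained[seq] = packet
        self.next_seq += 1
        return seq

    def on_ack(self, acked_seq):
        """A single ACK releases every retained packet up to acked_seq."""
        released = [s for s in self.retained if s <= acked_seq]
        for s in released:
            del self.retained[s]
        return released

    def on_error(self, seq):
        """Assumption: return the retained copy so it can be sent again."""
        return self.retained[seq]


if __name__ == "__main__":
    t = TransmittingPoint()
    for p in "abcd":
        t.send(p)
    print(t.on_ack(2))       # one ACK covers packets 0, 1 and 2
    print(list(t.retained))  # only packet 3 is still buffered
```

Because one acknowledgment frees several buffer entries, the next point does not have to return an acknowledgment for every packet it receives.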
According to one exemplary embodiment of the present invention, an apparatus is provided for communicating data between two points in a fabric including a plurality of links. The apparatus includes a first switch that is disposed in a first point of a link. The first switch transmits the data in a plurality of packets from the first point in the link to a second point in the link. The apparatus also includes a buffer disposed in the first point, which buffer is coupled to the first switch. The buffer stores each packet until receiving either an acknowledgment that each packet was successfully received or an error indication that a received version of each packet included at least one error. The apparatus further includes a second switch that is disposed in the second point. The second switch receives each of the transmitted data packets, and upon receipt of an error free packet, the second switch sends an acknowledgment to indicate successful receipt of the error free packet and all packets in sequence between a last acknowledged packet and the error free packet.
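The behavior of the second switch can be sketched in the same illustrative style. Several details below are assumptions not stated in the summary: when the acknowledgment is emitted (here, whenever `flush_ack` is called), the message format (tuples tagged "ACK" and "ERR"), and the treatment of out-of-sequence arrivals (ignored). All names are hypothetical.

```python
class ReceivingPoint:
    """Acknowledges a run of in-sequence, error-free packets with one ACK."""

    def __init__(self):
        self.last_acked = -1  # highest sequence number already acknowledged
        self.last_good = -1   # highest in-sequence, error-free packet seen

    def receive(self, seq, ok):
        """Process one arrival; return an error indication or None."""
        if not ok:
            return ("ERR", seq)
        if seq == self.last_good + 1:
            self.last_good = seq
        # Out-of-sequence arrivals are ignored in this sketch (assumption).
        return None

    def flush_ack(self):
        """Emit one ACK covering everything received since the last ACK."""
        if self.last_good > self.last_acked:
            self.last_acked = self.last_good
            return ("ACK", self.last_good)
        return None


if __name__ == "__main__":
    r = ReceivingPoint()
    for s in range(3):
        r.receive(s, ok=True)
    print(r.flush_ack())            # a single ACK for packets 0 through 2
    print(r.receive(3, ok=False))   # error indication for packet 3
    r.receive(3, ok=True)           # retransmitted copy arrives intact
    print(r.flush_ack())
```

The receiving point keeps only two counters and no reordering buffer, which is consistent with the stated aim of not overly complicating the processing at the receiving end.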