In a packet-switched network, information is communicated in units (commonly called packets or frames) that are self-contained with respect to the delivery process. In other words, each unit carries sufficient information for it to be delivered to the intended recipient or recipients. Thus, each packet carries a destination address, which is a necessary ingredient for delivery, and possibly a source address as well.
A packet-switched network can be broadly said to include end-station nodes linked to intermediate nodes (switches). Each switch receives packets from neighboring nodes, which are connected to the switch “ports,” and “switches” each packet out on a port selected according to the destination address of the packet. This is usually a different port, but may be the same port on which the packet arrived in the case of a “hair-pin” turn. The “hair-pin” turn is used in conjunction with virtualization, for communication between virtual interfaces (or guest operating systems) that reside on the same physical server or that use the same physical NIC.
A considerable body of work has been associated with switch architecture, leading to higher performance and improved switch design. Typically, high performance switches today implement non-blocking architectures in that, internally to the switch, a blocked port does not affect traffic going to other ports. A commonly used non-blocking architecture is virtual output queuing, where at each ingress port, incoming packets are placed in separate queues according to the egress port onto which the packets are to be transmitted. The non-blocking behavior is achieved by preventing packets going to congested ports from entering the actual switching fabric.
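The per-egress queuing just described (an arrangement often called virtual output queuing) can be sketched as follows. This is a minimal illustration only, not the design of any particular switch: the class name `IngressPort`, its methods, and the representation of congestion as a set of port numbers are all hypothetical.

```python
from collections import deque


class IngressPort:
    """Hypothetical model of one ingress port with a queue per egress port."""

    def __init__(self, num_egress_ports):
        # One FIFO per egress port, so packets bound for different
        # egress ports never wait behind one another.
        self.queues = [deque() for _ in range(num_egress_ports)]

    def enqueue(self, packet, egress_port):
        """Place an arriving packet in the queue for its egress port."""
        self.queues[egress_port].append(packet)

    def next_for_fabric(self, congested_ports):
        """Return (egress_port, packet) for the first queue that holds data
        and whose egress port is not congested, or None if there is none.
        Packets for congested ports stay queued and never enter the fabric."""
        for egress_port, queue in enumerate(self.queues):
            if queue and egress_port not in congested_ports:
                return egress_port, queue.popleft()
        return None
```

In this sketch, a packet queued for a congested egress port is simply skipped, so a packet for any other egress port can still be handed to the fabric.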
Packet switched networks are commonly prone to congestion, typically because they lack admission control: a network node does not usually need to ascertain that there are sufficient resources available in the network before transmitting a packet. In contrast, networks where resources are reserved along the path between communicating endpoints do not suffer from this problem. (These are typically called “circuit switched” or “virtual circuit switched” networks.) However, this is achieved at the cost of circuit setup time and of resources that remain unused when a circuit is not busy.
For this reason, a principal advantage of packet switched networks compared to circuit switched networks is that they tend to achieve higher utilization efficiency. This is particularly the case for data networks, where traffic load tends to be highly variable over time (bursty).
Network congestion has known detrimental effects on packet-switched networks. In general, network nodes tend to drop packets when they experience congestion. Dropped packets in turn can lead to service quality degradation as perceived at the network application level. In cases where reliable transfer is desired, dropped packets result in wasted resources, increased delays and decreased performance as the dropped data needs to be retransmitted from the source.
Hop-by-hop flow control is known. Back-pressure signals are exchanged between neighboring nodes on a link to suspend or resume transmission on the link. Hop-by-hop back-pressure can alleviate the effects of congestion by spreading the packet buffering requirements over multiple nodes. However, it does not extend the non-blocking property of a switch to the paused neighbors and, therefore, may result in congestion in one node spreading through the network.
An IEEE standard for flow control in Ethernet networks specifies the format of a packet (frame) which can be transmitted by an Ethernet node to request that transmission of packets be suspended (paused) at neighboring nodes (hop-by-hop) for a specified period of time. This PAUSE frame can also be used to resume transmission when desired. Recent work in the IEEE on improving the discrimination capabilities of PAUSE has focused on distinguishing among a limited number of traffic types or “classes”. That work does not address the problem of congestion propagation within a given traffic class.
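As an illustration of such a frame, the following sketch builds a PAUSE frame, assuming the standard referred to above is IEEE 802.3x (MAC Control, now Annex 31B of IEEE 802.3). The field values (reserved multicast destination 01-80-C2-00-00-01, EtherType 0x8808, opcode 0x0001, and a 16-bit pause time counted in quanta of 512 bit times) are taken from that standard. The function names and the example source address are hypothetical, and the frame check sequence is assumed to be appended by the transmitting hardware.

```python
import struct

PAUSE_DEST_MAC = bytes.fromhex("0180c2000001")  # reserved multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001
MIN_FRAME_LEN_NO_FCS = 60   # minimum Ethernet frame size, excluding the FCS
BIT_TIMES_PER_QUANTUM = 512


def build_pause_frame(src_mac, quanta):
    """Build a PAUSE frame (without FCS). quanta == 0 requests resumption."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("quanta must fit in 16 bits")
    fields = struct.pack("!HHH", MAC_CONTROL_ETHERTYPE, PAUSE_OPCODE, quanta)
    frame = PAUSE_DEST_MAC + src_mac + fields
    # Pad with zeros up to the minimum frame length.
    return frame.ljust(MIN_FRAME_LEN_NO_FCS, b"\x00")


def pause_seconds(quanta, link_bits_per_second):
    """Duration of the requested pause on a link of the given speed."""
    return quanta * BIT_TIMES_PER_QUANTUM / link_bits_per_second
```

For instance, `build_pause_frame(bytes.fromhex("020000000001"), 0xFFFF)` requests the longest possible pause, and the same call with `quanta` set to 0 asks the neighbor to resume transmission. Because the pause time is expressed in bit times, the same field value corresponds to a shorter wall-clock pause on a faster link.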
FIG. 1 is a block diagram illustrating a simple example of such a network. Referring to FIG. 1, Switch1 is shown as receiving a pause indication 108 from another switch (not shown). A corresponding pause indication 106 is provided from Switch1 to Switch0. Finally, a further corresponding pause indication 104 is provided from Switch0 to NIC 102. The NIC 102 is, for example, a network interface controller configured to offload TCP/IP processing from a host. The pause indication 104 instructs the NIC 102 to suspend transmission of packets to Port0 of Switch0 for a specified period of time.
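The way a pause indication works its way back toward the source can be sketched with a toy simulation. This is a simplified model under stated assumptions, not a description of any actual switch: the source offers one packet per tick, each switch forwards at most one packet per tick, a switch pauses its upstream neighbor as soon as its buffer occupancy reaches a limit, and the switch farthest downstream is paused from the start. The function name and the buffer limits are hypothetical.

```python
def simulate_pause_propagation(names, limits, max_ticks=10_000):
    """Return {switch name: tick at which it paused its upstream neighbor}.

    names and limits are ordered from the switch nearest the source to the
    switch farthest downstream. The last switch is paused throughout.
    """
    n = len(names)
    occupancy = [0] * n
    paused = [False] * n      # paused[i]: switch i may not send downstream
    paused[-1] = True         # pause indication received from further downstream
    source_paused = False
    pause_tick = {}
    tick = 0
    while not source_paused and tick < max_ticks:
        tick += 1
        # Forward one packet per unpaused switch, downstream switches first,
        # so that a packet advances at most one hop per tick.
        for i in reversed(range(n)):
            if not paused[i] and occupancy[i] > 0:
                occupancy[i] -= 1
                occupancy[i + 1] += 1
        occupancy[0] += 1     # the source sends one packet per tick
        # A switch whose buffer has reached its limit pauses its upstream neighbor.
        for i in range(n):
            if occupancy[i] >= limits[i] and names[i] not in pause_tick:
                pause_tick[names[i]] = tick
                if i == 0:
                    source_paused = True
                else:
                    paused[i - 1] = True
    return pause_tick
```

Calling `simulate_pause_propagation(["Switch0", "Switch1"], [2, 3])` mirrors the chain of FIG. 1: Switch1 fills first and pauses Switch0, after which Switch0 fills and pauses the source. Larger buffers delay the point at which the source is paused but, in this model, do not prevent it.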