Some current data centers and private enterprise networks run server virtualization software on compute nodes. These compute nodes generate large amounts of network traffic that includes traffic originating from the virtual machines, as well as infrastructure traffic. Infrastructure traffic is traffic that originates from the host machine layer rather than a particular virtual machine implemented on the host machine.
Currently, some networks send traffic as individual packets of data. A data item larger than an individual packet is broken down into multiple packets, and each packet is then sent over a network to a destination system (e.g., a computer or virtual machine). When the packets reach their destination, the data in the packets is reassembled to recreate the original data item. In current systems, a packet is not guaranteed to reach its destination. Therefore, for each packet successfully received, the destination system sends an acknowledgement message back to the source address of the packet. The acknowledgement message alerts the original sender that the packet has been received. When a source system sends a packet that is lost in transmission (e.g., the packet is sent to a malfunctioning or busy intermediate system), the destination system does not send an acknowledgement message for that packet. The sending system is set up under the assumption that an unacknowledged packet was lost in transmission. Accordingly, when a threshold amount of time passes after a packet is sent without the sending system receiving an acknowledgement message, the sending system re-sends the packet. In some network systems the threshold time is based on the round trip time between the sending and receiving systems. That is, in some cases the allowable threshold is the time for the packet to travel from the source system to the destination system, plus the time for the acknowledgement message to be generated and travel back to the source system, plus some buffer time to account for reasonable delays.
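By way of illustration only, the acknowledgement and re-send behavior described above might be sketched as follows. The class name `Sender`, its fields, and the numeric values in the usage note are hypothetical and are not drawn from any particular networking stack; the round trip time is assumed to have been measured elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class Sender:
    """Hypothetical sending system that tracks unacknowledged packets."""

    round_trip_time: float  # seconds; assumed to be measured elsewhere
    buffer_time: float      # extra allowance for reasonable delays (assumed value)
    unacked: dict = field(default_factory=dict)  # sequence number -> time sent

    @property
    def threshold(self) -> float:
        # Threshold = round trip time plus some buffer time.
        return self.round_trip_time + self.buffer_time

    def send(self, seq: int, now: float) -> None:
        # Record when the packet was sent (or re-sent).
        self.unacked[seq] = now

    def acknowledge(self, seq: int) -> None:
        # An acknowledgement message arrived; stop tracking the packet.
        self.unacked.pop(seq, None)

    def due_for_resend(self, now: float) -> list:
        # Packets whose acknowledgement has not arrived within the threshold
        # are assumed to have been lost in transmission.
        return sorted(
            seq for seq, sent in self.unacked.items()
            if now - sent > self.threshold
        )
```

For example, with an assumed round trip time of 0.2 seconds and a buffer of 0.05 seconds, a packet sent at time 0.0 and never acknowledged would be reported by `due_for_resend` once more than 0.25 seconds have elapsed.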
When a source system and destination system are geographically distant, the round trip time could be hundreds or thousands of milliseconds. Such a round trip time is great enough that it would be very inefficient to send one packet and then wait for acknowledgement of that packet before sending the next packet. Accordingly, many packets are sent while waiting for the acknowledgement message for the first packet to arrive. The sending of many packets while waiting for an acknowledgement message to arrive causes problems when part of the transmission path between the systems is congested. Various intermediate systems along the path between the source and destination systems have a limited memory capacity and serve as part of the path for multiple source and destination systems. When the memory capacity of an intermediate system is full or too close to full, the intermediate system will start to drop packets or refuse new packets, in some cases causing other intermediate systems to drop packets. In some cases an intermediate system refusing packets causes a great enough delay that a source system re-sends the packets. The re-sent packets can further increase congestion, making the original problem worse.
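As a rough illustration of the two effects described above, the following sketch estimates how many packets a source system sends before the first acknowledgement message can arrive, and models an intermediate system with limited memory capacity that refuses packets when full. The names `packets_sent_before_first_ack` and `IntermediateSystem`, and the figures used in the usage note, are hypothetical and chosen only for illustration.

```python
from collections import deque


def packets_sent_before_first_ack(send_rate_pps: float, round_trip_s: float) -> int:
    # Packets sent during one round trip, assuming a constant sending rate.
    return int(send_rate_pps * round_trip_s)


class IntermediateSystem:
    """Hypothetical intermediate system with a fixed-size packet buffer."""

    def __init__(self, capacity: int):
        self.capacity = capacity  # maximum number of packets held in memory
        self.queue = deque()
        self.dropped = 0

    def receive(self, packet) -> bool:
        # Refuse (drop) the packet when the memory capacity is exhausted.
        if len(self.queue) >= self.capacity:
            self.dropped += 1
            return False
        self.queue.append(packet)
        return True
```

Under these assumptions, a source sending 1000 packets per second over a path with a 0.5 second round trip time sends 500 packets before the first acknowledgement message can arrive, and an intermediate system with room for three packets refuses the fourth and fifth packets of a burst of five.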
In some networking systems, when a threshold number of acknowledgement messages are missed within a particular amount of time, the source system determines that there is congestion on the path the packets are taking. The source system then slows down the rate of packet transmission in order to allow the congestion to clear. However, when the round trip time (for packet and acknowledgement message) is long, many packets can be sent out before the source system recognizes that congestion is an issue. This causes inefficient retransmission of packets that will be stopped by congestion and/or make the congestion worse. The inefficiency is particularly great when the point of congestion is within the same host machine as the source system (e.g., the congestion is at one or more layers of a set of network transmission layers on a kernel of the host machine) and the destination machine is far away. That is, the traffic congestion is at the beginning of the transmission path, but the round trip time is long and therefore the threshold for determining that packets are being lost is correspondingly long. In such cases, it takes a long time to identify that there is congestion, and many packets are sent at a high rate during that time.
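The congestion determination described above might be sketched, under stated assumptions, as a count of missed acknowledgement messages within a sliding time window. The class name `CongestionDetector`, its parameters, and the halving of the sending rate are assumptions made for illustration; the description above does not specify by how much the source system slows down.

```python
from collections import deque


class CongestionDetector:
    """Hypothetical source-side detector based on missed acknowledgements."""

    def __init__(self, miss_threshold: int, window_s: float,
                 rate_pps: float, slowdown: float = 0.5):
        self.miss_threshold = miss_threshold  # misses needed to infer congestion
        self.window_s = window_s              # the "particular amount of time"
        self.rate_pps = rate_pps              # current sending rate
        self.slowdown = slowdown              # assumed reduction factor
        self.misses = deque()                 # times at which misses were noticed

    def record_missed_ack(self, now: float) -> bool:
        """Record a missed acknowledgement; return True if the rate was reduced."""
        self.misses.append(now)
        # Forget misses that fall outside the time window.
        while self.misses and now - self.misses[0] > self.window_s:
            self.misses.popleft()
        if len(self.misses) >= self.miss_threshold:
            # Congestion is inferred: slow down so the congestion can clear.
            self.rate_pps *= self.slowdown
            self.misses.clear()
            return True
        return False
```

Note that a miss can only be recorded after the re-send threshold for a packet has expired. With a long round trip time, the source system therefore keeps sending at the original rate for roughly that threshold before the first miss is even counted, which is the delay in recognizing congestion described above.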