1. Field of the Invention
This invention relates to computer systems and, more particularly, to the management of network traffic comprising large data transfers between computer systems.
2. Description of the Related Art
In today's enterprise environments, more and more applications rely on bulk data transfers to accomplish their functionality. Applications that require large amounts of data to be transferred between one network endpoint and another for a single application-level job or task may include, for example, storage management applications of various types, such as backup and restore applications, disaster recovery applications and the like, media server applications that may be required to transmit movie and audio files, telephony (e.g., voice over IP) and other telecommunications applications, scientific analysis and simulation applications, geographically distributed software development projects, and so on. The amount of data that has to be transferred for a given job or task varies with the specific applications and use cases, but can easily reach several tens of megabytes or even gigabytes in some cases. Furthermore, the data often has to be transferred over large distances: e.g., a disaster recovery application may be configured to replicate data from a primary data center in the United States to another data center in Europe or Asia.
As the emphasis on interoperability, vendor-independence and the use of standards-based technologies for IT infrastructures has increased, most of these bulk data transfers are performed over networks that employ long-established network protocols such as TCP/IP (Transmission Control Protocol/Internet Protocol). The most commonly used networking protocols in both public networks and private networks today, including TCP/IP, were developed before large bulk data transfers became as common as they are today. As a result, these protocols, at least in default configurations, are typically not optimized for large bulk data transfers, and in fact exhibit some behaviors that may be strongly detrimental to bulk data transfer performance.
TCP/IP, for example, implements a variety of built-in responses to apparent or actual error conditions that can negatively impact bulk data transfers. When one or more packets of a given data transfer are lost or “dropped”, which may occur for a variety of reasons, TCP/IP automatically adjusts one or more parameters such as transmit window size, retransmission timeout values, etc. to reduce the likelihood of cascading errors. In addition, a packet loss or an out-of-order delivery of a packet may lead to an automatic retransmission of all the packets that were sent after the lost or out-of-order packet. A substantial amount of processing and/or bandwidth may have to be used just for retransmissions caused by an occasional packet loss or out-of-order delivery, even though some of the original packets sent after the lost packet may have already been received successfully at the destination. Although some techniques (such as Selective Acknowledgment (SACK) mechanisms) have been proposed to reduce the impact of unnecessary retransmissions in TCP/IP, in practice these techniques have had limited success, especially in environments where a sender may be capable of injecting packets into the network fairly rapidly, relative to the time taken by the packets to traverse the network to the intended destination (thus leading to a large number of in-flight packets on a given connection). Packet loss may occur for a number of reasons: e.g., due to temporary micro-congestion at a network device such as a switch caused by near-simultaneous reception of a large number of data packets, as a result of suboptimal routing decisions, as a result of misconfiguration of network equipment (such as Ethernet duplex mismatch), or as a result of faulty wiring or equipment on one or more network paths. In response to the packet loss, the transmit window size may be reduced automatically by the protocol, and retransmission timeouts may be increased.
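By way of illustration only, the retransmission overhead described above may be approximated with a short Python sketch. The function names, the zero-based packet position, and the example link parameters (a 1 Gbit/s path, a 100 ms round-trip time, and 1500-byte packets) are assumptions chosen for this example and do not correspond to any particular TCP/IP implementation.

```python
def in_flight_packets(bandwidth_bps: int, rtt_ms: int, packet_bytes: int) -> int:
    """Approximate packets in flight when the sender keeps the path full.

    This is the bandwidth-delay product divided by the packet size,
    computed with integer arithmetic.
    """
    if bandwidth_bps <= 0 or rtt_ms <= 0 or packet_bytes <= 0:
        raise ValueError("all parameters must be positive")
    return (bandwidth_bps * rtt_ms) // (8 * packet_bytes * 1000)


def packets_resent_cumulative(in_flight: int, lost_position: int) -> int:
    """Packets resent when the lost packet and every packet sent after it
    are retransmitted. lost_position is zero-based (hypothetical convention).
    """
    if not 0 <= lost_position < in_flight:
        raise ValueError("lost_position must identify an in-flight packet")
    return in_flight - lost_position


def packets_resent_selective(lost_count: int) -> int:
    """Packets resent when only the packets actually lost are retransmitted."""
    if lost_count < 0:
        raise ValueError("lost_count must be non-negative")
    return lost_count


if __name__ == "__main__":
    # Assumed example path: 1 Gbit/s, 100 ms round trip, 1500-byte packets.
    n = in_flight_packets(1_000_000_000, 100, 1500)
    print("in flight:", n)
    print("resent (cumulative):", packets_resent_cumulative(n, 10))
    print("resent (selective):", packets_resent_selective(1))
```

Under these assumed parameters roughly 8333 packets are in flight, so a single loss near the start of the window may cause 8323 packets to be resent under the cumulative model, compared with one packet if only the lost packet were resent.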
In some cases, especially in the event of a number of consecutive packet losses or multiple packet losses that are closely spaced in time, the parameters may be modified to such an extent (e.g., a transmit window size may be reduced to such a small value) that the data transfer may in effect be stalled, with very little data actually being transmitted. Substantial reductions in throughput may occur even if the packet losses were transient, i.e., even if the network recovers fairly rapidly from the events or conditions that led to the packet loss. In many of these cases, even after the conditions that led to the packet loss no longer hold, it takes the networking protocol a substantial amount of time to recover and adjust parameters such as window sizes to values that are appropriate for bulk data transfers. During these recovery or “self-healing” periods, bulk data transfers are often effectively blocked, which can result in timeouts or other apparent errors in application-level protocols (such as backup or replication protocols), potentially requiring large application jobs to be abandoned and restarted.
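The stall and slow recovery described above may be illustrated with a deliberately simplified Python model. The model assumes that the transmit window is halved on each loss, that the retransmission timeout is doubled on each loss up to a cap, and that the window grows by one segment per round trip during recovery. These rules, the function names, and all numeric values are assumptions made for this sketch; actual TCP/IP implementations use more elaborate mechanisms that are not modeled here.

```python
def window_after_losses(window: int, losses: int, floor: int = 1) -> int:
    """Transmit window (in segments) after repeated halving, one halving per loss."""
    if window < floor or losses < 0:
        raise ValueError("window must be >= floor and losses must be >= 0")
    for _ in range(losses):
        window = max(floor, window // 2)
    return window


def backed_off_timeout_ms(initial_ms: int, losses: int, cap_ms: int = 60_000) -> int:
    """Retransmission timeout after doubling once per loss, limited to cap_ms."""
    if initial_ms <= 0 or losses < 0:
        raise ValueError("initial_ms must be positive and losses must be >= 0")
    return min(cap_ms, initial_ms * 2 ** losses)


def round_trips_to_recover(current: int, target: int, increment: int = 1) -> int:
    """Round trips needed to grow the window from current back to target,
    adding `increment` segments per round trip."""
    if increment <= 0:
        raise ValueError("increment must be positive")
    deficit = max(0, target - current)
    return -(-deficit // increment)  # ceiling division


if __name__ == "__main__":
    original = 1024  # assumed window, in segments, before the losses
    shrunk = window_after_losses(original, losses=8)
    trips = round_trips_to_recover(shrunk, original)
    print("window after 8 losses:", shrunk)
    print("timeout after 8 losses (ms):", backed_off_timeout_ms(200, 8))
    print("round trips to recover:", trips)
    print("recovery time at 100 ms per round trip (s):", trips * 100 / 1000)
```

In this model, eight closely spaced losses reduce a 1024-segment window to 4 segments, and regaining the original window takes 1020 round trips, or about 102 seconds on a path with a 100 ms round-trip time, even if the condition that caused the losses has already cleared.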
A number of different approaches to tuning network traffic have been considered. Some such schemes either require changes to standard network software stacks or require custom hardware; however, such schemes are difficult to implement in environments that rely on standards-based and vendor-independent communication technologies. Techniques that require substantial changes to legacy applications or third-party applications are also unlikely to be deployed in most enterprise environments.