1. Field of the Invention
This invention relates to computer systems and, more particularly, to the management of network traffic between computer systems.
2. Description of the Related Art
In today's enterprise environments, a growing number of applications rely on bulk data transfers to accomplish their functionality. Applications that require large amounts of data to be transferred between one network endpoint and another for a single application-level job or task may include, for example, storage management applications of various types (such as backup and restore applications, disaster recovery applications, etc.), media server applications that may be required to transmit movie and audio files, telephony (e.g., voice over IP) and other telecommunications applications, scientific analysis and simulation applications, geographically distributed software development projects, and so on. The amount of data that has to be transferred for a given job or task varies with the specific application and use case, but can easily reach several tens of megabytes, or even gigabytes in some cases. Furthermore, the data often has to be transferred over large distances: e.g., a disaster recovery application may be configured to replicate data from a primary data center in the United States to another data center in Europe or Asia.
The data for these bulk data transfers is often transferred at least in part over public networks such as the Internet, especially as budgetary pressures have reduced the ability of many enterprises to deploy proprietary high-speed networks and/or non-standard protocols. As a result, the network paths taken by the data packets corresponding to a given bulk data transfer often include a variety of network devices (e.g., switches, routers, etc.) and/or links over which the sending endpoint does not have control. In fact, the sending endpoint typically relies on a standard communication protocol (such as the Transmission Control Protocol/Internet Protocol (TCP/IP)) for routing packets and has very limited knowledge of the specific paths taken by the data packets on their way to the destination endpoint. The sending application typically hands over data as fast as possible to the networking software stack at the sending endpoint, and the networking software stack in turn typically transfers the data as fast as possible to a hardware device (e.g., an Ethernet network interface card or NIC) connected to the network. The data packet sizes supported by the network hardware device are often relatively small compared to bulk data transfer sizes: e.g., in a typical implementation where the size of an Ethernet packet is limited to about 1500 bytes, a 64-kilobyte chunk of application data may be transferred as a burst of more than 40 very closely spaced packets. The burst of packets may be followed by a gap (e.g., representing time needed by the application to transfer an additional chunk of data to the networking stack and/or to receive an application-level acknowledgment from the receiving endpoint), followed by another burst, and so on, until the bulk data transfer is eventually complete. As a result, the sending device for a given bulk data transfer often transmits data in a bursty pattern.
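The packet count in the example above follows from simple arithmetic, which may be sketched as follows. This is only an illustration: the function name packets_per_chunk is hypothetical, and the 1460-byte figure assumes that 40 bytes of each 1500-byte packet are consumed by IPv4 and TCP headers without options, an assumption that is not stated in the description above.

```python
def packets_per_chunk(chunk_bytes: int, payload_bytes: int) -> int:
    """Number of packets needed to carry chunk_bytes of application data
    when each packet carries at most payload_bytes (ceiling division)."""
    if chunk_bytes < 0 or payload_bytes <= 0:
        raise ValueError("chunk_bytes must be >= 0 and payload_bytes > 0")
    return -(-chunk_bytes // payload_bytes)


CHUNK_BYTES = 64 * 1024  # the 64-kilobyte chunk from the example above

# Treating the full 1500 bytes of each packet as payload:
print(packets_per_chunk(CHUNK_BYTES, 1500))  # 44

# Assumed: 20-byte IPv4 header + 20-byte TCP header leaves 1460 bytes:
print(packets_per_chunk(CHUNK_BYTES, 1460))  # 45
```

Under either assumption the chunk is emitted as a burst of somewhat more than 40 packets, which is consistent with the rough figure given above.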
Due to the decentralized nature of most common network protocols, a given network device such as a router or switch is often not aware of the application to which a particular data packet received by the device belongs. The network device does not distinguish between the packets of different data transfers: it simply sees sequences of incoming packets on each of its input ports, determines the next link on each packet's route by examining the packet header, and sends the packet on over the next link using an appropriate output port. Often, network devices may have limited input and/or output buffer space: e.g., in some implementations, each port's input buffer on a network switch may be limited to buffering 40 packets at a time, and each port's output buffer may be limited to 70 packets. Because the network devices can participate in multiple concurrent data transfers, the bursty nature of the packet streams emitted by sending endpoints can sometimes temporarily overwhelm the resources of a network device, resulting in packets being dropped or lost. For example, if two bursts of more than twenty data packets each happen to arrive at nearly the same time on the same port of a particular switch that can buffer at most forty packets in a given input buffer, some of the packets may be dropped. Similarly, if more than seventy packets need to be buffered for output from a given port whose output buffer capacity is limited to seventy packets at a time, some of the outgoing packets may be dropped. Such micro-congestion, even though it may only be a local and transient phenomenon, and even though the network as a whole may have a relatively low level of utilization, can have potentially far-reaching effects on the bulk data transfers, since networking protocols such as TCP/IP react to data packet loss by automatically throttling the data transfer, adjusting parameters such as window sizes and the like.
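The buffer overflow scenario described above may be illustrated with a minimal sketch. The function name enqueue_bursts is hypothetical, and the model makes two simplifying assumptions that are not taken from the description: the bursts arrive back to back, so that no packets drain from the buffer between arrivals, and the device simply discards any packet that arrives when the buffer is full (a tail-drop policy). Actual devices may behave differently.

```python
def enqueue_bursts(burst_sizes, buffer_capacity):
    """Model a fixed-size port buffer receiving back-to-back bursts.

    Assumes no packets leave the buffer between arrivals and that
    packets arriving at a full buffer are discarded (tail drop).
    Returns a (queued, dropped) tuple of packet counts.
    """
    queued = 0
    dropped = 0
    for size in burst_sizes:
        free = buffer_capacity - queued
        accepted = min(size, free)
        queued += accepted
        dropped += size - accepted
    return queued, dropped


# Two bursts of 25 packets each arriving at a 40-packet input buffer:
print(enqueue_bursts([25, 25], 40))  # (40, 10)

# A single burst of 80 packets offered to a 70-packet output buffer:
print(enqueue_bursts([80], 70))      # (70, 10)
```

In both cases ten packets are lost even though the buffer was empty beforehand, which reflects the local and transient character of the micro-congestion described above.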
A number of different approaches to tuning network traffic have been considered. Some such schemes either require changes to standard network software stacks or require custom hardware; however, such schemes are difficult to implement in environments that rely on standards-based and interoperable communication technologies. Techniques that require substantial changes to legacy applications or third-party applications are also unlikely to be deployed in most enterprise environments. Other techniques attempt to implement global solutions that cannot adapt rapidly to changes in the current state of a given data flow: e.g., some schemes may attempt to statically partition bandwidth between different applications, but may still not be able to avoid the transient micro-congestion phenomena and the resulting problems described above.