Data migration between geographically distributed data centers is a critical task in modern large-scale computer networks. Due to growing amounts of transmitted data and the limited throughput of channels, traditional protocols like TCP (Transmission Control Protocol) are becoming outdated. TCP has proven to be very successful: it has contributed greatly to the popularity of today's Internet and still carries the majority of Internet traffic. However, TCP is not perfect, and it is not designed for every specific application. In the last several years, with the rapid advance of optical networks and rich Internet applications, TCP has been found to be inefficient as the network bandwidth-delay product (BDP) increases. Its AIMD (additive increase, multiplicative decrease) algorithm reduces the TCP congestion window drastically upon packet loss but fails to restore it quickly to the available bandwidth, and theoretical flow-level analysis has shown that TCP becomes more vulnerable to packet loss as the BDP increases. Thus, Internet transmission protocols must be optimized to remain viable in heavy data traffic environments. Current methods of optimization, however, often involve changing applications to accommodate different transmission protocols. This limits data mobility and imposes substantial cost overhead on system administrators.
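The effect of the BDP on AIMD recovery can be illustrated with a rough calculation. The following is a minimal sketch, not a model of any particular TCP implementation: it assumes 1500-byte packets, a window that is halved on a single loss, and additive growth of one packet per round-trip time (RTT). The function names and default values are hypothetical and chosen only for illustration.

```python
def bdp_packets(bandwidth_bps: float, rtt_s: float, packet_bytes: int = 1500) -> float:
    """Bandwidth-delay product expressed in packets of packet_bytes (assumed size)."""
    return bandwidth_bps * rtt_s / 8 / packet_bytes


def aimd_recovery_seconds(bandwidth_bps: float, rtt_s: float, packet_bytes: int = 1500) -> float:
    """Time for AIMD to regrow the window from W/2 back to W.

    Assumes the window W equals the BDP, is halved by one loss, and then
    grows by one packet per RTT, so recovery takes W/2 round trips.
    """
    window = bdp_packets(bandwidth_bps, rtt_s, packet_bytes)
    rtts_needed = window / 2
    return rtts_needed * rtt_s


if __name__ == "__main__":
    # Low-BDP path: 100 Mbit/s with a 10 ms RTT.
    print(aimd_recovery_seconds(100e6, 0.010))
    # High-BDP path: 10 Gbit/s with a 100 ms RTT.
    print(aimd_recovery_seconds(10e9, 0.100))
```

Under these assumptions, the 100 Mbit/s path with a 10 ms RTT recovers in well under a second, whereas the 10 Gbit/s path with a 100 ms RTT needs roughly 4,167 seconds (about 69 minutes) of loss-free operation to return to full rate after a single loss.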
FIG. 1 illustrates present methods of transmitting data in large network systems using certain known transmission protocols. In system 100 of FIG. 1, different networks 102-106 communicate with each other over a wide area network (WAN) 110. Each individual network 102-106 may be a local area network (LAN) or other similar type of network comprising any number of server and/or client computers that are coupled together and then coupled to the WAN 110 through a gateway or network interface device. The networks 102-106 may represent data centers that include large-scale storage devices or storage networks and one or more servers to process the data to be stored and retrieved. In present systems, the standard communication of applications and data transfer among networks, such as 102 to 106, is performed with the TCP/IP protocol 112. TCP generally does not perform well on WANs when latencies or packet drop rates are high, for example due to distance, bad network connections, congestion, and other similar factors.
Other protocols have been developed to overcome the deficiencies of standard TCP/IP, such as the BURST protocol from EMC Corporation. BURST is a replacement protocol for TCP that has proven to be reliable. It is built on top of the User Datagram Protocol (UDP), is biased toward Big Data transfers, and was developed to overcome TCP's inefficiency in high bandwidth-delay product (BDP) networks with random losses. As shown in FIG. 1, this known alternative protocol 114 to TCP/IP 112 comprises the BURST layer on top of the UDP layer over the IP layer. UDP is a connectionless protocol that emphasizes low-overhead operation and reduced latency over error checking and delivery validation. Traditional approaches 112 that use TCP are generally not efficient enough to transmit big data volumes between data centers, e.g., 102 to 106. The use of an alternative, more efficient protocol 114, such as BURST, often requires application changes and additional work and, in many cases, may be infeasible to implement, for example if an application cannot be changed.
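The connectionless nature of UDP, and the kind of bookkeeping that a protocol layered on top of it must supply for itself, can be sketched as follows. This is an illustrative example only and does not represent the BURST wire format or API, which are not described here: the 4-byte sequence-number header and the function name are hypothetical, and the datagram is sent over the local loopback interface.

```python
import socket
import struct

# Hypothetical header: a 4-byte sequence number in network byte order.
HEADER = struct.Struct("!I")


def send_one_datagram(seq: int, payload: bytes) -> tuple[int, bytes]:
    """Send one sequence-numbered datagram over loopback UDP and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # let the operating system pick a free port
        receiver.settimeout(2.0)         # UDP itself gives no delivery guarantee
        # No connection handshake: the datagram is simply addressed and sent.
        sender.sendto(HEADER.pack(seq) + payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(65535)
    # The layer above UDP must parse its own header to detect loss or reordering.
    (received_seq,) = HEADER.unpack_from(data)
    return received_seq, data[HEADER.size:]


if __name__ == "__main__":
    print(send_one_datagram(7, b"chunk"))
```

Because UDP provides neither acknowledgment nor retransmission, any reliability, ordering, and rate control must be implemented in the layer above it, which is why an application written against TCP sockets typically cannot use such a protocol without modification.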
What is needed, therefore, is a way to provide a transmission protocol without requiring changes in the applications, so as to significantly improve data mobility, which is extremely important for synchronization and backup of big data stores. Such a solution may be provided through the use of a standalone software module based on EWOC that implements base TCP APIs (application programming interfaces) for invasive substitution of standard operating system network modules.
The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions. EMC, BURST, and EWOC are trademarks of EMC Corporation.