For security and reliability, among other reasons, a company may maintain a remote backup data site to provide a data backup and/or data mirroring facility in the event of loss of data at a primary site due to a disaster. In anticipation of the possibility of a catastrophic disaster, such as a natural disaster, it may be desirable to situate the remote backup site far from the primary site. An example of a storage system that may provide data backup and mirroring capability over long distances is the Symmetrix Remote Data Facility (SRDF) product provided by EMC Corporation of Hopkinton, Mass. The SRDF system may be implemented using long haul networks that provide for reliable data links over large distances.
For a long haul network, data may be transmitted using protocols that enable the connection, communication and data transfer between computing end-points. For example, TCP/IP links allow applications to communicate reliably over IP packet networks. TCP/IP may be viewed as a two-layer protocol suite. The higher layer, TCP, manages the division of a message or file into smaller packets that are transmitted over the Internet and received by a TCP layer that reassembles the packets into the original message. The lower layer, IP, handles the address part of each packet so that the packet is transmitted to the right destination. Each packet may include a checksum, which is a form of redundancy check in which data bits of the packet are added and the resulting value is communicated to a receiver. If processing of the packet at a receiver detects an incorrect checksum, the receiver may conclude that the received packet contains errors and may request that the transmitter retransmit the packet and/or retransmit from a certain byte offset in the stream.
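By way of illustration only, the additive checksum described above may be sketched as follows. The sketch assumes the 16-bit ones' complement sum commonly used by TCP and IP; the function names internet_checksum and packet_is_intact are hypothetical and are not drawn from any particular product.

```python
def internet_checksum(data: bytes) -> int:
    """Return a 16-bit ones' complement checksum over data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]      # add the next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)   # fold any carry back in
    return ~total & 0xFFFF


def packet_is_intact(payload: bytes, received_checksum: int) -> bool:
    """Receiver-side check: recompute the checksum and compare (hypothetical helper)."""
    return internet_checksum(payload) == received_checksum
```

A receiver for which packet_is_intact returns False may treat the packet as containing errors and request retransmission, as discussed above.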
TCP/IP links permit sharing of network bandwidth among connections using congestion-avoidance algorithms. One congestion-avoidance algorithm may be a window-adjustment algorithm that allows a TCP sender to dynamically adjust a transmission window that represents the maximum amount of unacknowledged data that may be in transit in the network at any given time. Window size may be calculated as bandwidth times the round trip delay or latency. In an acknowledgement scheme in which the receiver sends an acknowledgement of received packets to the sender, it may take at least one round trip time for each packet to be acknowledged. Thus, a TCP sender can safely send up to a window's worth of packets every round trip time. In a long-haul network, the round trip time may be high, thereby yielding a reduced sending rate, which may drop even further if the window size is reduced or if dynamic adjustments to the window are made in a suboptimal fashion.
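By way of illustration only, the relationship among window size, bandwidth and round trip time described above may be sketched as follows. The function names and the example figures are assumptions chosen for illustration.

```python
def window_size_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Window needed to keep a link full: bandwidth times round trip time, in bytes."""
    return bandwidth_bps * rtt_s / 8.0


def max_send_rate_bps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on sending rate: at most one window may be sent per round trip."""
    return window_bytes * 8.0 / rtt_s
```

For example, under these assumptions a 1 Gbit/s link with a 100 ms round trip time calls for a window of about 12.5 megabytes, whereas a fixed 65,535-byte window over the same round trip time limits the sender to roughly 5.24 Mbit/s.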
Congestion events may cause a significant reduction in the size of the transmission window. For example, in response to detection of congestion, TCP may cut the window size in half according to a window adjustment algorithm. Other technologies developed in connection with TCP window adjustment algorithms include, for example, high speed TCP and variants thereof, which provide for the dynamic altering of how the window is opened on each round trip and closed on congestion events in a way that is dependent upon the absolute size of the window.
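By way of illustration only, the halving of the window on a congestion event may be sketched as follows. The sketch assumes a simplified additive-increase, multiplicative-decrease scheme in which the window grows by a fixed number of packets on each round trip without congestion; the function name and parameters are hypothetical, and actual TCP implementations and high speed TCP variants differ in detail.

```python
def simulate_window(initial_window: int, rounds: int, congestion_rounds: set,
                    increase: int = 1, minimum: int = 1) -> list:
    """Return the window size (in packets) in effect at the start of each round trip."""
    window = initial_window
    history = []
    for r in range(rounds):
        history.append(window)
        if r in congestion_rounds:
            window = max(minimum, window // 2)  # congestion detected: cut window in half
        else:
            window += increase                  # no congestion: open window slightly
    return history
```

Under these assumptions, a window that is halved by a single congestion event needs many round trips to regain its former size, which is costly when each round trip is long.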
Long-haul TCP/IP links may be susceptible to packet loss and/or delay that may significantly reduce data transmission throughput. As discussed above, in the event of error detection using a checksum, a receiver may request retransmission of a packet. However, in a long haul network, retransmission of packets may cause both latency and bandwidth issues resulting from long round trip times and decreased transmission window sizes. Accordingly, error correction techniques may be used to address these issues. Error correction may be performed using forward error correction (FEC), which is a system of error control for data transmission in which the sender adds redundant data, also known as an error correction code, to its messages. FEC allows the receiver to detect and correct errors (to at least some extent) without the need to ask the sender for additional data. FEC involves adding redundancy to transmitted information using a predetermined algorithm. Each redundant bit may be a complex function of many original information bits. Two main categories of FEC are block coding and convolutional coding. Block codes work on packets of predetermined size. Convolutional codes work on data streams of arbitrary length. Convolutional codes may be decoded with the Viterbi algorithm, among other algorithms. Block codes may include, for example, Reed-Solomon, Golay, BCH and Hamming codes, among others. A convolutional code may be turned into a block code.
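By way of illustration only, a block code of the kind mentioned above may be sketched using a Hamming(7,4) code, in which three redundant bits are added to every four data bits so that the receiver can correct any single-bit error without asking the sender for additional data. The function names are hypothetical, and a practical system would likely use a stronger code over larger blocks.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword laid out as p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 1-based position of the error, or 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1         # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]]
```

Each parity bit here is a function of three of the four data bits, which is a small-scale instance of redundant bits being computed from many original information bits.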
In FEC, a back-channel is not required and retransmission of data may often be avoided, which may be desirable in situations in which retransmission is costly and/or difficult. However, the cost of FEC may be higher bandwidth requirements to account for the transmission of the redundant data. In long haul TCP/IP links having a fixed transmission window size and relatively long round trip times, increased bandwidth requirements may significantly affect sending rates and data throughput. In addition, FEC algorithms that require both the sender and the receiver to be running the same algorithm lack flexibility.
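By way of illustration only, the bandwidth cost of FEC under a fixed transmission window may be sketched as follows. The sketch assumes a block code that carries k data symbols in every n transmitted symbols, so that only the fraction k/n of each window is original data; the function names and the example code parameters are assumptions.

```python
def code_rate(k: int, n: int) -> float:
    """Fraction of transmitted symbols that carry original data."""
    return k / n


def goodput_bps(window_bytes: float, rtt_s: float, k: int, n: int) -> float:
    """Useful data rate when a fixed window per round trip also carries FEC redundancy."""
    raw_rate_bps = window_bytes * 8.0 / rtt_s  # at most one window per round trip
    return raw_rate_bps * code_rate(k, n)
```

For example, under these assumptions a 65,535-byte window, a 100 ms round trip time and a code with k = 223 and n = 255 yield a useful data rate of about 4.58 Mbit/s, compared with about 5.24 Mbit/s without the redundant data.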
Accordingly, it would be desirable to provide a system for error correction that provides for data reliability while improving data transmission throughput and that may be used, for example, in connection with long-haul network communication, such as long-haul TCP/IP links. It would also be desirable if such a system provided flexibility to turn off and/or adjust the algorithm without requiring that the same algorithm always be running on all of the senders and receivers.