As companies grow more accustomed to storing important information on their data networks, the value of these networks and of the data they hold continues to grow. In fact, many companies now identify the data stored on their computer network as their most valuable corporate asset.
Today most backup systems operate by having the network administrator identify a time of day during which little or no network activity occurs. During this time the network administrator turns the network over to a backup system, and the data files stored on the computer network are backed up, file by file, to a long-term storage medium such as tape. Typically the network administrator will back up once a week, or even once a day, to ensure that the backup files are current.
Ideally, enterprises would like to perform frequent replications during the business day in order to reduce the data loss risk window. If a replication becomes large, however, its traffic can consume a large amount of bandwidth and interfere with other business traffic. As a result, replications are often kept to a minimum during the business day, or even postponed until after business hours, which widens the data loss risk window. To solve this problem, wide area network (WAN) accelerators can be deployed on the WAN link between the data centers and the remote replication site.
All WAN accelerators work on the same principle of intercepting traffic and optimizing it before it traverses the WAN. The optimization is accomplished using two techniques: TCP optimization and compression. TCP optimization keeps TCP's protocol inefficiencies, such as waiting a full round trip for acknowledgments before more data can be sent, from slowing down the transfer on a high-latency network. Compression reduces the amount of data that needs to be transmitted across the WAN.
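As a rough illustration of why latency matters for TCP, the sketch below computes the throughput ceiling of a single connection from its window size and round-trip time, since at most one window of data can be in flight per round trip. The function names and the sample figures (a 64 KiB window, a 100 ms round trip, a 1 Gbit/s link) are assumptions chosen for illustration and do not describe any particular product.

```python
def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one connection's throughput in bits per second.

    At most one window of unacknowledged data can be in flight per round trip.
    """
    return window_bytes * 8 / rtt_seconds


def window_needed_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return link_bps * rtt_seconds / 8


if __name__ == "__main__":
    # Assumed sample values: classic 64 KiB window, 100 ms WAN round trip.
    ceiling = max_tcp_throughput_bps(65535, 0.100)
    print(f"Throughput ceiling: {ceiling / 1e6:.2f} Mbit/s")

    # Assumed sample link: 1 Gbit/s with the same 100 ms round trip.
    needed = window_needed_bytes(1e9, 0.100)
    print(f"Window needed to fill the link: {needed / 1e6:.1f} MB")
```

With these sample numbers a single connection tops out near 5.2 Mbit/s no matter how fast the link is, and filling a 1 Gbit/s link at 100 ms would need roughly 12.5 MB in flight. This is the kind of gap that TCP optimization aims to close.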
For compression, WAN accelerators fall into two categories: memory-based systems and disk-based systems. Memory-based WAN accelerators typically pass the data through a compression algorithm and compress it on the fly. Because their compression database is limited to what fits in memory, they achieve a lower compression ratio. More advanced systems are disk-based: they can store gigabytes of data patterns on disk, and as data passes through they check whether each pattern has been seen before. If it has, they send a short token in its place, and the device on the other side of the WAN recreates the original data from its own copy of the pattern. Disk-based systems achieve a much higher compression ratio, since a single token can represent gigabytes of data, which significantly reduces the amount of data being replicated and, in turn, the WAN bandwidth required. This form of deduplication, performed by disk-based compression across the WAN, is more efficient than the typical block-level or file-level deduplication deployed in backup and replication products.
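The token substitution described above can be sketched as follows. This is a minimal illustration under stated assumptions, not how any specific accelerator is implemented: the class `DedupEndpoint`, the fixed 4096-byte chunk size, the use of a SHA-256 digest as the token, and the in-memory dictionary standing in for the on-disk pattern store are all hypothetical choices made for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real systems may chunk differently


class DedupEndpoint:
    """One side of the WAN link, holding a store of previously seen patterns."""

    def __init__(self):
        # Maps token (SHA-256 digest) -> chunk bytes. A dict stands in for disk.
        self.store = {}

    def encode(self, data: bytes):
        """Sender side: replace chunks seen before with short tokens."""
        messages = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            token = hashlib.sha256(chunk).digest()
            if token in self.store:
                messages.append(("ref", token))   # send the 32-byte token only
            else:
                self.store[token] = chunk
                messages.append(("raw", chunk))   # first sighting: send the data
        return messages

    def decode(self, messages) -> bytes:
        """Receiver side: rebuild the data, learning new chunks along the way."""
        out = []
        for kind, payload in messages:
            if kind == "raw":
                self.store[hashlib.sha256(payload).digest()] = payload
                out.append(payload)
            else:
                out.append(self.store[payload])   # look the token up locally
        return b"".join(out)


def wire_bytes(messages) -> int:
    """Payload bytes that would cross the WAN (framing overhead ignored)."""
    return sum(len(payload) for _, payload in messages)


if __name__ == "__main__":
    sender, receiver = DedupEndpoint(), DedupEndpoint()
    data = bytes(range(256)) * 64                 # 16 KiB of repetitive sample data
    for attempt in (1, 2):
        messages = sender.encode(data)
        assert receiver.decode(messages) == data
        print(f"replication {attempt}: {wire_bytes(messages)} of {len(data)} bytes sent")
```

In this sketch the second replication of the same data sends only tokens, because both endpoints already hold every chunk. A token here stands for a single chunk; representing much larger spans with one token, as disk-based systems are described as doing, would need a hierarchical or variable-length scheme that this example does not attempt.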