Distributing large amounts of data across a geographically dispersed Wide Area Network (WAN) has become commonplace in today's economy. Organizations may have systems and networks that span the entire globe. Moreover, organizations and governments are increasingly capturing and analyzing more data within their systems.
One system may require data from another system that is located in a distant geographical location. Further, one or more systems may be redundant and thus configured to stay in sync with one another. Systems may also be backed up to other systems.
One problem associated with distributing large amounts of data across a WAN or, for that matter, any network (e.g., Local Area Network (LAN), Metropolitan Area Network (MAN), etc.) is that network nodes, connections, routers, hubs, and the like have limited amounts of bandwidth. Also, the network resources may be concurrently processing a variety of other network transactions.
To address this problem, the networking industry has taken a variety of approaches or combinations of approaches. One technique is referred to as Quality of Service (QoS), where software and hardware cooperate to prioritize network bandwidth and transactions and to re-distribute or pre-allocate resources based on priorities and known resource limitations. Another approach is to reserve a certain amount of dedicated resources to handle certain transactions. These dedicated resources may always be available for certain transactions or consistently available at certain times or on certain calendar days.
Although in certain circumstances these approaches are valuable, they still have not provided a solution for adequately distributing large amounts of data across a network. This is so because data backups and replication still largely rely on direct connections to perform backups or replication. That is, direct routes are established within a network for distributing data during a backup or replication from a source node to a destination node. As a result, any intermediate node along a path of a direct connection en route to a destination node has little to no control over where the data is to be sent next within the route, since the route is largely static. The intermediate nodes are neither intelligent nor flexible and therefore cannot adjust the distribution of the data in ways that may be more efficient.
Also, because conventional data backup and replication primarily implement direct connections with static routes, there is little opportunity to aggregate the data at intermediate locations within the route from the source node to the destination node. Aggregation has a variety of benefits, such as multicasting data to a plurality of destination nodes and recovering, without involvement of the source node, should a destination node or connection fail during a backup or replication operation.
Furthermore, it becomes challenging to implement network-wide QoS techniques during conventional backups or replications because each node within the network may be independently accessing the destination node with a variety of other network transactions. This is difficult because the intermediate nodes have virtually no control over how and when the data is distributed along to the destination node. Thus, with existing data backup and replication techniques, even if an intermediate node is capable of performing QoS operations, that intermediate node cannot effectively deploy QoS features for the backup or replication.
Therefore, there is a need for improved data distribution techniques, which may be particularly useful for data backup and data replication operations over a network.