The explosion of digital content and its critical role in business processes have increased the need to protect sensitive data via data replication. Replication generally involves copying data from one location to another, where the term “location” may mean a storage entity, a geographic location, a form of media, and so on. For example, replication may mean copying data from one storage entity to another within the same geographic location, or copying data from one geographic location to another. A transfer agent (TA) is an entity, typically a server, that performs the replication. In addition to making backup copies of data, another instance in which it may be desirable to replicate data from one storage location to another is referred to as cloning, in which point-in-time copies of data in one location are made and stored in another location. Cloning is useful in applications where it is desirable to preserve snapshots of data at different times.
Regardless of the application, massive amounts of data being replicated may overwhelm the computing capacity of a single transfer agent. Typically, many transfer agents are required to complete the replication task. An even distribution of the replication load may minimize the time to complete the replication of a file server. It is a non-trivial problem to spread and allocate the work in a way that evenly distributes the burden among the available transfer agents.
Typical data replication environments may copy the contents of a source file server to a destination file server. A common methodology to replicate data between file servers is host-based replication, in which the transfer agent may be external to both the source and destination file servers. The transfer agent reads data from the source file server and writes it to the destination file server. File servers are containers of potentially many file systems of varying complexity and characteristics. Given that a transfer agent has finite computational and file transfer capacity, a single transfer agent can replicate only a limited amount of data or number of files in a given window of time. As a result, a non-trivial replication of a file server may require many transfer agents rather than one. Faced with the task of replicating data from one file server to another, one challenge is to make efficient use of the transfer agents that move the data from source to destination, and in particular to optimally distribute the work among the various transfer agents that are available.
Non-optimal allocation of the replication work may cause some agents to be overloaded while other agents are idle. As a result, the time to completion may be longer, or more agents may be engaged, than is necessary, either of which may translate into greater time or expense. For example, suppose there are 100 file systems to be copied between two file servers and that a simplistic allocation is used in which every transfer agent is responsible for replicating a single file system. It follows that 100 agents are needed to complete the replication task. If some file systems are large and others are small, this methodology may result in some agents quickly completing their replication subtasks while others are still hard at work processing large file systems. It also follows that the overall replication of the entire file server is not complete until the last agent finishes transferring the largest and most complex file system. Although this method may complete the replication task in less time than a smaller number of agents would, it requires many hardware resources and makes inefficient use of them. These inefficiencies are manifested in increased costs, since each transfer agent adds to the hardware and software cost of the whole solution.
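The inefficiency of the one-agent-per-file-system allocation described above can be illustrated with a small calculation. The following is only a sketch under simplifying assumptions that do not come from the text: every agent has the same throughput, transfer time is proportional to file system size, and the sizes and rate used are hypothetical. The function name one_agent_per_file_system is likewise illustrative.

```python
def one_agent_per_file_system(sizes_gb, rate_gb_per_hour):
    """Model the simplistic allocation: one transfer agent per file system.

    Assumes (hypothetically) that every agent moves data at the same rate
    and that transfer time is proportional to file system size.
    Returns (time_to_completion_hours, average_agent_utilization).
    """
    if not sizes_gb:
        raise ValueError("at least one file system is required")
    # Each agent handles exactly one file system.
    finish_times = [size / rate_gb_per_hour for size in sizes_gb]
    # The whole replication ends only when the slowest agent finishes.
    completion = max(finish_times)
    # Fraction of the total agent-hours actually spent transferring data.
    utilization = sum(finish_times) / (completion * len(finish_times))
    return completion, utilization


# Hypothetical workload: 99 small file systems and one large one.
sizes = [10] * 99 + [1000]
completion, utilization = one_agent_per_file_system(sizes, rate_gb_per_hour=100)
print(f"time to completion: {completion:.1f} h")   # 10.0 h, set by the largest file system
print(f"average utilization: {utilization:.1%}")   # about 2.0%
```

In this hypothetical workload, 99 of the 100 agents finish within the first six minutes and then sit idle for the rest of the ten hours, which is the waste the paragraph above describes.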
In addition, the transfer agents may have unequal capabilities. Typically, transfer agents have hardware configurations that may differ in numerous ways, such as number of CPUs, CPU speed, memory size, bus speed, and so on.
Thus, given a set F of file system entities (f1, f2, f3, ...) and a set T of transfer agents (t1, t2, ...), there exists a need to map the set F to the set T such that the replication time of F is minimized.
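To make the mapping problem stated above concrete, the following sketch applies a well-known greedy scheduling heuristic: file systems are considered from largest to smallest, and each is assigned to the agent that would finish it earliest. This is offered only as an illustration of the problem, not as the method of this disclosure. The cost model is an assumption: each file system is reduced to a single work estimate, each agent to a single speed, and transfer time is work divided by speed. The function name assign_file_systems and the sample values are hypothetical.

```python
def assign_file_systems(work, speed):
    """Greedy mapping of file systems (set F) to transfer agents (set T).

    work:  dict of file system name -> estimated work (hypothetical units)
    speed: dict of agent name -> work units processed per unit time
    Returns (plan, completion_time), where plan maps each agent to the
    list of file systems assigned to it.
    """
    finish = {agent: 0.0 for agent in speed}   # projected finish time per agent
    plan = {agent: [] for agent in speed}
    # Place the largest file systems first so small ones can fill the gaps.
    for fs in sorted(work, key=work.get, reverse=True):
        # Pick the agent that would complete this file system earliest,
        # accounting for both its current load and its speed.
        best = min(speed, key=lambda a: finish[a] + work[fs] / speed[a])
        plan[best].append(fs)
        finish[best] += work[fs] / speed[best]
    # Replication of F is complete only when the last agent finishes.
    return plan, max(finish.values())


# Hypothetical example: four file systems, two agents of unequal capability.
work = {"f1": 8, "f2": 4, "f3": 4, "f4": 2}
speed = {"t1": 2, "t2": 1}
plan, completion = assign_file_systems(work, speed)
print(plan)        # {'t1': ['f1', 'f3'], 't2': ['f2', 'f4']}
print(completion)  # 6.0
```

In this small example both agents finish at the same time, so no agent is idle while another is still working. A greedy heuristic of this kind does not guarantee a minimal replication time in general; it is shown only to illustrate what an even distribution across agents of unequal capability looks like.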