Data migration can add more traffic to the source network than it normally experiences, which may strain the source servers (e.g., by requiring them to retrieve large amounts of archived data, such as old email messages). This strain can inhibit the flow of data across the source network, or cause one or more of the source servers to crash. Depending on the network lag or downtime caused by the migration, the effect on end users may range from a slight to a complete loss of productivity until network conditions return to normal. For a business, the resulting reduction or loss of customer engagement and revenue can be devastating. And those tasked with managing the migration may spend many hours diagnosing and solving problems. Therefore, the prevention and/or swift mitigation of these data migration-induced problems is of utmost importance.
Contributing factors to the above-described problems may be transient or non-transient. Transient issues are temporary and may be solved by modifying certain environmental aspects of the source network, e.g., by decreasing the number of concurrent migrations. Examples of transient issues include the increased loads that the source server experiences when data is migrated from it and when end users make server requests during normal daily operation.
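By way of illustration only, the adjustment of concurrent migrations mentioned above might be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any existing system: the function name, the latency-based trigger, the tolerance factor, and the concurrency bounds are all hypothetical and do not come from the text.

```python
def next_concurrency(current, observed_latency_ms, baseline_latency_ms,
                     min_concurrency=1, max_concurrency=32, tolerance=1.5):
    """Return the number of concurrent migrations to run next.

    Assumption (hypothetical): source-server response latency is used as
    the sign of a transient overload. If the observed latency exceeds the
    baseline by more than `tolerance`, concurrency is halved; otherwise it
    is raised by one, within the configured bounds.
    """
    if observed_latency_ms > tolerance * baseline_latency_ms:
        # Source appears strained: back off quickly.
        return max(min_concurrency, current // 2)
    # Source appears healthy: probe for more capacity slowly.
    return min(max_concurrency, current + 1)


if __name__ == "__main__":
    # Latency is double the baseline, so concurrency is halved.
    print(next_concurrency(8, observed_latency_ms=400, baseline_latency_ms=200))
    # Latency is within tolerance, so concurrency grows by one.
    print(next_concurrency(8, observed_latency_ms=250, baseline_latency_ms=200))
```

Such a rule presupposes a known baseline latency for the source server, which may not be available in practice.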
Non-transient limitations are rooted in the design and architecture of the source network itself and are thus more difficult to avoid through simple changes in the source network's environment. Examples of non-transient limitations include the bandwidth and load limitations of the source server.
One common solution to load-related problems stemming from transient and non-transient issues present during data migrations is to perform load balancing, which distributes workloads across multiple computing resources. Load balancing aims to maximize data throughput, minimize response times, optimize resource use, and avoid overload of any one of the resources.
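For illustration, one simple form of load balancing is a greedy "least-loaded" assignment, in which each new unit of work is given to whichever resource currently carries the smallest load. The following is a minimal sketch of that general technique; the function name, the task cost values, and the resource names are hypothetical and are not taken from any particular migration system.

```python
import heapq


def assign_least_loaded(tasks, resource_names):
    """Greedily assign each (task_id, cost) pair to the least-loaded resource.

    Returns (assignment, loads), where `assignment` maps each resource name
    to its list of task ids and `loads` maps each resource name to its
    total assigned cost.
    """
    # Min-heap of (current_load, resource_name); the smallest load pops first.
    heap = [(0, name) for name in resource_names]
    heapq.heapify(heap)
    assignment = {name: [] for name in resource_names}

    for task_id, cost in tasks:
        load, name = heapq.heappop(heap)
        assignment[name].append(task_id)
        heapq.heappush(heap, (load + cost, name))

    loads = {name: load for load, name in heap}
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical workloads (e.g., mailbox sizes in arbitrary units).
    tasks = [("a", 5), ("b", 3), ("c", 2), ("d", 4)]
    assignment, loads = assign_least_loaded(tasks, ["s1", "s2"])
    print(assignment)
    print(loads)
```

A greedy assignment of this kind does not guarantee an optimal split, but it keeps any single resource from accumulating work while others sit idle.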
Current methods of load balancing during a migration, however, are reactive, manual, and not scalable. In one such approach, when an issue arises, a server that controls the migration pings a database primarily used to track data migration orders and user account information (hereinafter referred to as an “order database”). The order database in turn sends alerts, via a web-based or local application, to one or more people tasked with managing the migration (hereinafter referred to as a “partner”). Upon notification, the partner attempts to classify the problem and solve it based on the partner's knowledge of and experience with the source system in question. These methods of classifying and solving the problem lack reliability, however, because it can be difficult to know why a particular solution worked or did not work. The lack of visibility into the source system's health and capabilities also makes it difficult to determine what its baseline or “normal” operating conditions are, and therefore what conditions should be targeted when adjusting loads to mitigate transient issues. In addition, the great variety in server and network configurations, as well as in error messages, makes it particularly difficult to classify the root cause of a problem and then apply the correct solution.
Therefore, it is desirable to provide systems and methods that address these and other problems.