In recent years, an exponential increase in the demand for computer storage has been driven by growth in digital information due to faster processors, lower cost of digital data storage, increasing availability of high data rate access, and development of new applications. This increased dependence on computer data has caused a need for more efficient data storage and data migration technology.
Online data migration is the process of transferring data between storage systems in a networked environment. Data migration is usually performed programmatically to achieve an automated migration, freeing up human resources from tedious and often time-consuming tasks. Data migration may be required when, for example, organizations or individuals change computer systems or when it is determined, by capacity planning, that additional data resources are required for an existing computer system.
Non-disruptive data motion is the process of performing an online data migration that is virtually transparent to a client accessing the data. The client is unaware of the migration process and can access the data throughout the migration process.
A computer network is typically utilized as a transport mechanism for data migration between storage systems. However, because large amounts of data are copied across the network (often well over several terabytes in an enterprise environment), the duration of the migration process can exceed tolerable levels, in some cases lasting for days or weeks.
Often, after a data migration from a source storage system to a destination storage system, an unforeseen problem emerges, necessitating a reversion (a “rollback” of the migration) to the source storage system. For example, after the destination storage system has been in operation for several hours or days following the migration, an administrator may determine that the destination storage system lacks sufficient capacity to store data at a current or future data growth rate. Another problem may relate to unexpectedly slow performance, at the destination storage system, of a migrated application or dataset. In these cases, it is desirable to revert to utilizing the source storage system, at least until the unforeseen problem is addressed. However, it is not possible to simply redirect the client from accessing the destination storage system to accessing the source storage system, because the data at the destination storage system has been modified arbitrarily since the initial data migration, based on new data received at the destination storage system.
For example, during the initial data migration of a baseline dataset from the source storage system to the destination storage system, a user file may be copied. The user file may include, for example, a document, email, spreadsheet, or another form of electronic information. Once the baseline dataset containing the user file is copied to the destination storage system, the user may access the file and modify the file's content. For example, the user may append a graph to the document, reply to the email, or add a new calculation to the spreadsheet. These new modifications are made at the destination storage system, not the source storage system.
Therefore, a complete rollback from the destination storage system to the source storage system in the conventional system involves copying both the previously migrated baseline dataset and the user modifications made to the baseline dataset.
Similarly, after a successful rollback migration to the source storage system and after the unforeseen problem of the initial migration has been resolved, a retry of the migration to the destination storage system may be desirable. However, as in the rollback migration, the retry migration in the conventional system involves copying a complete dataset of the source storage system to the destination storage system. Following the previous example, after the rollback migration, the user may perform additional modifications to one or more user files located at the source storage system. The user may, for example, further modify the graph previously added to the document, receive a response to the email reply, or alter a value used by the spreadsheet calculation. These modifications must be copied during the retry migration to maintain data integrity.
Therefore, the problem of the first data migration being undesirably time-consuming is compounded by the additional time required for the rollback migration and the subsequent retry migration. Together, these delays make the prospect of performing a large data migration troublesome at best and, where a large enterprise is concerned, potentially prohibitive.
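As a rough illustration of how these delays accumulate, the following minimal sketch estimates the total copy time when the initial migration, the rollback, and the retry each copy a complete dataset, as in the conventional system. The dataset sizes, the sustained throughput, and the function names are hypothetical values chosen only for illustration; they do not come from any particular storage system, and real transfers would also be affected by protocol overhead and competing network traffic.

```python
# Hypothetical sketch: cumulative copy time when every migration pass
# (initial, rollback, retry) transfers a complete dataset.
from typing import Sequence

MB = 10**6
GB = 10**9
TB = 10**12


def copy_hours(dataset_bytes: int, throughput_bytes_per_s: float) -> float:
    """Hours needed to copy dataset_bytes at a sustained throughput."""
    return dataset_bytes / throughput_bytes_per_s / 3600


def conventional_total_hours(
    pass_sizes_bytes: Sequence[int], throughput_bytes_per_s: float
) -> float:
    """Total hours when each pass copies its full dataset over the network."""
    return sum(copy_hours(size, throughput_bytes_per_s) for size in pass_sizes_bytes)


if __name__ == "__main__":
    # Assumed figures: a 10 TB baseline that grows by 0.5 TB before the
    # rollback and by another 0.5 TB before the retry, over a 100 MB/s link.
    passes = [10 * TB, 10_500 * GB, 11 * TB]
    total = conventional_total_hours(passes, 100 * MB)
    print(f"Total copy time: {total:.1f} hours")
```

Under these assumed figures, each pass takes more than a day on its own, and the three passes together take about three and a half days, even though only a small fraction of the data changed between passes.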