The creation and storage of digitized data has proliferated in recent years. Accordingly, techniques and mechanisms that facilitate efficient and cost-effective storage of large amounts of digital data are common today. For example, a cluster network environment of nodes may be implemented as a data storage system to facilitate the creation, storage, retrieval, and/or processing of digital data. Such a data storage system may be implemented using a variety of storage architectures, such as a network-attached storage (NAS) environment, a storage area network (SAN), a direct-attached storage environment, and combinations thereof. The foregoing data storage systems may comprise one or more data storage devices configured to store digital data within data volumes.
Digital data stored by data storage systems may be frequently migrated within the data storage system and/or between data storage systems, such as by copying, cutting and pasting, replication, backing up and restoring, etc. For example, a user may move files, folders, or even the entire contents of a data volume from one data volume to another data volume. Likewise, a data replication service may replicate the contents of a data volume across nodes within the data storage system. Irrespective of the particular type of data migration performed, migrating large amounts of digital data may consume appreciable amounts of available resources, such as central processing unit (CPU) utilization, processing time, network bandwidth, etc. Moreover, migrating digital data may involve appreciable amounts of time to complete the migration between the source and destination.
In order to avoid poor user experience and other issues associated with latency in the data migration process of a particular data structure, some storage systems have begun implementing techniques which declare the data migration of a data structure complete, and facilitate access to the digital data of the data structure substantially as if the data migration were complete, prior to actual completion of the data migration. For example, data blocks of the data structure may be migrated as a background task and data blocks may be fetched from the source data store as needed on demand of the data consumer. Accordingly, it may appear to users and other consumers of the digital data that the data migration of the data structure has completed very quickly, thereby avoiding poor user experience and other issues associated with data migration latency. Moreover, by implementing such anachronistic or time-displaced data migration techniques, consumption of available resources by the data migration process may be controlled, such as through judicious operation of background data migration operations.
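By way of illustration only, the time-displaced data migration described above may be sketched as follows. The sketch assumes a simple in-memory block map; the names `SourceStore`, `DestinationStore`, `read`, and `migrate_in_background` are hypothetical and are not drawn from any particular storage system. The destination is usable as soon as it is created, a read of an absent data block fetches that block from the source on demand, and a background task copies a bounded number of remaining blocks per pass so that resource consumption may be controlled.

```python
from typing import Dict, Iterable, List, Optional


class SourceStore:
    """Hypothetical source data store holding every block of the data structure."""

    def __init__(self, blocks: Dict[int, bytes]) -> None:
        self.blocks = dict(blocks)
        self.fetch_count = 0  # number of blocks pulled by the destination so far

    def fetch(self, block_id: int) -> bytes:
        self.fetch_count += 1
        return self.blocks[block_id]


class DestinationStore:
    """Hypothetical destination that is declared complete before its blocks arrive."""

    def __init__(self, source: SourceStore, block_ids: Iterable[int]) -> None:
        self.source = source
        # None marks a block that has not yet been migrated (absent data).
        self.blocks: Dict[int, Optional[bytes]] = {b: None for b in block_ids}

    def absent_blocks(self) -> List[int]:
        return [b for b, data in self.blocks.items() if data is None]

    def read(self, block_id: int) -> bytes:
        """Serve a read; fetch the block from the source on demand if it is absent."""
        data = self.blocks[block_id]
        if data is None:
            data = self.source.fetch(block_id)
            self.blocks[block_id] = data
        return data

    def migrate_in_background(self, max_blocks: int) -> int:
        """Copy at most max_blocks absent blocks; return how many were copied."""
        moved = 0
        for block_id in self.absent_blocks()[:max_blocks]:
            self.blocks[block_id] = self.source.fetch(block_id)
            moved += 1
        return moved


source = SourceStore({0: b"alpha", 1: b"beta", 2: b"gamma"})
dest = DestinationStore(source, source.blocks.keys())

first_read = dest.read(2)                       # fetched on demand from the source
moved = dest.migrate_in_background(max_blocks=1)  # throttled background pass
remaining = dest.absent_blocks()                # blocks still held only by the source
```

In this sketch the `max_blocks` argument stands in for whatever throttling policy a real system might apply to its background migration operations.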
Although providing advantages with respect to user experience, data availability, available resource consumption control, etc., the foregoing time-displaced data migration techniques may present issues with respect to various data management processes. For example, a point-in-time image process may be utilized to create a read-only copy of all the digital data stored by a data volume or other data structure at a particular point in time (e.g., for use in performing digital data backup operations). A point-in-time image of a data store that is the destination of a time-displaced data migration which has not yet completed may result in data management issues. In particular, a paradox may be presented in which an otherwise apparently migrated data structure has portions of the digital data which are not present in the destination data store (referred to herein as “absent data”), but instead remain upon the source data store. Accordingly, a point-in-time image of the digital data stored by the source data store may also be needed in order to provide backup data which facilitates data access matching that of the digital data being backed up (e.g., allowing for absent data blocks to be fetched from the source data store as needed on demand of the data consumer). That is, in order to provide the same accessibility with respect to absent data, both the source and destination data stores of a time-displaced data migration would be needed (i.e., the source would need to exist for as long as the destination existed). It can be appreciated that where time-displaced data migration techniques are performed across multiple sources and destinations within a data system, point-in-time images and other processes providing for data at particular points in time may be incomplete without other data sources, or may even become deadlocked with other data sources in the presence of absent data.
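The dependency described above may be illustrated with a minimal sketch, assuming the same kind of in-memory block map in which `None` marks absent data. The helper names `point_in_time_image` and `read_from_image` are hypothetical. The sketch shows only that a point-in-time image of the destination taken before the migration completes cannot satisfy reads of absent blocks unless a corresponding image of the source is also retained.

```python
from types import MappingProxyType
from typing import Dict, Mapping, Optional

BlockMap = Mapping[int, Optional[bytes]]


def point_in_time_image(blocks: BlockMap) -> BlockMap:
    """Return a read-only copy of a block map as it exists at this moment."""
    return MappingProxyType(dict(blocks))


def read_from_image(dest_image: BlockMap,
                    source_image: Optional[BlockMap],
                    block_id: int) -> bytes:
    """Read a block from a destination image, falling back to a source image."""
    data = dest_image[block_id]
    if data is not None:
        return data
    if source_image is None:
        # The image alone is incomplete: the block was absent when it was taken.
        raise LookupError(f"block {block_id} is absent and no source image exists")
    found = source_image[block_id]
    if found is None:
        raise LookupError(f"block {block_id} is absent from the source image too")
    return found


# Source holds everything; the destination has received only block 0 so far.
source_blocks: Dict[int, Optional[bytes]] = {0: b"alpha", 1: b"beta", 2: b"gamma"}
dest_blocks: Dict[int, Optional[bytes]] = {0: b"alpha", 1: None, 2: None}

dest_image = point_in_time_image(dest_blocks)
source_image = point_in_time_image(source_blocks)

# Migration continuing after the image was taken does not change the image.
dest_blocks[1] = b"beta"

try:
    read_from_image(dest_image, None, 2)
    standalone_read_failed = False
except LookupError:
    standalone_read_failed = True
```

Because the fallback in `read_from_image` requires `source_image`, the source image must be kept for as long as the destination image is kept, which is the retention burden noted above.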