The present invention relates to data storage systems, and more specifically, this invention relates to selectively distributing data in cloud tiering environments.
The cost per unit (e.g., Gigabyte) of storage is typically higher for higher performance (e.g., faster) storage than it is for relatively lower performance storage. Thus, tiers of storage having different performance characteristics may be grouped together to form a multi-tiered data storage system.
The capacity of a higher performance data storage tier is typically smaller than the capacity of a lower performance data storage tier in view of their relative prices. In order to maintain an efficient use of the higher performing, yet typically smaller, data storage tier, algorithms may be implemented to relocate data based on a temperature associated therewith. For example, “hotter” data may be migrated towards the higher performance storage tier (e.g., promoted), while “colder” data is migrated towards the lower performance storage tier (e.g., demoted). In the present context, the heat (e.g., “hotter” and “colder”) of data refers to the rate (e.g., frequency) at which the data is updated. Storage blocks that are considered “hot” or “hotter” tend to have a frequent update rate, while storage blocks that are considered “cold” or “colder” have an update rate which is at least slower than that of hot blocks. Additional factors, such as read frequency, may be incorporated into the determination of the relative heat of a given portion of data. This promotion and demotion process physically relocates the data from one tier to another, and may even be performed without the knowledge of an application that is running.
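By way of illustration only, the heat-based relocation described above may be sketched as follows. This is a minimal sketch under stated assumptions rather than any particular implementation: the names `Block`, `heat`, and `retier`, the two tier labels, the sampling window, and the optional `read_weight` factor are all hypothetical and are introduced here solely for the example.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical storage block with access counters for one sampling window."""
    block_id: str
    updates: int          # updates (writes) observed during the window
    reads: int = 0        # reads observed during the window (optional factor)
    tier: str = "slow"    # assumed tier labels: "fast" or "slow"


def heat(block, window_seconds, read_weight=0.0):
    """Heat as the update rate, optionally increased by a weighted read rate."""
    return (block.updates + read_weight * block.reads) / window_seconds


def retier(blocks, fast_capacity, window_seconds, read_weight=0.0):
    """Keep the hottest blocks in the fast tier, up to its capacity in blocks.

    Returns a list of (block_id, old_tier, new_tier) for each relocated block.
    """
    ranked = sorted(
        blocks,
        key=lambda b: heat(b, window_seconds, read_weight),
        reverse=True,
    )
    moves = []
    for rank, block in enumerate(ranked):
        target = "fast" if rank < fast_capacity else "slow"
        if block.tier != target:
            moves.append((block.block_id, block.tier, target))
            block.tier = target  # promotion or demotion
    return moves


blocks = [Block("a", updates=100), Block("b", updates=1, tier="fast"), Block("c", updates=50)]
moves = retier(blocks, fast_capacity=1, window_seconds=60)
```

In this sketch, block "a" is promoted because it has the highest update rate, while block "b" is demoted to make room in the smaller fast tier. Block "c" is not relocated.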
The different tiers of a multi-tiered data storage system may also be physically separated from each other. For instance, cloud tiering is a feature which provides a cloud storage environment as an extended and remote storage tier option to on-premises filesystems. Conventional implementations of cloud tiering and similar kinds of cloud-as-tier features exploit parallelization to a certain degree, such as parallelizing backup jobs, parallelizing restore jobs, parallelizing encryption jobs, or parallelizing MD5 reconciliation jobs.
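As one illustrative example of the job-level parallelization mentioned above, an MD5 reconciliation job may be sketched as follows. The sketch assumes, purely for illustration, that local object contents are held in memory and that the cloud tier reports an MD5 digest per object name. The names `md5_hex` and `reconcile` are hypothetical and do not correspond to any particular product.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the given bytes as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


def reconcile(local_objects, remote_digests, max_workers=4):
    """Compute local digests in parallel and compare them with the remote ones.

    local_objects:  dict mapping object name to local content (bytes)
    remote_digests: dict mapping object name to the digest reported remotely
    Returns the sorted names whose digest is missing remotely or does not match.
    """
    names = list(local_objects)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        digests = list(pool.map(md5_hex, (local_objects[n] for n in names)))
    return sorted(n for n, d in zip(names, digests) if remote_digests.get(n) != d)


local = {"a": b"hello", "b": b"world", "c": b"new"}
remote = {"a": md5_hex(b"hello"), "b": "0" * 32}
mismatched = reconcile(local, remote)
```

Here, object "b" is flagged because its digests differ, and object "c" is flagged because no remote digest exists for it.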
However, practical results achieved by these conventional implementations have proven cloud tiering to be ineffective, as it causes increased delay in both backup and restore operations. Accordingly, cloud tiering and similar storage system architectures are unreliable.