Many enterprises use hosted or on-premises computing platforms (e.g., towers, racks, backplanes, blades, virtual machines, etc.) to handle enterprise computing needs. The enterprise may also use such hosted or on-premises computing platforms for storage (e.g., direct access storage, networked storage, etc.) to handle persistent storage of various forms of enterprise data. Enterprises tend to provide such computing infrastructure to serve department-level, e-commerce, and management information system needs, and/or mission-critical operations. In some cases, the operations of an enterprise may see widely varying demand for computing and storage. In such cases, the enterprise may want to use cloud-based services so as to pay only for actual use during such periods of higher demand. For example, a large retailer may have plenty of on-premises computing resources for day-to-day operations; however, during certain periods, such as a global online rollout of a new product, the traffic at the retailer's website might be many hundreds or even thousands of times greater than is seen under day-to-day conditions. In such a case, the retailer might want to use cloud-based services to handle the transient loads.
Unfortunately, although the computing load can be distributed between the on-premises equipment and cloud-based equipment, the data, and hence the persistent storage, often needs to be available to, and updated by, both the on-premises computing operations and the cloud-based computing operations. For example, although the aforementioned retailer can offload website middleware to the cloud so as to handle a large amount of traffic, the retailer would need to make catalog databases and order databases available to the cloud-hosted middleware. This example highlights the situation where data that is normally handled as on-premises data needs to be accessed (e.g., in READ/WRITE scenarios) by computing operations within the cloud. One legacy approach is to move the data or databases (e.g., the order database) in their entirety to the cloud during the period of high demand, and then bring them back (e.g., with updates) to be restored in on-premises storage after the period of high demand has passed. However, such a legacy approach introduces risk and has operational limitations.
Another legacy approach is to keep the database at the on-premises site and access the on-premises data from the cloud over a network; however, such a legacy approach has severe performance limitations. Yet another approach is to move a copy of the database or databases to the cloud, execute over the cloud-based copy, and keep track of changes made to the copy (e.g., by capturing block-by-block changes or by taking periodic snapshots). Unfortunately, legacy techniques that are used to keep a change-by-change, up-to-date on-premises replica of the cloud-based copy bring about voluminous traffic between the cloud and the on-premises equipment, making such an approach impracticable. Further, legacy techniques fail to provide sufficient resilience; even a momentary power outage or network outage can be disastrous.
The hereunder-disclosed solutions implement change block tracking in mixed environments involving cloud-based computing environments working in conjunction with on-premises equipment and/or hosted equipment environments. The disclosed solutions address long-felt needs. For example, legacy techniques that scan the entire logical address space between snapshots and compute the block-level changes have proven to be very inefficient; such techniques are resource-intensive and very slow. What is needed is a technique or techniques to improve over legacy approaches.
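The contrast between the two approaches discussed above can be illustrated with a brief sketch. The following is not the disclosed implementation; it is a minimal Python illustration, with hypothetical names (`full_scan_diff`, `ChangeBlockTracker`), of the difference between a legacy full-scan snapshot comparison, whose cost is proportional to the entire logical address space, and write-time change block tracking, which records dirty blocks as writes occur so that only changed blocks need to be transmitted to the on-premises replica.

```python
# Illustrative sketch only; names and block size are hypothetical.
BLOCK_SIZE = 4096


def full_scan_diff(snapshot_a: bytes, snapshot_b: bytes) -> list:
    """Legacy approach: compare every block of two snapshots.

    Cost is proportional to the entire logical address space,
    regardless of how few blocks actually changed.
    """
    changed = []
    for i in range(0, len(snapshot_a), BLOCK_SIZE):
        if snapshot_a[i:i + BLOCK_SIZE] != snapshot_b[i:i + BLOCK_SIZE]:
            changed.append(i // BLOCK_SIZE)
    return changed


class ChangeBlockTracker:
    """Change block tracking: record dirty block numbers at write time,
    so replication traffic covers only the blocks that changed."""

    def __init__(self, num_blocks: int):
        self.data = bytearray(num_blocks * BLOCK_SIZE)
        self.dirty = set()

    def write(self, offset: int, payload: bytes) -> None:
        # Apply the write, then mark every block the write touched.
        self.data[offset:offset + len(payload)] = payload
        first = offset // BLOCK_SIZE
        last = (offset + len(payload) - 1) // BLOCK_SIZE
        self.dirty.update(range(first, last + 1))

    def changed_blocks(self) -> list:
        """Return the blocks to ship to the replica, then reset tracking."""
        blocks = sorted(self.dirty)
        self.dirty.clear()
        return blocks
```

In this sketch, a write that straddles a block boundary marks every block it touches, and draining `changed_blocks()` yields exactly the set of block numbers to transmit since the last drain, with no scan over unchanged regions.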