Modern-day clients tend to run hybrid workloads in which some virtual machines run in one or more local data centers and others run on one or more cloud servers. A virtual machine is an emulation of a computer system that, like a physical computer, runs an operating system and applications. A virtual machine has virtual devices that provide the same functionality as the physical hardware of a physical computer, with additional benefits in terms of portability, manageability, and security. A local data center or a cloud server may host one or more virtual machines. Virtual machines are usually backed by the resources of their host.
A local data center is a facility consisting of networked computers and storage that organizations or other entities own and use to organize, process, and store large amounts of data. The local data center is physically accessible to its owner.
Some cloud servers may be owned and operated by third-party providers and leased to the end user. Organizations and other entities can sign up as clients on one or more cloud servers. A cloud server enables ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. The cloud server can provide services to organizations or other entities as:
(i) Software as a Service (SaaS)—The clients run the cloud server's applications on the cloud server's computing resources. The applications are accessible from various devices through either a thin client interface, such as a web browser (e.g., web-based email), or a program interface. The client does not manage or control the underlying cloud computing infrastructure including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings.
(ii) Platform as a Service (PaaS)—The clients can deploy their applications onto the cloud server's computing resources. The application can be acquired or created by the clients using programming languages, libraries, services, and tools supported by the cloud server. The client does not manage or control the underlying cloud server's computing resources including network, servers, operating systems, or storage, but has control over the deployed applications and possibly configuration settings for the application-hosting environment.
(iii) Infrastructure as a Service (IaaS)—The clients can provision processing, storage, networks, and other fundamental computing resources. The clients can deploy and run arbitrary software, which can include operating systems and applications on the provisioned resources. The client does not manage or control the underlying cloud computing infrastructure but has control over operating systems, storage, and deployed applications; and possibly limited control of select networking components (e.g., host firewalls).
Virtual machines running on local data centers and cloud servers have extensive data security requirements and typically need to be continuously available to deliver services to clients. For disaster recovery and avoidance, the local data centers and cloud servers that provide virtual machine capability need to avoid data corruption and service lapses to clients. Therefore, the local data centers and cloud servers periodically take snapshots of the running virtual machines. A snapshot is a copy of the virtual machine's content at a given point in time. Snapshots can be used to restore a virtual machine to a particular point in time when a failure or system error occurs. The computing resources can take multiple snapshots of a virtual machine to create multiple possible point-in-time restore points. When a virtual machine reverts to a snapshot, the virtual machine's current data volumes and memory state are deleted, and the snapshot becomes the new parent snapshot for that virtual machine.
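By way of illustration only, the snapshot and revert behavior described above may be sketched as follows. The class names, field names, and the in-memory dictionary standing in for a virtual machine's data volumes and memory state are all hypothetical and are chosen for the example; they are not drawn from any particular hypervisor or data management system.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Snapshot:
    snapshot_id: int
    parent_id: Optional[int]  # snapshot that was the parent at capture time
    state: dict               # copy of the VM's content at capture time


@dataclass
class VirtualMachine:
    # Hypothetical stand-in for the VM's data volumes and memory state.
    state: dict = field(default_factory=dict)
    snapshots: Dict[int, Snapshot] = field(default_factory=dict)
    parent_snapshot_id: Optional[int] = None

    def take_snapshot(self) -> int:
        """Capture a point-in-time copy and make it the new parent."""
        snapshot_id = len(self.snapshots) + 1
        self.snapshots[snapshot_id] = Snapshot(
            snapshot_id, self.parent_snapshot_id, deepcopy(self.state)
        )
        self.parent_snapshot_id = snapshot_id
        return snapshot_id

    def revert_to(self, snapshot_id: int) -> None:
        """Discard the current state and restore the chosen snapshot."""
        snapshot = self.snapshots[snapshot_id]
        self.state = deepcopy(snapshot.state)  # current state is dropped
        self.parent_snapshot_id = snapshot_id  # snapshot becomes the parent


vm = VirtualMachine(state={"disk": "v1"})
first = vm.take_snapshot()
vm.state["disk"] = "v2"
second = vm.take_snapshot()
vm.state["disk"] = "corrupted"
vm.revert_to(first)  # restore the earlier point-in-time restore point
```

In this sketch each snapshot is a full copy, which keeps the revert logic simple; a practical system would more likely store only the changes relative to the parent snapshot.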
Snapshots are intended to store the virtual machine data for as long as deemed necessary to make it feasible to go back in time and restore what was lost. As the main objective of snapshots is long-term data storage, various data reduction techniques are typically used by a snapshot manager in the computing resources to reduce the snapshot size and fit the data into the smallest amount of disk space possible. These techniques include skipping unnecessary swap data, data compression, and data deduplication, which removes duplicate blocks of data and replaces them with references to the existing ones. Because snapshots are compressed and deduplicated to save storage space, they no longer look like virtual machines and are often stored in a special format. As snapshots are just a set of files, the snapshot repository is a folder, which can be located anywhere: on a dedicated server, on a storage area network (SAN), or on dedicated storage in the infrastructure of the computing resources.
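As a minimal sketch of the data reduction described above, block-level deduplication combined with compression may be illustrated as follows. The fixed 4096-byte block size, the use of SHA-256 digests as block references, the use of zlib for compression, and the function names are assumptions made for the example only; an actual snapshot manager may use different block sizes, hashing schemes, and storage formats.

```python
import hashlib
import zlib
from typing import Dict, List

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def store_snapshot(data: bytes, block_store: Dict[str, bytes]) -> List[str]:
    """Split data into blocks, store each unique block once (compressed),
    and return the ordered list of block references for this snapshot."""
    references = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in block_store:
            # First occurrence of this block: compress and store it.
            block_store[digest] = zlib.compress(block)
        # Duplicate blocks are represented only by a reference.
        references.append(digest)
    return references


def restore_snapshot(references: List[str], block_store: Dict[str, bytes]) -> bytes:
    """Rebuild the original data by following the block references."""
    return b"".join(zlib.decompress(block_store[ref]) for ref in references)


block_store: Dict[str, bytes] = {}
image = b"A" * (3 * BLOCK_SIZE) + b"B" * BLOCK_SIZE  # three identical blocks, one distinct
snapshot_refs = store_snapshot(image, block_store)
restored = restore_snapshot(snapshot_refs, block_store)
```

The stored form is a list of references plus a shared block store rather than a bootable disk image, which illustrates why a reduced snapshot no longer resembles a virtual machine and must be restored before use.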
An opportunity arises for a computer or data management system to keep a snapshot history, stored in sequence and spanning multiple virtual machines on multiple local data centers and cloud servers, to configure scheduling of snapshot capture across multiple systems, and to provide improved disaster recovery in the event of data loss due to natural disasters, man-made disasters such as acts of terrorism, and/or virus attacks. To ensure the integrity of the computer or data management system, various functionalities of the system need to be tested in various testing scenarios and environments. Furthermore, development environments need to be provided to the developers of the computer or data management systems. Various testing or development environments need varying amounts of physical resources on the local data centers and virtual resources on the cloud servers. A data management system developer or tester may create their own development or testing environment. The tester or developer will then have the responsibility of locating hardware and installing and configuring one or more software packages to create the intended environment. Testers and developers will spend valuable time setting up such environments instead of testing or developing their code. The testers and developers may also have to bear significant costs for ongoing management and operation of the created environments, including cleanup of the resources and preparation for the next use.
A tester or developer may also be given a standard environment in which to test or develop their data management system code. The standard environment may package as many different items as possible into one environment. In some cases, the standard testing or development environments may be adequate. In other cases, the standard testing or development environments may be inadequate to test the functionalities of the data management systems. In still other cases, the standard testing or development environments may have some resources that are unused or that are not in use the majority of the time.
It is desirable to provide a system that can more effectively and automatically let developers and testers of data management systems acquire a custom environment with resources and characteristics they need.