Traditional enterprise IT infrastructure often includes: (i) servers/storage/networking inside one or more data centers, (ii) servers/storage/networking inside one or more remote or branch offices (ROBO), and (iii) desktop, laptop, tablet, and smartphone equipment for employees. Modern IT may also include enterprise applications running on an external public IaaS, PaaS, or SaaS cloud provider.
FIGS. 1, 2, and 3 depict typical environments inside data centers today. In all three cases, compute software interacts directly with the software that actually persists the data.
For a compute part, an enterprise's or cloud provider's workloads may be virtualized, containerized, or running on bare metal physical machines (servers). By “virtualized”, one typically means workloads running inside a virtual machine, which in turn runs on hypervisor software, which in turn runs on a physical machine. By “containerized”, one typically means workloads running inside a container, which runs alongside other containers as software on a physical machine. By “bare metal”, one typically means workloads running directly on the physical machine without any virtualization or containerization middleware.
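The three compute arrangements described above can be pictured as layer stacks between the workload and the physical machine. The following is a minimal illustrative sketch only; the names `ComputeEnvironment`, `STACKS`, and `middleware_layers` are hypothetical and are not part of any platform referred to in this description.

```python
from enum import Enum


class ComputeEnvironment(Enum):
    VIRTUALIZED = "virtualized"
    CONTAINERIZED = "containerized"
    BARE_METAL = "bare metal"


# Layers from the workload (top) down to the physical machine (bottom),
# following the definitions given in the text. Illustrative only.
STACKS = {
    ComputeEnvironment.VIRTUALIZED: [
        "workload", "virtual machine", "hypervisor", "physical machine",
    ],
    ComputeEnvironment.CONTAINERIZED: [
        "workload", "container", "physical machine",
    ],
    ComputeEnvironment.BARE_METAL: [
        "workload", "physical machine",
    ],
}


def middleware_layers(env):
    """Return the layers that sit between the workload and the hardware."""
    return STACKS[env][1:-1]
```

In this sketch, the bare metal case has no middleware layers, which matches the definition above.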
For a storage part, which is where the compute workloads store their data, enterprises may use external proprietary hardware appliances connected to the physical machines running the workloads, other physical machines that act as storage servers, or storage on the local disks of the compute physical machines. In the first case, the external proprietary hardware, such as Storage Area Network (SAN) or Network Attached Storage (NAS) appliances, is connected to the physical machines running the workloads over a physical interconnect such as Ethernet or Fibre Channel. This is the most common setup in the enterprise IT industry today. In the second case, commodity servers of the same or similar type as the servers running the compute workloads run dedicated software that allows them to appear as a SAN or NAS appliance, and are again physically connected to the servers running the workloads over similar interconnects. The widely used term for this setup is “converged infrastructure.” In the third case, dedicated software runs alongside the compute workloads on the compute servers and turns their locally attached disks into a SAN or NAS appliance that can be used to store data. The widely used term for this setup is “hyper-converged infrastructure.”
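The three storage cases above differ along two questions: whether the storage runs on proprietary appliance hardware, and whether the storage software shares servers with the compute workloads. A minimal sketch of that distinction follows, assuming hypothetical names (`StorageSetup`, `StorageDeployment`, `classify`) introduced here purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class StorageSetup(Enum):
    EXTERNAL_APPLIANCE = "external proprietary SAN/NAS appliance"
    CONVERGED = "converged infrastructure"
    HYPER_CONVERGED = "hyper-converged infrastructure"


@dataclass(frozen=True)
class StorageDeployment:
    # True when storage runs on a vendor's proprietary appliance.
    proprietary_hardware: bool
    # True when storage software runs on the same servers as the workloads.
    shares_compute_servers: bool


def classify(deployment):
    """Map a deployment onto the three cases described in the text."""
    if deployment.proprietary_hardware:
        return StorageSetup.EXTERNAL_APPLIANCE
    if deployment.shares_compute_servers:
        return StorageSetup.HYPER_CONVERGED
    # Commodity servers dedicated to storage, separate from compute.
    return StorageSetup.CONVERGED
```

For example, `classify(StorageDeployment(proprietary_hardware=False, shares_compute_servers=True))` corresponds to the third case.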
For a networking part, machines and storage inside a data center are interconnected via physical cables through routing and switching hardware. Networking inside the data center can use Ethernet connections between compute servers and, if external storage appliances exist, Ethernet can also be used to interconnect those appliances.
Regardless of whether the compute part is virtualized, containerized, or bare metal, the physical machine running the compute part needs specific software that orchestrates the workloads. In a virtualized environment, this software can be a virtualization platform or hypervisor. In a containerized environment, this software can be a container platform. In a bare metal environment, this software can be the physical machine's Operating System (OS) itself.
Regardless of whether the storage part runs on an external proprietary appliance, on commodity x86 physical machines like the ones running the compute part, or on the compute servers themselves, there is specific storage software that orchestrates the persisting of the data. This software can also provide a number of data services alongside persisting the data, such as replication, caching, compression/de-duplication, backups, thin clones/snapshots, tiering, and QoS.
Enterprise IT needs to be able to run its workloads in different locations, in different installations, or in different environments altogether. To move a workload, one needs to move the underlying data, so that the workload finds the data it needs at the destination. The problem, known as the data mobility problem, is that data is currently confined inside a system that comprises a platform and a persisting storage technology. There are several unsatisfactory or limited approaches to the problem, including (i) manually exporting and importing the data, (ii) requiring that the storage products be of the same type, or (iii) providing a virtualization platform at each of the different storage locations.
The problem is much harder to solve when one operates on thin clones and snapshots of data. Working with thin clones and snapshots requires a storage technology that can make instant copies of the data for the platform to consume. If the platform is consuming thin clones and snapshots, then moving them to a different location is not possible unless one uses a storage product of the same type at the destination. This is because the data services logic that creates these clones and snapshots exists inside the software that runs on the specific storage solution and cannot be understood by an external system. Furthermore, the problem cannot be solved efficiently if one needs to replicate to more than one location.
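To illustrate why a thin clone is hard to move, the following is a simplified sketch of a copy-on-write volume. It is an assumption made for illustration and does not represent any particular storage product; the names `ThinVolume` and `flatten` are hypothetical. The point is that a clone stores only the blocks written to it, so its contents are meaningful only together with the parent chain kept inside the storage software.

```python
class ThinVolume:
    """Hypothetical copy-on-write volume that stores only its own writes."""

    def __init__(self, parent=None):
        self.parent = parent
        self.blocks = {}  # block index -> bytes written to this layer only

    def write(self, index, data):
        self.blocks[index] = data

    def read(self, index):
        # Walk up the parent chain until some layer holds the block.
        volume = self
        while volume is not None:
            if index in volume.blocks:
                return volume.blocks[index]
            volume = volume.parent
        return b""  # unwritten block (assumption: reads as empty)

    def clone(self):
        # Instant copy: freeze the current state into a shared base layer,
        # then let both this volume and the new clone layer on top of it.
        base = ThinVolume(parent=self.parent)
        base.blocks = self.blocks
        self.parent = base
        self.blocks = {}
        return ThinVolume(parent=base)


def flatten(volume):
    """Materialize every block visible through a volume as a full copy."""
    chain = []
    while volume is not None:
        chain.append(volume)
        volume = volume.parent
    full = {}
    for layer in reversed(chain):  # oldest layer first, newest wins
        full.update(layer.blocks)
    return full


golden = ThinVolume()
golden.write(0, b"os-image")
clone = golden.clone()
clone.write(1, b"app-data")
```

In this sketch, `clone.blocks` holds only block 1, while `clone.read(0)` still resolves through the shared base layer. A system that does not understand the layer chain could move the data only by calling something like `flatten`, which gives up the space savings of the thin clone.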