There are many kinds of architectures that can be used to implement storage systems. Traditionally, storage for a computing system is implemented using directly attached or integrated storage, such as the hard disk drives that are commonly integrated into personal computers. Distributed storage architectures are also widely used to allow a computer to access and store data on network-based storage devices.
Modern computing systems may also implement storage in the context of virtualization environments. A virtualization environment contains one or more “virtual machines” or “VMs”, which are software-based implementations of a machine in which the hardware resources of a real computer (e.g., CPU, memory, storage, etc.) are virtualized or transformed into the underlying support for a fully functional virtual machine. Such a virtual machine can run its own operating system and applications on the underlying physical resources just like a real computer. By encapsulating an entire machine, including CPU, memory, operating system, storage devices, and network devices, a virtual machine is compatible with most standard operating systems, applications, and device drivers. Virtualization allows one to run multiple virtual machines on a single physical machine, with each virtual machine sharing the resources of that one physical computer across multiple environments. Different virtual machines can run different operating systems and multiple applications on the same physical computer.
One reason for the broad adoption of virtualization in modern business and computing environments is the resource utilization advantage provided by virtual machines. Without virtualization, if a physical machine is limited to a single dedicated operating system, then during periods of inactivity by the dedicated operating system the physical machine is not utilized to perform useful work. This is wasteful and inefficient if there are users on other physical machines that are currently waiting for computing resources. To address this problem, virtualization allows multiple VMs to share the underlying physical resources so that during periods of inactivity by one VM, other VMs can take advantage of the resource availability to process workloads. This can produce great efficiencies in the utilization of physical devices, and can result in reduced redundancies and better resource cost management.
Storage devices are one type of physical resource that can be managed and utilized in a virtualization environment. A set of one or more virtual disks may be implemented to allow virtualized storage of data on behalf of one or more clients, such as client computers, systems, applications, or virtual machines, where the virtual disk (or “vdisk”) is actually a logical representation of storage space compiled from one or more underlying physical storage devices. When the client issues a write request or read request in a virtualization system, that request is actually issued to a virtualized storage device.
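The idea of a vdisk as a logical view over physical storage can be illustrated with a minimal sketch. It assumes a simple concatenation of extents; the names Extent, VirtualDisk, resolve, and the device labels are hypothetical and chosen only for illustration, and real systems typically use more elaborate mapping metadata.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Extent:
    """A contiguous region on one physical device (hypothetical model)."""
    device: str   # illustrative device label
    start: int    # starting byte offset on that device
    length: int   # extent size in bytes


class VirtualDisk:
    """Presents several physical extents as one contiguous logical space."""

    def __init__(self, extents: List[Extent]) -> None:
        self.extents = extents
        self.size = sum(e.length for e in extents)

    def resolve(self, logical_offset: int) -> Tuple[str, int]:
        """Translate a logical vdisk offset into (device, physical offset)."""
        if not 0 <= logical_offset < self.size:
            raise ValueError("offset outside the vdisk")
        remaining = logical_offset
        for extent in self.extents:
            if remaining < extent.length:
                return extent.device, extent.start + remaining
            remaining -= extent.length
        raise AssertionError("unreachable: offset was range-checked")


# A vdisk compiled from two assumed physical devices.
vdisk = VirtualDisk([
    Extent(device="ssd0", start=4096, length=1024),
    Extent(device="hdd0", start=0, length=2048),
])
```

In this sketch the client addresses only logical offsets on the vdisk; the translation to a physical device and offset happens inside resolve, so the client never needs to know which device holds the data.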
When certain commands are issued by a client to a storage tier, it is often expected that some sort of “commitment” or “commit” must occur before an acknowledgement is provided back to the client to indicate successful processing of that command. For example, consider a “write” command that is issued by a client to write a data item to a storage tier in a storage system. After the write command has been issued, the client is placed into a holding state to wait for a message (or some other indication) that the write command has been successfully processed, which is based upon the data item being persistently placed somewhere within the storage tier. This acknowledgement message is often necessary to ensure that a commit of the data has actually occurred, so that the client can safely proceed with further processing. The persistent writing of the data item is often desired to ensure that a subsequent failure will not result in the loss of data.
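The commit-before-acknowledgement pattern described above can be sketched as follows. This is only an illustration, assuming a local file stands in for the storage tier; the function names committed_write and client_write and the "ACK" string are hypothetical, and the sketch uses Python's standard os.fsync call to request persistence.

```python
import os
import tempfile


def committed_write(path: str, data: bytes) -> str:
    """Append data and acknowledge only after it is persisted."""
    with open(path, "ab") as f:
        f.write(data)          # data may still sit in a user-space buffer
        f.flush()              # hand the buffer to the operating system
        os.fsync(f.fileno())   # ask the OS to persist it to the device
    return "ACK"               # returned only after the commit step


def client_write(path: str, data: bytes) -> bool:
    """The client blocks here until the acknowledgement comes back."""
    return committed_write(path, data) == "ACK"


# Usage: a temporary file stands in for the storage tier (an assumption).
tier_dir = tempfile.mkdtemp()
tier_path = os.path.join(tier_dir, "tier.bin")
first_ok = client_write(tier_path, b"abc")
second_ok = client_write(tier_path, b"def")
with open(tier_path, "rb") as f:
    stored = f.read()
```

The client cannot continue past client_write until the persistence step has finished, so any delay in committing the data is experienced directly by the client as waiting time.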
The issue is that requiring a commit to occur before allowing the client to proceed with its processing could cause a perceptible delay at the client, creating a significant amount of unwanted latency. This problem can be even more severe in a distributed or virtualized system, where the storage tier contains many kinds of underlying storage devices with differing levels of storage performance. The problem is further exacerbated if data replication must be performed to provide storage redundancy.
Therefore, there is a need for an improved approach to implement storage which addresses these and other problems with the existing storage systems.