Unless otherwise indicated herein, the approaches described in this section are not prior art to the claims of the present application and are not admitted to be prior art by inclusion in this section.
In the field of computer data storage, filtering refers to the process of manipulating, by one or more objects known as filters, data that is read from or written to a storage device/volume. Examples of manipulation functions that can be performed by such filters include compression, encryption, caching, replication, and so on.
In a virtualized environment, filtering may be performed on I/O reads and writes that are directed to virtual disks (VMDKs). There are a number of ways in which this VMDK I/O filtering can be implemented. According to a first approach, a hypervisor of a host system can include one or more filters within the vSCSI layer of its kernel I/O stack. When a virtual machine (VM) running on top of the hypervisor issues, e.g., an I/O write request directed to a VMDK, the write request can pass through the kernel I/O stack and can be intercepted by the filters in the vSCSI layer. These vSCSI filters can then manipulate the data associated with the write request before it is committed to the storage device storing the VMDK.
According to a second approach, VMDK I/O filtering can be performed in user space, rather than at the hypervisor vSCSI layer. In this approach, each VMDK can be associated with metadata identifying a list of filters that should be applied, in order, to I/O requests directed to the VMDK. When a VM first opens a VMDK, a filter framework component can be instantiated in a user-level process of the VM (e.g., the VM's VMX process). The filter framework component can include the various filters associated with that VMDK. When the VM subsequently issues, e.g., an I/O write request to the VMDK, the hypervisor's kernel I/O stack can pass the write request to the VM's filter framework component in user space, which can cause the filters to be applied to the data associated with the write request. Once the filters have been applied, the write request can be returned to the kernel I/O stack for forwarding to the storage device.
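By way of illustration only, the second approach can be sketched as follows. The sketch is a simplified model under stated assumptions and not an actual hypervisor implementation: the names `FilterFramework`, `VMDK_FILTER_METADATA`, `FILTER_REGISTRY`, and `handle_write` are hypothetical, each filter is modeled as a function from bytes to bytes, and the XOR transform is a toy stand-in for an encryption filter.

```python
import zlib
from typing import Callable, Dict, List

# Assumption: a filter is modeled as a pure function from bytes to bytes.
Filter = Callable[[bytes], bytes]


def compress(data: bytes) -> bytes:
    """Compression filter (zlib is used here purely for illustration)."""
    return zlib.compress(data)


def xor_encrypt(data: bytes, key: int = 0x5A) -> bytes:
    """Toy stand-in for an encryption filter; NOT real cryptography."""
    return bytes(b ^ key for b in data)


# Hypothetical registry mapping filter names to filter functions.
FILTER_REGISTRY: Dict[str, Filter] = {
    "compress": compress,
    "encrypt": xor_encrypt,
}

# Hypothetical per-VMDK metadata: the ordered list of filters to apply.
VMDK_FILTER_METADATA: Dict[str, List[str]] = {
    "disk0.vmdk": ["compress", "encrypt"],
}


class FilterFramework:
    """User-space component instantiated when a VMDK is first opened."""

    def __init__(self, vmdk_name: str) -> None:
        # Load the filters named in the VMDK's metadata, preserving order.
        names = VMDK_FILTER_METADATA[vmdk_name]
        self.filters: List[Filter] = [FILTER_REGISTRY[n] for n in names]

    def handle_write(self, data: bytes) -> bytes:
        """Apply each filter in order and return the filtered data,
        which would then be handed back to the kernel I/O stack."""
        for apply_filter in self.filters:
            data = apply_filter(data)
        return data
```

In this model, the kernel I/O stack would call `handle_write` for each write request it receives from the VM and would forward the returned data to the storage device.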
An advantage of performing VMDK I/O filtering in a manner that does not involve the vSCSI layer per the second approach above is that it facilitates filtering of both online (i.e., VM initiated) and offline (i.e., non-VM initiated) I/O. For example, assume an application (e.g., a backup application, a cache de-staging application, etc.) wishes to perform I/O operations directly against a VMDK while its corresponding VM is powered off. Such an application is referred to herein as an offline application. In this case, the filter framework component can be instantiated in the user memory space of the offline application at the point the application opens the VMDK. This instantiation can be performed by, e.g., a “DiskOpen” API that is exposed by the hypervisor and called by the offline application. The offline application can then issue I/O requests to the VMDK using a VMDK-level “IOSubmit” API that takes as input a VMDK handle returned by the DiskOpen API. The issuance of the I/O requests to the VMDK using the VMDK-level IOSubmit API (with the VMDK handle as an input parameter) can cause the filter framework component to pass the data associated with the requests through each of the VMDK's filters, thereby ensuring that the request data is filtered before (and/or after) reaching the storage tier. This is not possible with the first filtering approach, since the first approach requires all VMDK I/O to originate from a running VM (and thus pass through the hypervisor vSCSI layer) in order to be filtered.
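The offline I/O flow described above can be sketched as follows. This is an illustrative model only: the text names the DiskOpen and IOSubmit APIs but does not give their signatures, so the functions `disk_open` and `io_submit`, the `VmdkHandle` type, and the placeholder filters `tag_a` and `tag_b` are all hypothetical, and an in-memory dictionary stands in for the storage tier.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: a filter is modeled as a function from bytes to bytes.
Filter = Callable[[bytes], bytes]


@dataclass
class VmdkHandle:
    """Hypothetical handle returned by disk_open(); it carries the filter
    framework state that lives in the calling application's memory."""
    name: str
    filters: List[Filter]
    # In-memory stand-in for the storage tier, keyed by offset.
    blocks: Dict[int, bytes] = field(default_factory=dict)


def disk_open(name: str, filters: List[Filter]) -> VmdkHandle:
    """Hypothetical DiskOpen: instantiates the filter framework for the
    VMDK inside the caller's process and returns a VMDK handle."""
    return VmdkHandle(name=name, filters=list(filters))


def io_submit(handle: VmdkHandle, offset: int, data: bytes) -> None:
    """Hypothetical VMDK-level IOSubmit for a write: passes the data
    through every filter of the VMDK, in order, before storing it."""
    for apply_filter in handle.filters:
        data = apply_filter(data)
    handle.blocks[offset] = data


# Placeholder filters that make the order of application visible.
def tag_a(data: bytes) -> bytes:
    return b"A(" + data + b")"


def tag_b(data: bytes) -> bytes:
    return b"B(" + data + b")"


# An offline application (no running VM) opens the VMDK and writes to it.
handle = disk_open("backup.vmdk", [tag_a, tag_b])
io_submit(handle, 0, b"x")
```

Because the filters are attached to the handle rather than to a running VM, every request submitted with that handle is filtered regardless of which process issues it.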
One problem that may occur when an offline application interacts with a VMDK using the VMDK-level IOSubmit API described above is that certain destructive I/O filters may be inadvertently applied to write data more than once. For instance, consider a scenario in which a VMDK is associated with a compression filter, an encryption filter, and a caching filter, in that order. The compression and encryption filters are destructive in the sense that they modify the input data they are applied to. Assume that a VM interacts with the VMDK for a period of time, which causes write requests originating from the VM to be filtered by this group of filters (i.e., the write data is compressed, then encrypted, then cached in a writeback cache). Further assume that the VM is powered off before the cached data can be de-staged (i.e., propagated from the writeback cache to backend storage), which causes a cache de-staging application to initiate an offline de-staging process.
In this scenario, the write data in the writeback cache is already compressed and encrypted. However, when the cache de-staging application begins copying the write data from the writeback cache to the VMDK using the VMDK-level IOSubmit API, the filter framework component will pass the data through every filter of the VMDK, including the compression and encryption filters. This means that the write data will be compressed and encrypted twice before being committed to backend storage, resulting in data corruption.
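The double-filtering problem can be reproduced with a small model. The sketch below makes several assumptions that are not part of the description above: zlib stands in for the compression filter, a single-byte XOR stands in for the encryption filter (it is not real cryptography), the caching filter is omitted, and the read path is assumed to apply the inverse of each filter exactly once, in reverse order. The function names `write_path` and `read_path` are hypothetical.

```python
import zlib


def compress(data: bytes) -> bytes:
    return zlib.compress(data)


def decompress(data: bytes) -> bytes:
    return zlib.decompress(data)


def xor_cipher(data: bytes, key: int = 0x5A) -> bytes:
    """Toy cipher; applying it twice with the same key restores the input."""
    return bytes(b ^ key for b in data)


def write_path(data: bytes) -> bytes:
    """Destructive filters in VMDK order: compress, then encrypt."""
    return xor_cipher(compress(data))


def read_path(stored: bytes) -> bytes:
    """Assumed read path: decrypt, then decompress (each inverse once)."""
    return decompress(xor_cipher(stored))


original = b"guest write payload " * 8

# Online path: the VM's write is filtered once and held in the writeback cache.
cached = write_path(original)

# Offline de-staging through the VMDK-level API re-applies every filter
# to data that is already compressed and encrypted.
destaged = write_path(cached)

# A later read undoes only one layer of filtering.
recovered = read_path(destaged)
```

Under these assumptions, `recovered` equals the already-filtered `cached` bytes rather than `original`, so a reader of the VMDK would see compressed and encrypted data where the plaintext is expected.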