In a cloud storage platform, a customer can generally request that storage be allocated for use by a computing instance (e.g., a virtual machine (VM)) so that the VM can make storage requests (e.g., read/write requests) to or from the allocated storage. In this scenario, the ability to allocate the requested storage rests on an assumption that the requested storage is available for allocation. While the cloud paradigm advertises “infinite” resources that are available on demand, in practice, storage capacity is large but limited, and it cannot scale dynamically if demand surges faster than the supply of storage resources. For example, the volume of data consuming cloud storage resources is growing at a dramatic rate, from Terabytes (TBs) to Petabytes (PBs) daily, and in some cases even hourly. Meanwhile, the supply chain for provisioning physical servers and storage devices (e.g., disks) in data centers typically comes with a lead time of 3 to 6 months. Limitations in the expansion of physical capacity are further compounded by limitations in the extent to which storage resources can be practicably managed (e.g., storage account setup and provisioning for a large number of tenants) and monitored for optimization purposes (e.g., monitoring storage usage, demand, and/or availability metrics to make decisions regarding reallocation of storage).
Currently, a technique called “thin provisioning” is commonly used to allocate storage for customers of a cloud storage platform. Thin provisioning conserves storage resources by assigning only the storage that is currently needed to support an allocation, instead of allocating the full amount of space requested for a VM. The tenant receives an acknowledgement of the full amount of storage requested, but physical storage is allocated to the tenant's VM only as demand for it arises, up to the requested amount. The advantages of thin provisioning are reduced waste (i.e., avoiding allocation of storage that is not used) and reduced cost via statistical multiplexing (i.e., oversubscribing customers to a logical address space that is larger than the physical storage actually available, which allows more of the available physical storage to be used and paid for by customers).
However, thin provisioning comes with drawbacks. For example, if thin provisioning is implemented incorrectly, an “out-of-space” condition can occur where customers run out of physical storage space. This “out-of-space” condition can significantly degrade performance (e.g., storage requests, like read/write operations, get stalled or can take much longer to complete) and can also negatively impact the availability of resources (e.g., VMs, containers, storage nodes, etc.).
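By way of illustration only, thin provisioning and the “out-of-space” condition described above can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation of any particular cloud storage platform: the names `ThinPool`, `OutOfSpaceError`, `allocate`, `write`, and `oversubscription_ratio` are hypothetical, sizes are tracked in whole GB, and a real platform would track blocks or extents rather than a single counter per tenant.

```python
class OutOfSpaceError(Exception):
    """Raised when the pool has no physical storage left to back a write."""


class ThinPool:
    """Hypothetical thin-provisioned pool (illustrative assumption only).

    Logical allocations are acknowledged in full, but physical storage is
    consumed only when a tenant actually writes data.
    """

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb  # physical storage actually installed
        self.promised_gb = {}           # tenant -> logical size acknowledged
        self.used_gb = {}               # tenant -> physical storage consumed

    def allocate(self, tenant, requested_gb):
        # Acknowledge the full request without reserving physical space.
        self.promised_gb[tenant] = requested_gb
        self.used_gb[tenant] = 0
        return requested_gb

    def write(self, tenant, gb):
        # A tenant may never exceed the logical size it was promised.
        if self.used_gb[tenant] + gb > self.promised_gb[tenant]:
            raise ValueError("write exceeds the tenant's logical allocation")
        # Physical space is claimed only now, at write time.
        if sum(self.used_gb.values()) + gb > self.physical_gb:
            raise OutOfSpaceError("physical storage exhausted")
        self.used_gb[tenant] += gb

    def oversubscription_ratio(self):
        # Values above 1.0 mean more storage is promised than physically exists.
        return sum(self.promised_gb.values()) / self.physical_gb


if __name__ == "__main__":
    pool = ThinPool(physical_gb=100)
    pool.allocate("tenant_a", 80)
    pool.allocate("tenant_b", 80)   # 160 GB promised against 100 GB physical
    pool.write("tenant_a", 60)      # succeeds: 60 GB of 100 GB consumed
    try:
        pool.write("tenant_b", 50)  # would need 110 GB of physical storage
    except OutOfSpaceError as err:
        print("out-of-space:", err)
```

In this sketch, both tenants stay within the logical sizes that were acknowledged to them, yet the second write fails because their combined demand exceeds the physical capacity of the pool.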
Furthermore, current cloud storage schemes charge customers based solely on how much storage they are using (e.g., a tenth of a cent per Gigabyte (GB) of storage used), and the only service level agreements (SLAs) that are provided to tenants are based solely on “availability.” For example, current SLAs can specify a 99.99% availability of the tenant's data, which means that the tenant's requests for its data can fail up to 0.01% of the time over a given time period. In this scenario, the tenant's data will not be lost, but rather access to the data can be temporarily disrupted. However, when it comes to performance of a tenant's applications at runtime, only best-effort performance is currently provided by the cloud storage platform, which can lead to the above-noted performance degradations.
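As a worked illustration of the availability figure above, the following sketch converts an availability percentage into the maximum time during which access to data could be disrupted. It rests on assumptions not stated in the text: availability is treated as a fraction of wall-clock time, the measurement period is taken to be 30 days, and the function name `max_unavailable_seconds` is hypothetical.

```python
def max_unavailable_seconds(availability_pct, period_days):
    """Return the longest total disruption permitted by an availability SLA.

    Assumes (for illustration only) that availability is measured as a
    percentage of wall-clock time over the given period.
    """
    period_seconds = period_days * 24 * 60 * 60
    unavailable_pct = 100 - availability_pct
    return period_seconds * unavailable_pct / 100


if __name__ == "__main__":
    # 99.99% over an assumed 30-day period: about 259 seconds (~4.3 minutes).
    seconds = max_unavailable_seconds(99.99, period_days=30)
    print(f"{seconds:.1f} seconds of permitted disruption")
```

Under these assumptions, such an SLA bounds how long access can be disrupted, but it does not bound how slowly storage requests are served while the data remains available.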