A network storage controller is a processing system that stores and retrieves data on behalf of one or more hosts on a network, managing the data in a set of mass storage devices such as magnetic or optical disks, tapes, or flash memory. Some storage controllers are designed to service file-level requests from hosts, as is commonly the case with file servers used in a network attached storage (NAS) environment. Other storage controllers are designed to service block-level requests from hosts, as with storage controllers used in a storage area network (SAN) environment. Still other storage controllers are capable of servicing both file-level requests and block-level requests, as is the case with certain storage controllers made by NetApp, Inc. of Sunnyvale, Calif.
Some network storage controllers can provide so-called “thin provisioning” capabilities for the storage volumes they manage. Thin provisioning is a type of virtualization technology that gives a volume (or a LUN) the appearance of having more storage space than is actually available. It is a technique for optimizing utilization of available storage space: it relies on on-demand allocation of blocks of data, in contrast to the traditional method of allocating all the blocks in advance in response to an allocation request. This methodology helps avoid the poor space utilization rates that commonly occur with other storage allocation methods, in which large pools of storage capacity are allocated to individual hosts but remain unused (i.e., not written to). With thin provisioning, storage capacity utilization efficiency can be automatically driven up towards 100% with little administrative overhead. The storage controller can first allocate relatively little storage capacity for the volume, and then later increase the allocated capacity in accordance with actual space usage by the hosts.
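The on-demand allocation described above can be illustrated with a minimal sketch. It is not taken from any particular storage controller; the class name `ThinVolume`, its methods, and the 4096-byte block size are hypothetical choices made only for illustration. The volume advertises a fixed size to the host but backs a block with real storage only when that block is first written.

```python
class ThinVolume:
    """Hypothetical thinly-provisioned volume (illustrative only).

    The host sees `advertised_blocks` blocks, but physical space is
    consumed only for blocks that have actually been written.
    """

    def __init__(self, advertised_blocks: int, block_size: int = 4096):
        self.advertised_blocks = advertised_blocks  # size presented to the host
        self.block_size = block_size                # assumed block size in bytes
        self._backing = {}                          # logical block number -> data

    def _check(self, lbn: int) -> None:
        if not 0 <= lbn < self.advertised_blocks:
            raise IndexError("block lies outside the advertised volume size")

    def write(self, lbn: int, data: bytes) -> None:
        """Store data, allocating backing space on the first write to a block."""
        self._check(lbn)
        if len(data) > self.block_size:
            raise ValueError("data does not fit in one block")
        # Pad with zeros so every stored block has the full block size.
        self._backing[lbn] = data.ljust(self.block_size, b"\0")

    def read(self, lbn: int) -> bytes:
        """Return block contents; never-written blocks read back as zeros."""
        self._check(lbn)
        return self._backing.get(lbn, bytes(self.block_size))

    @property
    def allocated_blocks(self) -> int:
        """Number of blocks that actually consume physical storage."""
        return len(self._backing)
```

In this sketch a freshly created volume of 1,000 advertised blocks consumes no backing space, and overwriting a block that is already allocated does not increase `allocated_blocks`.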
A thinly-provisioned volume (i.e., a storage volume that is thinly provisioned by a storage controller) can grow and shrink its storage capacity as needed. A volume is a logical data set which is an abstraction of physical storage, combining one or more physical mass storage devices (e.g., disks) or parts thereof into a single logical storage object. A volume can be, for example, a logical unit identified by a logical unit number (LUN) in a SAN environment. Thin provisioning allows storage space to be easily allocated to hosts, on a just-enough and just-in-time basis.
However, when a storage controller assigns a volume to a host, the host can create its own file system on the volume and do its own file system bookkeeping. As a result, the host can have a very different idea of how much space it is currently using within the volume than the storage controller does. FIG. 1 illustrates an example scenario of inconsistent views of storage space usage between a storage host and a storage controller. A file system of the storage host manages the storage volume, and the storage controller services the data access requests for the storage volume. At step 1 of FIG. 1, the host writes two new files to the volume, each consuming 25% of the total storage space of the volume. Both the host and the storage controller show that 50% of the storage space has been used. At step 2, the host writes another new file of the same size. Again the host and the storage controller both show the same usage, now 75% of the storage space. At step 3, the host deletes the first and second files. For hosts with most file systems (e.g., New Technology File System, also referred to as NTFS), deleting a file causes the host to deallocate the blocks of the deleted file and record the references to these blocks in a free block data structure (e.g., a volume free space map). However, in a SAN environment, there is no mechanism for the host to notify the storage controller of the deletion of the files. The data stored inside the volume is opaque to the storage controller. Hence, the views of the host and the storage controller diverge at step 3. The host shows that the volume is only 25% full, while the storage controller shows that 75% of the volume is in use. The host is under no obligation to reuse the blocks it deallocated, so if the host writes a fourth file to the volume, that file may occupy previously unused space, as shown in step 4. The storage controller then shows that the volume is full, while the host shows just 50% utilization of the volume.
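The four steps of FIG. 1 can be reproduced with a small simulation. This is only a sketch under stated assumptions: the class names `ControllerView` and `HostView`, the 100-block volume, and the host allocation policy (prefer never-used blocks before reusing freed ones) are hypothetical and chosen to match the scenario, not drawn from any real file system or controller. The controller counts every block that has ever been written, because it never learns of deletions; the host counts only blocks that belong to live files.

```python
class ControllerView:
    """Storage controller's bookkeeping: it only ever sees block writes."""

    def __init__(self, total_blocks: int):
        self.total_blocks = total_blocks
        self.written = set()  # blocks written at least once

    def write(self, block: int) -> None:
        self.written.add(block)

    def used_fraction(self) -> float:
        return len(self.written) / self.total_blocks


class HostView:
    """Host file system's bookkeeping, including its own free-block list."""

    def __init__(self, controller: ControllerView):
        self.controller = controller
        self.total_blocks = controller.total_blocks
        self.files = {}      # file name -> list of blocks it occupies
        self.freed = []      # blocks deallocated by deletes (free space map)
        self.next_fresh = 0  # lowest block that has never been handed out

    def write_file(self, name: str, n_blocks: int) -> None:
        available = (self.total_blocks - self.next_fresh) + len(self.freed)
        if n_blocks > available:
            raise RuntimeError("volume is full from the host's view")
        blocks = []
        for _ in range(n_blocks):
            # Assumed policy: use never-touched space first, then freed blocks.
            if self.next_fresh < self.total_blocks:
                block = self.next_fresh
                self.next_fresh += 1
            else:
                block = self.freed.pop()
            self.controller.write(block)  # the controller sees the write
            blocks.append(block)
        self.files[name] = blocks

    def delete_file(self, name: str) -> None:
        # Only host metadata changes; the controller is never told.
        self.freed.extend(self.files.pop(name))

    def used_fraction(self) -> float:
        return sum(len(b) for b in self.files.values()) / self.total_blocks


def run_fig1_scenario():
    """Return (host, controller) used fractions after each of the 4 steps."""
    controller = ControllerView(total_blocks=100)
    host = HostView(controller)
    views = []

    def snapshot():
        views.append((host.used_fraction(), controller.used_fraction()))

    host.write_file("file1", 25)
    host.write_file("file2", 25)
    snapshot()                      # step 1
    host.write_file("file3", 25)
    snapshot()                      # step 2
    host.delete_file("file1")
    host.delete_file("file2")
    snapshot()                      # step 3
    host.write_file("file4", 25)
    snapshot()                      # step 4
    return views
```

Calling `run_fig1_scenario()` yields the host and controller usage after each step. Under the assumed allocation policy the two views agree after steps 1 and 2 and diverge from step 3 onward, ending with the host at 50% and the controller at 100%.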
These discrepancies in views between the host and the storage controller do not pose a serious problem for volumes of fixed size. But for a thinly-provisioned volume, there is a potentially large difference, from the perspective of the storage controller, between a volume that is considered to be 25% full and one that is considered to be 75% full. The host considers the extra 50% to be unallocated, but the storage controller does not know that. Consequently, the storage controller is not able to shrink the thinly-provisioned volume and assign the unallocated space for other purposes. Over time, the storage controller tends to allocate more and more storage space for the host, while the space freed by files that the host deletes is never released from the storage controller's point of view. The benefits of thin provisioning, therefore, tend to disappear over time. Eventually all storage space of the volume is allocated, and the storage controller can no longer provide thin provisioning capability to the volumes of the hosts.