A storage system is a computer that provides storage service relating to the organization of information on writable persistent storage devices, such as memories, tapes or disks. The storage system is commonly deployed within a storage area network (SAN) or a network attached storage (NAS) environment. When used within a NAS environment, the storage system may be embodied as a file server including an operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on, e.g., the disks. Each “on-disk” file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the file. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories is stored. As used herein, a file is defined to be any logical storage container that contains a fixed or variable amount of data storage space, and that may be allocated storage out of a larger pool of available data storage space. As such, the term file, as used herein and unless the context otherwise dictates, can also mean a container, object or any other storage entity that does not correspond directly to a set of fixed data storage devices. A file system is, generally, a computer system for managing such files, including the allocation of fixed storage space to store files on a temporary or permanent basis.
The storage system may be further configured to operate according to a client/server model of information delivery to thereby allow many client systems (clients) to access shared resources, such as files, stored on the storage system. Sharing of files is a hallmark of a NAS system, which is enabled because of its semantic level of access to files and file systems. Storage of information on a NAS system is typically deployed over a computer network comprising a geographically distributed collection of interconnected communication links, such as Ethernet, that allow clients to remotely access the information (files) on the storage system. The clients typically communicate with the storage system by exchanging discrete frames or packets of data according to pre-defined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
In the client/server model, the client may comprise an application executing on a computer that “connects” to the storage system over a computer network, such as a point-to-point link, shared local area network, wide area network or virtual private network implemented over a public network, such as the Internet. NAS systems generally utilize file-based access protocols; therefore, each client may request the services of the storage system by issuing file system protocol messages (in the form of packets) to the file system over the network identifying one or more files to be accessed without regard to specific locations, e.g., blocks, in which the data are stored on disk. By supporting a plurality of file system protocols, such as the conventional Common Internet File System (CIFS), the Network File System (NFS) and the Direct Access File System (DAFS) protocols, the utility of the storage system may be enhanced for networking clients.
A SAN is a high-speed network that enables establishment of direct connections between a storage system and its storage devices. The SAN may thus be viewed as an extension to a storage bus and, as such, an operating system of the storage system enables access to stored information using block-based access protocols over the “extended bus”. In this context, the extended bus is typically embodied as Fibre Channel (FC) or Ethernet media adapted to operate with block access protocols, such as Small Computer Systems Interface (SCSI) protocol encapsulation over FC or TCP/IP/Ethernet.
A SAN arrangement or deployment allows decoupling of storage from the storage system, such as an application server, and some level of information storage sharing at the application server level. There are, however, environments wherein a SAN is dedicated to a single server. In some SAN deployments, the information is organized in the form of databases, while in others a file-based organization is employed. Where the information is organized as files, the client requesting the information maintains file mappings and manages file semantics, while its requests (and server responses) address the information in terms of block addressing on disk using, e.g., a logical unit number (LUN).
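By way of illustration only, the division of labor described above, in which the client maintains the file mappings while its requests address blocks of a LUN, may be sketched as follows. The block size, the per-file list of logical block addresses and the function name are assumptions introduced for this sketch and do not describe any particular protocol or implementation.

```python
BLOCK_SIZE = 512  # assumed block size in bytes (hypothetical)


def file_range_to_block_requests(block_map, lun, offset, length):
    """Translate a byte range of a file into (lun, block address) requests.

    block_map is a hypothetical client-side mapping: a list holding, for each
    file block in order, the logical block address backing it on the LUN.
    """
    if length <= 0:
        return []
    first = offset // BLOCK_SIZE                 # first file block touched
    last = (offset + length - 1) // BLOCK_SIZE   # last file block touched
    # The storage system sees only block addresses; file names and offsets
    # never leave the client.
    return [(lun, block_map[i]) for i in range(first, last + 1)]


# A four-block file whose blocks are not contiguous on the LUN.
example_map = [100, 101, 250, 251]
requests = file_range_to_block_requests(example_map, 3, 500, 600)
```

Here, a read of 600 bytes starting at byte offset 500 spans the first three blocks of the file, so the client issues three block requests against LUN 3.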
In some SAN environments, storage systems may export virtual disks (vdisks) to clients utilizing block-based protocols, such as, for example, Fibre Channel and iSCSI. One example of a vdisk is a special file type in a volume that derives from a plain file, but that has associated export controls and operation restrictions that support emulation of a disk. Vdisks are described further in U.S. Pat. No. 7,107,385, entitled STORAGE VIRTUALIZATION BY LAYERING VIRTUAL DISK OBJECTS ON A FILE SYSTEM, by Vijayan Rajan, et al., the contents of which are hereby incorporated by reference. When accessed via these block-based protocols, the exported files/vdisks appear as physical disk devices to the clients of the storage system.
Certain file systems, including the exemplary write anywhere file layout (WAFL) file system available from Network Appliance, Inc., of Sunnyvale, Calif., include the capability to generate a thinly provisioned data container, wherein the data container is not completely written to disk at the time of its creation. As used herein, the term data container generally refers to a unit of storage for holding data, such as a file system, disk file, volume or a logical unit number (LUN), which is addressable by, e.g., its own unique identification. The storage space required to hold the data contents of the thinly provisioned data container on disk has not yet been used. A thinly provisioned data container is often utilized in the exemplary WAFL file system environment when, for example, a vdisk is initially generated. A user or administrator may generate a vdisk of specified size, for example, 10 gigabytes (GB). This size represents the maximum addressable space of the vdisk. To increase system performance, the file system generally does not write the entire vdisk to the disks at the time of creation. Instead, the file system generates a thinly provisioned data container (i.e., file) representing the vdisk. The thinly provisioned data container may then be populated (filled in) via subsequent write operations as the vdisk is filled in with data. While this description is written in terms of a thinly provisioned data container over an underlying file system, it should be noted that other thin provisioning implementations may be utilized. As such, the use of an underlying file system to support a thinly provisioned data container should be taken as exemplary only.
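By way of illustration only, the behavior of a thinly provisioned data container may be sketched as follows. The class name, the 4 KB block size and the dictionary used to hold written blocks are assumptions made for this sketch; they are not taken from the WAFL file system or from any other implementation.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes (hypothetical)


class ThinContainer:
    """Hypothetical thinly provisioned data container."""

    def __init__(self, max_size_bytes):
        # The maximum addressable space is fixed at creation time...
        self.max_blocks = max_size_bytes // BLOCK_SIZE
        # ...but no data blocks are written until a write operation arrives.
        self.blocks = {}

    def _check(self, block_no):
        if not 0 <= block_no < self.max_blocks:
            raise IndexError("block outside the addressable space")

    def write_block(self, block_no, data):
        self._check(block_no)
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self.blocks[block_no] = data  # the block is populated on first write

    def read_block(self, block_no):
        self._check(block_no)
        # Blocks that were never written read back as zeros in this sketch.
        return self.blocks.get(block_no, bytes(BLOCK_SIZE))

    def allocated_bytes(self):
        return len(self.blocks) * BLOCK_SIZE


# A 10 GB vdisk consumes no data blocks until it is written to.
vdisk = ThinContainer(10 * 2**30)
space_at_creation = vdisk.allocated_bytes()
vdisk.write_block(7, b"\x01" * BLOCK_SIZE)
space_after_one_write = vdisk.allocated_bytes()
```

In this sketch, the space consumed grows by one block per newly written block, while the maximum addressable space remains fixed at the size specified at creation.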
FIG. 1 is a schematic block diagram of an (inode structure) buffer tree 100 of an exemplary thinly provisioned data container. This (inode) buffer tree structure 100 is created when, for example, a vdisk is first created by the file system as thinly provisioned. In a typical thinly provisioned data container, only the inode 105 is actually written to disk. The remainder of the data container is not written to or otherwise physically stored on the disks storing the data container. The data container 100 includes a completed inode 105; however, it does not contain indirect blocks 110, 120 or file data blocks 125 (as shown in phantom). Thus, these phantom blocks (i.e., 110, 120, 125) are not generated when the data container is created, although they will be written to disk as the data container is populated. By only writing the inode to disk when a thinly provisioned data container is generated, substantial time is saved as the number of disk accesses is reduced. Additionally, only the storage space on the disks that is needed to hold the contents of the data container is utilized. Illustratively, the file system will make appropriate space reservations to ensure that the entire thinly provisioned data container may be written to disk. Space reservation techniques are described in U.S. patent application Ser. No. 10/423,391, entitled SYSTEM AND METHOD FOR RESERVING SPACE TO GUARANTEE FILE WRITABILITY IN A FILE SYSTEM SUPPORTING PERSISTENT CONSISTENCY POINT IMAGES, by Peter F. Corbett, and now issued as U.S. Pat. No. 7,577,692, on Aug. 18, 2009.
FIG. 2 is a schematic block diagram of an exemplary (inode) buffer tree structure 200 of a partially filled in thinly provisioned data container that includes original inode 105. Here, indirect blocks 210, 220 and exemplary file data block 225 have been populated (filled in) in response to one or more write operations to the data container. Continued write operations will result in filling in additional data blocks, for example, file data block 325 as shown in the exemplary (inode) buffer tree structure 300 of FIG. 3. Eventually, when the data container has been completely filled, all blocks, including such blocks as indirect blocks 420 and associated file data blocks (not shown) will be completed as illustrated in the schematic block diagram of an exemplary inode structure 400 in FIG. 4. At such time, the thinly provisioned data container has been completely filled in and each block is associated with an actual block on disk.
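By way of illustration only, the progressive population of the buffer tree may be sketched as follows. The sketch assumes two levels of indirect blocks beneath the inode and a deliberately small fan-out of four pointers per block; the class name and the nested dictionaries are hypothetical and do not represent an on-disk format.

```python
FANOUT = 4  # assumed pointers per inode/indirect block (tiny, for illustration)


class BufferTree:
    """Hypothetical buffer tree: inode -> indirect -> indirect -> data."""

    def __init__(self):
        # At creation only the inode exists; it points to no indirect blocks.
        self.inode = {}

    def write(self, block_no, data):
        if not 0 <= block_no < FANOUT ** 3:
            raise IndexError("block outside the addressable space")
        top, rest = divmod(block_no, FANOUT * FANOUT)
        mid, leaf = divmod(rest, FANOUT)
        # Indirect blocks are generated only when a write first needs them.
        level1 = self.inode.setdefault(top, {})
        level2 = level1.setdefault(mid, {})
        level2[leaf] = data

    def counts(self):
        """Return (level-1 indirect blocks, level-2 indirect blocks, data blocks)."""
        level1 = len(self.inode)
        level2 = sum(len(l1) for l1 in self.inode.values())
        data = sum(len(l2) for l1 in self.inode.values() for l2 in l1.values())
        return level1, level2, data


tree = BufferTree()
empty_counts = tree.counts()      # only the inode exists, as in FIG. 1
tree.write(0, b"first block")
partial_counts = tree.counts()    # one path filled in, as in FIG. 2
```

Each write fills in at most one indirect block per level plus one data block, so the tree approaches the fully populated state only as the entire addressable space is written.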
A known environment for utilizing a storage system with a thinly provisioned data container, i.e., a thinly provisioned LUN, involves overlaying a host-side file system onto the thinly provisioned LUN. In such an environment, the host (or client of the storage system) includes a file system that utilizes the exported LUN as storage and maintains structured storage, e.g., a file system, on the blocks of the LUN. However, a noted disadvantage is that the host-side file system does not communicate status to the storage system concerning the deletion or deallocation of blocks within the LUN. Although the file system typically records appropriate metadata entries when a file is deleted, no status message is passed to the storage system that notifies the system that certain blocks of the LUN are no longer in use. Thus, while the LUN may dynamically grow by allocating additional blocks (up to its maximum number of addressable blocks) as needed, it will not deallocate blocks as files are deleted in the host-side file system. For example, if a LUN is generated with a maximum size of 100 GB and then a 50 GB file is written to it, the LUN will allocate 50 GB of space on the storage system. If the 50 GB file is thereafter deleted in the host-side file system, that file system records appropriate metadata entries and frees its file system pointers. However, the LUN will still occupy 50 GB of space on the storage system, even though the 50 GB is now unused space within the LUN.
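By way of illustration only, the noted disadvantage may be sketched as follows. For brevity, the sketch assumes a block size of 1 GB so that block counts read directly as gigabytes; the class names, the free list and the file table are hypothetical and do not represent any particular host-side file system.

```python
class ThinLun:
    """Hypothetical thinly provisioned LUN on the storage system."""

    def __init__(self, max_blocks):
        self.max_blocks = max_blocks
        self.allocated = set()  # blocks backed by real storage

    def write(self, block_no):
        if not 0 <= block_no < self.max_blocks:
            raise IndexError("block outside the addressable space")
        self.allocated.add(block_no)  # the LUN grows on first write


class HostFileSystem:
    """Hypothetical host-side file system layered on the LUN."""

    def __init__(self, lun):
        self.lun = lun
        self.files = {}                          # file name -> block numbers
        self.free = list(range(lun.max_blocks))  # host-side free list

    def create(self, name, n_blocks):
        if n_blocks > len(self.free):
            raise OSError("no space left on device")
        blocks = [self.free.pop(0) for _ in range(n_blocks)]
        for block_no in blocks:
            self.lun.write(block_no)
        self.files[name] = blocks

    def delete(self, name):
        # Only host-side metadata changes; no status message is passed to
        # the storage system, so the LUN keeps its blocks allocated.
        self.free.extend(self.files.pop(name))

    def used_blocks(self):
        return sum(len(blocks) for blocks in self.files.values())


lun = ThinLun(100)               # maximum size of 100 GB
fs = HostFileSystem(lun)
fs.create("big_file", 50)        # a 50 GB file is written
fs.delete("big_file")            # and thereafter deleted on the host
host_view_gb = fs.used_blocks()
storage_view_gb = len(lun.allocated)
```

After the deletion, the host-side file system reports no space in use, while the LUN still occupies 50 GB on the storage system.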