The present invention relates to file systems and, more specifically, to a system and method for representing named data streams within an on-disk structure of a file system.
A network storage appliance is a special-purpose computer that provides file service relating to the organization of information on storage devices, such as disks. The network storage appliance or filer includes an operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on the disks. Each “on-disk” file may be implemented as a set of disk blocks configured to store information, such as text, whereas the directory may be implemented as a specially formatted file in which information about other files and directories is stored. An example of a file system that is configured to operate on a filer is the Write Anywhere File Layout (WAFL™) file system available from Network Appliance, Inc., Santa Clara, Calif.
Broadly stated, the on-disk format representation of the WAFL file system is block-based using, e.g., 4 kilobyte (KB) blocks and using inodes to describe the files. An inode is a data structure used to store information, such as meta-data, about a file. That is, the information contained in an inode may include, e.g., ownership of the file, access permission for the file, size of the file, file type and location of the data for the file on disk. The WAFL file system uses a file handle, i.e., an identifier that includes an inode number, to retrieve an inode from disk. The WAFL file system also uses files to store meta-data describing the layout of its file system. These meta-data files include, among others, an inode file. The on-disk format structure of the WAFL file system, including the inodes and inode file, is disclosed and described in U.S. Pat. No. 5,819,292 titled Method for Maintaining Consistent States of a File System and for Creating User-Accessible Read-Only Copies of a File System by David Hitz et al., issued on Oct. 6, 1998 and assigned to the assignee of the present invention.
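By way of illustration only, the relationship between a file handle, an inode number and an inode may be sketched as follows. The sketch is a simplified in-memory model, not the WAFL on-disk format; the field names, the `inode_file` dictionary and the `get_inode` helper are assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List

BLOCK_SIZE = 4 * 1024  # 4 KB blocks, as in the text


@dataclass
class Inode:
    """Meta-data about one file (field names are illustrative)."""
    owner: str
    permissions: int      # e.g. POSIX-style mode bits (an assumption)
    size: int             # file size in bytes
    file_type: str        # e.g. "regular" or "directory"
    block_numbers: List[int] = field(default_factory=list)  # location of data


@dataclass(frozen=True)
class FileHandle:
    """Identifier that includes an inode number."""
    inode_number: int


# Stand-in for the inode file: maps inode number -> inode.
inode_file: Dict[int, Inode] = {}


def get_inode(handle: FileHandle) -> Inode:
    """Retrieve the inode that a file handle refers to."""
    return inode_file[handle.inode_number]


inode_file[96] = Inode("alice", 0o644, 10_000, "regular", [500, 501, 502])
inode = get_inode(FileHandle(96))
blocks_needed = -(-inode.size // BLOCK_SIZE)  # ceiling division
```

In this model a 10,000-byte file occupies three 4 KB blocks, matching the three block numbers recorded in its inode.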
A file system designed for use with the Windows NT operating system is the NT™ file system (NTFS) available from Microsoft Corporation, Redmond, Wash. In NTFS, each unit of information associated with a file, including its name, its owner, its time stamps and its data contents, is implemented as a file attribute. Both files and directories have attributes, wherein each attribute consists of a single stream or sequence of bytes. This implementation facilitates the addition of more attributes, including data content attributes, to a file. Therefore, NTFS files and directories may contain multiple data streams. An NTFS file has one default data stream, $DATA, through which the file data is accessed, i.e., read and written; a directory, however, generally does not have a default data stream. Notably, an application can create additional named data streams for files and directories, and access them by referring to their names. The NTFS file system and multiple data streams are well known and described in Inside the Windows NT File System by Helen Custer, Microsoft Press, 1994.
A filer may be further configured to operate according to a client/server model of information delivery to thereby allow many clients to access files stored on a server, e.g., the filer. In this model, the client may comprise an application, such as a file system protocol, executing on a computer that “connects” to the filer over a computer network, such as a point-to-point link or a shared local area network. Each client may request the services of the filer by issuing file system protocol messages (in the form of packets) to the filer over the network. By supporting a plurality of file system protocols, such as the conventional Common Internet File System (CIFS) protocol for the Microsoft® Windows™ operating system, the utility of the filer may be enhanced for networking clients. File systems available from Microsoft Corporation and Apple Computer Inc. provide support (“representation”) for the multiple named data streams feature of the NTFS file system; the present invention is generally directed to providing support for that feature within the WAFL file system.
Therefore, an object of the present invention is to provide a network storage appliance configured to represent and support multiple data streams.
Another object of the present invention is to provide an operating system of a filer that enables client applications to create, access and modify files stored on the filer by issuing requests directed to named data streams.
Yet another object of the present invention is to provide a file system capable of creating, accessing and modifying data stored on a filer in response to file system protocol packets embodying multiple named data stream requests.
The present invention comprises a technique for providing on-disk representations of multiple named data streams for a file system of a network storage appliance. In the illustrative embodiment, the network storage appliance or filer includes a file system that implements a Write Anywhere File Layout (WAFL) disk format, wherein files are described by inodes of which there may be various types, including directory and regular inodes. According to an aspect of the invention, a novel stream inode type is defined that represents named data streams in the WAFL file system. That is, multiple named data streams may be stored on disk(s) of the filer as representations embodying the stream inode type associated with a file. Each on-disk file may have a default data stream along with at least one named data stream representation.
Specifically, each stream inode has its own size, file share locks, byte range locks and data blocks. However, file attributes, such as time stamps, group and user ownership information, and access control lists are common for all named data streams and are stored in an on-disk base inode. The default data stream, along with its size, data blocks, file share locks and byte range locks, is also stored in the base inode. Additionally, the names and file handles of the data streams are stored in a “hidden” directory within the file system that is referenced by the base inode. According to another aspect of the invention, the hidden directory is represented as a novel stream_dir inode type. The hidden directory is “invisible” in a directory hierarchy that is viewed by a user (e.g., a client) external to the file system and, thus, is inaccessible through an external file system protocol, such as the Common Internet File System protocol.
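The division of state between the base inode, the hidden directory and the stream inodes may be pictured with the following minimal sketch. It is an in-memory illustration under stated assumptions: the class names, field names and the `stream_attributes` helper are hypothetical and do not reflect the actual on-disk layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class StreamState:
    """Per-stream state: every stream, default or named, has its own copy."""
    size: int = 0
    data_blocks: List[int] = field(default_factory=list)
    share_locks: List[str] = field(default_factory=list)
    byte_range_locks: List[Tuple[int, int]] = field(default_factory=list)


@dataclass
class StreamInode:
    """Named stream: carries only per-stream state."""
    state: StreamState = field(default_factory=StreamState)


@dataclass
class StreamDirInode:
    """Hidden directory: maps stream name -> inode number of the stream."""
    entries: Dict[str, int] = field(default_factory=dict)


@dataclass
class BaseInode:
    """Base inode: shared attributes plus the default data stream."""
    owner: str
    group: str
    acl: List[str]
    timestamps: Dict[str, float]
    default_stream: StreamState = field(default_factory=StreamState)
    stream_dir: Optional[int] = None  # inode number of the hidden directory


# Hypothetical in-memory inode table keyed by inode number.
inodes = {
    100: BaseInode("alice", "staff", ["alice:rw"], {"atime": 0.0, "mtime": 0.0},
                   StreamState(size=8192, data_blocks=[7, 8]), stream_dir=101),
    101: StreamDirInode({"summary": 102}),
    102: StreamInode(StreamState(size=300, data_blocks=[9])),
}


def stream_attributes(base_num: int, stream_name: str) -> dict:
    """Combine shared attributes (base inode) with the per-stream size."""
    base = inodes[base_num]
    stream = inodes[inodes[base.stream_dir].entries[stream_name]]
    return {"owner": base.owner, "acl": base.acl, "size": stream.state.size}


attrs = stream_attributes(100, "summary")
```

Here the owner and access control list of the "summary" stream come from the base inode, while its size comes from its own stream inode.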
Operations that can be applied to a named data stream include (i) create, (ii) delete, (iii) read and (iv) write operations. Broadly stated, in the case of a create operation, the WAFL file system “opens” a base inode associated with the default named stream, if it exists. Otherwise, the file system creates the base inode by allocating a free inode and inserting the name of the base inode into an entry of a parent directory. Next, the hidden directory is created (if it does not exist) and the base inode is configured to reference that directory. Thereafter, the file system allocates another free inode and inserts the name of the stream into the hidden directory, thereby creating the named stream. Each created named stream, or the entire inode with the hidden directory and its named streams, can be deleted in accordance with a delete operation.
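The create and delete steps described above may be sketched as follows. This is a toy in-memory model offered only as an illustration; `ToyFileSystem`, its methods, the sequential inode allocator and the file and stream names are all assumptions, not the file system's actual interfaces.

```python
from typing import Dict, Optional


class Inode:
    def __init__(self, inode_type: str) -> None:
        self.inode_type = inode_type
        self.entries: Dict[str, int] = {}      # used by directory-like inodes
        self.stream_dir: Optional[int] = None  # used by base inodes


class ToyFileSystem:
    def __init__(self) -> None:
        self.inodes: Dict[int, Inode] = {}
        self.next_free = 1

    def allocate(self, inode_type: str) -> int:
        """Allocate a free inode and return its number."""
        number = self.next_free
        self.next_free += 1
        self.inodes[number] = Inode(inode_type)
        return number

    def create_stream(self, parent_num: int, file_name: str,
                      stream_name: str) -> int:
        parent = self.inodes[parent_num]
        # Open the base inode if it exists; otherwise create it and
        # insert its name into the parent directory.
        base_num = parent.entries.get(file_name)
        if base_num is None:
            base_num = self.allocate("regular")
            parent.entries[file_name] = base_num
        base = self.inodes[base_num]
        # Create the hidden directory if needed and reference it.
        if base.stream_dir is None:
            base.stream_dir = self.allocate("stream_dir")
        hidden = self.inodes[base.stream_dir]
        if stream_name in hidden.entries:
            raise FileExistsError(stream_name)
        # Allocate the stream inode and insert its name.
        stream_num = self.allocate("stream")
        hidden.entries[stream_name] = stream_num
        return stream_num

    def delete_stream(self, base_num: int, stream_name: str) -> None:
        """Remove one named stream from the hidden directory."""
        hidden = self.inodes[self.inodes[base_num].stream_dir]
        del self.inodes[hidden.entries.pop(stream_name)]


fs = ToyFileSystem()
root = fs.allocate("directory")
summary = fs.create_stream(root, "report.doc", "summary")
thumbnail = fs.create_stream(root, "report.doc", "thumbnail")
base_num = fs.inodes[root].entries["report.doc"]
fs.delete_stream(base_num, "thumbnail")
```

Note that the parent directory lists only "report.doc"; the stream names live solely in the hidden directory referenced by the base inode.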
Read access to a named stream in accordance with a read operation involves loading (from disk) the stream inode and its base inode into a memory of the filer. The base inode is loaded to update an access time stamp stored therein because, according to another aspect of the invention, time stamp attributes for named streams are represented by their base inodes. Data blocks associated with the base inode do not need to be retrieved from disk. However, the relevant data blocks associated with the named stream must be read from disk if they are not present in the memory so that the data contained therein may be retrieved and delivered to, e.g., the client issuing the read operation.
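A minimal sketch of such a read, under stated assumptions, is given below. The dictionaries standing in for inodes and disk blocks, the `buffer_cache`, the `disk_reads` log and the 4-byte block size are hypothetical simplifications chosen to keep the example small.

```python
from typing import Dict, List

BLOCK_SIZE = 4  # tiny blocks keep the example readable; the text uses 4 KB

# Hypothetical "disk": block number -> block contents.
disk_blocks: Dict[int, bytes] = {10: b"base", 20: b"name", 21: b"d st", 22: b"ream"}
base_inode = {"atime": 0.0, "blocks": [10]}
stream_inode = {"size": 12, "blocks": [20, 21, 22]}

buffer_cache: Dict[int, bytes] = {}  # blocks already present in memory
disk_reads: List[int] = []           # log of blocks fetched from "disk"


def load_block(block_no: int) -> bytes:
    """Return a block, reading it from disk only if it is not cached."""
    if block_no not in buffer_cache:
        disk_reads.append(block_no)
        buffer_cache[block_no] = disk_blocks[block_no]
    return buffer_cache[block_no]


def read_stream(base: dict, stream: dict, offset: int, length: int,
                now: float) -> bytes:
    # The access time stamp lives in the base inode, not the stream inode.
    base["atime"] = now
    end = min(offset + length, stream["size"])
    if offset >= end:
        return b""
    first, last = offset // BLOCK_SIZE, (end - 1) // BLOCK_SIZE
    # Only the stream's own blocks covering the range are loaded.
    data = b"".join(load_block(stream["blocks"][i])
                    for i in range(first, last + 1))
    start = offset - first * BLOCK_SIZE
    return data[start:start + (end - offset)]


result = read_stream(base_inode, stream_inode, offset=4, length=8, now=100.0)
```

In this model the read touches stream blocks 21 and 22 only; block 10, which belongs to the base inode's default stream, is never fetched.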
Similarly, write access to a named stream in accordance with a write operation involves loading the stream inode and its base inode. Here, the base inode is loaded into the memory to update a modification time stamp stored therein for the named stream. Data blocks that store the named stream data on disks are preferably referenced by the stream inode; those data blocks are updated (modified) as instructed by the write operation and the modified data are then written to disk. In addition, attributes of the named stream may be modified to reflect, e.g., a change in the size of the stream. For example, size information contained in the stream inode is updated if the write operation results in extending (“growing”) the file.
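The write path may be illustrated in the same simplified manner. The sketch below overwrites blocks in place for brevity and therefore does not model write-anywhere block allocation; the dictionaries, the `write_stream` helper and the 4-byte block size are assumptions made for the example.

```python
from typing import Dict

BLOCK_SIZE = 4  # tiny blocks for readability; the text uses 4 KB


def write_stream(base: dict, stream: dict, blocks: Dict[int, bytes],
                 offset: int, data: bytes, now: float) -> None:
    # The modification time stamp is kept in the base inode.
    base["mtime"] = now
    end = offset + len(data)
    # Allocate zero-filled blocks if the write extends past the last block.
    while len(stream["blocks"]) * BLOCK_SIZE < end:
        new_no = max(blocks, default=0) + 1
        blocks[new_no] = bytes(BLOCK_SIZE)
        stream["blocks"].append(new_no)
    # Modify the affected data blocks referenced by the stream inode.
    for i, byte in enumerate(data):
        pos = offset + i
        block_no = stream["blocks"][pos // BLOCK_SIZE]
        buf = bytearray(blocks[block_no])
        buf[pos % BLOCK_SIZE] = byte
        blocks[block_no] = bytes(buf)
    # Grow the stream's size if the write went past the old end.
    stream["size"] = max(stream["size"], end)


blocks = {1: b"abcd"}
base_inode = {"mtime": 0.0, "size": 0}
stream_inode = {"size": 4, "blocks": [1]}
write_stream(base_inode, stream_inode, blocks, offset=2, data=b"XYZ", now=5.0)
```

After the call the stream has grown from 4 to 5 bytes and gained a second block, while the only change to the base inode is its modification time stamp.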
An access control list (security) of the base inode can be modified. Also, the name and size of each named stream can be listed. The base inode and the hidden directory (stream directory) are loaded from disk if they are not already present in memory, and the hidden directory is traversed to obtain the name of each named stream. Also, each named stream inode is read from disk if it is not present in memory to retrieve its size. The name and size of each named stream may then be returned to the client.
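The listing of stream names and sizes may be sketched as a traversal of the hidden directory. The inode table, its dictionary layout and the `list_streams` helper below are hypothetical; the sorting of names is a choice made for this example, not something the text prescribes.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory inode table keyed by inode number.
inodes: Dict[int, dict] = {
    100: {"type": "regular", "stream_dir": 101},
    101: {"type": "stream_dir", "entries": {"thumbnail": 103, "summary": 102}},
    102: {"type": "stream", "size": 300},
    103: {"type": "stream", "size": 4096},
    200: {"type": "regular", "stream_dir": None},  # file without named streams
}


def list_streams(base_num: int) -> List[Tuple[str, int]]:
    """Return (name, size) for each named stream of a base inode."""
    base = inodes[base_num]
    if base["stream_dir"] is None:
        return []
    hidden = inodes[base["stream_dir"]]
    # The name comes from the hidden directory entry; the size comes
    # from the stream inode that the entry refers to.
    return [(name, inodes[num]["size"])
            for name, num in sorted(hidden["entries"].items())]


listing = list_streams(100)
empty = list_streams(200)
```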
Advantageously, the present invention allows an on-disk inode to have multiple named data stream representations, whereby the data blocks for each named stream can be retrieved independently of the data blocks of other streams. By storing different portions of data in different named streams, a client need only access those named stream data blocks of immediate and/or particular interest. Support for this multiple named stream feature reduces the amount of data that the WAFL file system needs to fetch from disk, thus increasing the efficiency of its data accesses and, notably, its file service provided to the client.