1. Field of the Invention
The present invention relates to computer systems and, particularly, to a system and method for providing file sharing in a computer file system to allow for efficient multi-user access.
2. Description of Related Art
Typically, the operating system of a computer system includes a file system to provide users with an interface while working with data on the computer system's disk and to provide the shared use of files by several users and processes. Generally, the term “file system” encompasses the totality of all files on the disk and the sets of data structures used to manage files, such as, for example, file directories, file descriptors, free and used disk space allocation tables, and the like. Accordingly, end users generally regard the computer file system as being composed of files and a number of directories. Each file usually stores data and is associated with a symbolic name. Each directory may contain subdirectories, files or both. The files and directories are typically stored on a disk or similar storage device.
Operating systems such as UNIX, Linux and Microsoft Windows manage computer file systems by defining a file object hierarchy. A file object hierarchy begins with a root directory and descends through the file tree. The file address is then described as an access path, i.e., a succession of directories and subdirectories leading to the file. The process of resolving a file address is called access path analysis or path traversal. For instance, the path “/r/a/b/file” contains the root directory (/), subdirectories “r”, “a” and “b” and then the file. Typically, the processes within an operating system interact with the file system through a regular set of functions. For example, these functions usually include open, close, write and other system calls. For instance, a file may be opened by the open function, which takes the file name as its argument.
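By way of illustration only, access path analysis may be sketched in Python as splitting an absolute path into the components visited in order. The function name `path_components` is hypothetical and does not correspond to any operating system interface.

```python
import posixpath

def path_components(path):
    """Return the root and each successive name on an absolute POSIX path."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    # normpath collapses duplicate slashes and "." or ".." segments.
    normalized = posixpath.normpath(path)
    names = [name for name in normalized.split("/") if name]
    return ["/"] + names

print(path_components("/r/a/b/file"))  # ['/', 'r', 'a', 'b', 'file']
```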
The file system may also include intermediate data structures containing data associated with the file system to facilitate file access. This data is called metadata and may include, for example, data corresponding to the storage location of the files, e.g., where the file is located on the hard drive or other storage medium. For example, in the context of a UNIX operating system, these intermediate data structures are called “inodes,” i.e., index nodes. An inode is a data structure that contains information about files in UNIX file systems. Each file has an inode and is identified by an inode number (also called an i-number) in the file system where it resides. The inodes provide important information on files such as user and group ownership, access mode (read, write, execute permissions) and type. The inodes are created when a file system is created. There is a fixed number of inodes, which corresponds to the maximum number of files the system can hold.
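As a minimal sketch, the inode fields mentioned above can be inspected on a UNIX-like system through the standard Python `os.stat` call. The helper name `describe_inode` and the dictionary keys are illustrative assumptions.

```python
import os
import stat
import tempfile

def describe_inode(path):
    """Collect a few metadata fields that UNIX keeps in a file's inode."""
    info = os.stat(path)
    return {
        "inode_number": info.st_ino,                  # the i-number
        "owner_uid": info.st_uid,                     # user ownership
        "group_gid": info.st_gid,                     # group ownership
        "permissions": stat.filemode(info.st_mode),   # e.g. "-rw-------"
        "is_directory": stat.S_ISDIR(info.st_mode),   # file type
    }

with tempfile.NamedTemporaryFile() as handle:
    print(describe_inode(handle.name))
```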
Usually, computer file systems store this intermediate data concerning the location of stored files as separate structures in the same place as the file content is stored. The functions responsible for file searching, implemented in the operating system kernel, for example, first locate the intermediate data and then locate the file data that is sought. Directories may also have intermediate data structures containing metadata. File systems may also generate intermediate file data “on the fly,” for example, at the moment when the file is requested from the file system. For instance, the Network File System (NFS) developed by Sun Microsystems of Santa Clara, Calif., provides for on-the-fly intermediate data creation.
In addition, intermediate data structures may include reference files or links that are associated with or point to other files. When a link is accessed, the link itself is not opened. Instead, only the file to which the link refers is opened. Thus, the intermediate data structure in a link may contain data referring to other files that are not requested. For instance, the intermediate data structure may contain the path to another file that will be found and opened instead of this reference link. There are several types of links or references. For example, references that include a symbolic name of another file are called symbolic links. References that refer to another file's intermediate structure are called hard links. The type of link used is generally determined by the operating modes supported by the operating system.
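The difference between the two link types may be illustrated with the following sketch, which assumes a UNIX-like system where both link types are permitted. The helper name `link_demo` and the file names are hypothetical.

```python
import os
import tempfile

def link_demo(directory):
    """Create one hard link and one symbolic link to the same file."""
    target = os.path.join(directory, "target.txt")
    hard = os.path.join(directory, "hard.txt")
    soft = os.path.join(directory, "soft.txt")
    with open(target, "w") as f:
        f.write("data")
    os.link(target, hard)      # hard link: a second name for the same inode
    os.symlink(target, soft)   # symbolic link: stores the target's path name
    with open(soft) as f:      # opening the link opens the file it refers to
        content = f.read()
    return {
        "same_inode": os.stat(hard).st_ino == os.stat(target).st_ino,
        "symlink_target": os.readlink(soft),
        "read_via_symlink": content,
    }

with tempfile.TemporaryDirectory() as d:
    print(link_demo(d))
```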
File systems may provide several functions. As discussed above, the most basic task of a file system is to provide access to files. File systems may also enhance system performance with additional functions such as, for example, caching, access markers and fault-tolerance.
The multi-user operating mode of a computer system may generally allow the operating system processes of different users to operate simultaneously. Each process within the operating system is usually associated with information that identifies the user. For instance, in a UNIX system, this information is typically an identifier of the user and group on whose behalf this process is being executed. When accessing a file, the operating system identifies the user requesting the file operation and determines whether the operation is permitted for that user. Generally, this determination may be made upon opening the file, e.g., when calling a function of the type “open.” Thus, on the basis of this access information, the operating system may organize different views of the same file system tree based upon selected parameters, such as, for example, time, operation type or user information.
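The permission determination described above may be sketched, in simplified form, as the classic UNIX owner/group/other check. The function name `may_access` and its parameters are assumptions for illustration; superuser handling and access control lists are deliberately omitted.

```python
def may_access(file_uid, file_gid, mode, uid, gids, want):
    """Select the owner, group, or other permission bits, then test them.

    `want` is a 3-bit mask: 4 = read, 2 = write, 1 = execute.
    """
    if uid == file_uid:
        bits = (mode >> 6) & 0o7   # owner class
    elif file_gid in gids:
        bits = (mode >> 3) & 0o7   # group class
    else:
        bits = mode & 0o7          # everyone else
    return (bits & want) == want

# A file owned by uid 1000, gid 100, with mode rw-r-----
print(may_access(1000, 100, 0o640, uid=1001, gids={100}, want=4))  # True
print(may_access(1000, 100, 0o640, uid=1001, gids={100}, want=2))  # False
```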
To unite different types of computer file systems, these file systems may be mounted. For any directory inside the file system, it is possible to mount another file system into that existing directory. Thus, one tree of the computer file system appears inside another file tree. The operating system uses a specific system call of the operating system kernel to mount a file system. This system call includes at least two arguments: the mounting point, e.g., the directory inside of the current file system, and the file system itself, e.g., the storage device or memory location where the data resides. Depending on the file system, additional information containing parameters of the specific file system type may be included. During analysis of the access path to the selected data file, the operating system detects the moment when the path “passes” through this mounting point; “below” this point, the analysis is performed using operations specific to the given file system. The set of these operations is defined according to the parameters established during the file system mounting process.
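One possible way to picture how path analysis detects a mounting point is a longest-prefix match against a table of mounts, as in the sketch below. The table contents, the function name `resolve_mount` and the file system labels are hypothetical; a real kernel performs this check component by component during traversal.

```python
def resolve_mount(path, mount_table):
    """Return (mount_point, fs_name, remainder) for the deepest matching mount."""
    best = None
    for mount_point, fs_name in mount_table.items():
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_name)
    if best is None:
        raise LookupError("no file system mounted for " + path)
    # The remainder is analyzed with operations specific to that file system.
    remainder = path[len(best[0]):].lstrip("/")
    return best[0], best[1], remainder

table = {"/": "rootfs", "/mnt/data": "datafs"}
print(resolve_mount("/mnt/data/c/d", table))  # ('/mnt/data', 'datafs', 'c/d')
print(resolve_mount("/etc/passwd", table))    # ('/', 'rootfs', 'etc/passwd')
```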
The UnionFS file system, developed in the FreeBSD UNIX operating system, implements a similar technique. One feature of UnionFS is that each user can have a different view of the same file tree. In order to provide this feature, two trees of the file system are built when mounting UnionFS. The first tree is a read-only tree. The second tree is built during the user's session and is used for auxiliary purposes. This second tree is defined as an additional parameter when mounting.
When calling a file within the shareable tree, a search is performed in two ways. First, the search may be based on a path name that is computed based on the location of the file. For example, the mounting point of UnionFS may be located at “/a/b/u,” and the file to be addressed may be at “/a/b/u/c/d/e.” The second tree, mounted to the same point, is located starting from the address “/x/y/.” Then an additional address is computed as “/a/b/u/c/d/e” minus “/a/b/u” plus “/x/y/.” As a result, the additional address is computed as “/x/y/c/d/e.”
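The address computation in the example above may be sketched as follows. The function name `rebase_path` is hypothetical and the code is not taken from UnionFS; it only reproduces the “minus mounting point, plus second tree” arithmetic.

```python
import posixpath

def rebase_path(path, mount_point, second_root):
    """Compute `path` minus `mount_point` plus `second_root`."""
    relative = posixpath.relpath(path, mount_point)
    if relative == ".." or relative.startswith("../"):
        raise ValueError("path is not below the mounting point")
    return posixpath.normpath(posixpath.join(second_root, relative))

print(rebase_path("/a/b/u/c/d/e", "/a/b/u", "/x/y/"))  # /x/y/c/d/e
```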
Thus, the specific intermediate data structure (e.g., inode) is searched using the computed path name. If the specific intermediate data structure (inode) is found, then it is assumed that the file is found and the requested operation will be performed on this file. If the file is not found, then a second search by the direct address is provided. If the file is not found there either, the system returns the corresponding error code. Otherwise, the system acts according to the requested operation. If the file opens in response to an operation to modify its content or associated data, then the file is first copied to the computed address as described above, and the operation is performed on the new copy. Otherwise, the operation is performed on the file located in the shareable tree by the requested address.
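The lookup order and the copy-before-modify step described above may be illustrated with the following simplified model, in which a dictionary stands in for the file system. The names `union_open` and `files` are assumptions made for this sketch, and the requested path is assumed to lie below the mounting point.

```python
def union_open(path, mount_point, second_root, files, for_write=False):
    """Return the address on which the requested operation should act.

    `files` maps an address to file content and stands in for both trees.
    """
    computed = second_root.rstrip("/") + path[len(mount_point):]
    if computed in files:
        return computed                  # found at the computed address
    if path not in files:
        raise FileNotFoundError(path)    # found in neither tree
    if for_write:
        files[computed] = files[path]    # copy first, then modify the copy
        return computed
    return path                          # unmodified file in the shareable tree

files = {"/a/b/u/c/d/e": "shared"}
print(union_open("/a/b/u/c/d/e", "/a/b/u", "/x/y/", files))                  # /a/b/u/c/d/e
print(union_open("/a/b/u/c/d/e", "/a/b/u", "/x/y/", files, for_write=True))  # /x/y/c/d/e
```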
One way to change the search address of the file object, and, accordingly, the position of the root file system for a group of processes, is to use a primitive that is analogous to the UNIX kernel primitive “chroot.” The operation of this primitive is based on the principle of shifting the real root of the file system or “root” directory to some location for a selected group of processes, for instance, for all processes of one user. Then all file operations for these processes are performed only within the sub-tree of the selected file system.
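The effect of shifting the root may be modeled, purely for illustration, as a path mapping in which no requested path can resolve outside the new root. This sketch does not call the real chroot primitive and ignores symbolic links; the function name `resolve_in_root` and the example directory are hypothetical.

```python
import posixpath

def resolve_in_root(new_root, requested):
    """Map a path as seen by a confined process to the real path."""
    # Normalizing against "/" first means ".." cannot climb above the new root.
    confined = posixpath.normpath(posixpath.join("/", requested))
    return posixpath.normpath(new_root + "/" + confined.lstrip("/"))

print(resolve_in_root("/home/user1", "/etc/passwd"))        # /home/user1/etc/passwd
print(resolve_in_root("/home/user1", "/../../etc/passwd"))  # /home/user1/etc/passwd
```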
Another example of this type of system is one based upon “snaps” of the file system, or tree snapshots, in which modifications to the entire file system are chronologically arranged. All modifications made in the file system or any of its parts during a period of time are saved in a separate tree of the file system. Such separate chronologically arranged trees represent the complete history of file system modifications for a discrete period of time. Thus, to determine the file state at a fixed moment of time, the operator searches for the file in the most recently accessed file tree. If the file is not located, then the previous tree is searched.
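A minimal sketch of such a search follows, assuming that the trees are kept in chronological order and that the search proceeds from the newest tree backward. Each tree is modeled as a dictionary of the files modified in that period; the name `find_in_snapshots` is hypothetical, and file deletion is not modeled.

```python
def find_in_snapshots(path, snapshots):
    """Search newest-to-oldest; `snapshots` is ordered oldest first."""
    for index in range(len(snapshots) - 1, -1, -1):
        if path in snapshots[index]:
            return index, snapshots[index][path]
    raise FileNotFoundError(path)

history = [
    {"/a": "v1", "/b": "v1"},   # oldest tree
    {"/a": "v2"},               # later modifications to /a only
    {},                         # most recent period, nothing modified
]
print(find_in_snapshots("/a", history))  # (1, 'v2')
print(find_in_snapshots("/b", history))  # (0, 'v1')
```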
Similarly, the Mirage File System (MFS) from IBM of Armonk, N.Y., describes a system consisting of a number of trees and a specific file search mechanism that depends on the file type, extension and sequence of requests, among other parameters. One of the principles of this computer file system is the substitution of the file search path whereby the search path is expanded to other file system locations associated with the file object being searched. For example, this system offers an implementation of a system of snapshots.
U.S. Pat. No. 6,289,356 also describes an example of implementation of specific intermediate data structures, in which the file system is organized with a strictly regulated mode of recording modifications. The disclosed system provides the transition of file system states so that, at any moment of time, the system is in the correct state. Additionally, the system creates snapshots of the file system by duplicating an intermediate data structure (e.g., inode) without duplicating the files themselves. The system also marks the files chosen to store data file blocks as belonging to some snapshot of the file system. This involves interference with file system functioning at the level of data distribution algorithms.
A robust file system is especially important in multi-user systems, such as, for example, virtual server systems. A virtual server is a server, for example, a Web server, that shares computer resources with other virtual servers. In this context, the term virtual indicates that the virtual server is not a dedicated server, i.e., the entire computer is not dedicated to running the server software. Virtual computer systems have several applications. For example, virtual web servers are a very popular way of providing low-cost web hosting services. Instead of requiring a separate computer for each server, dozens of virtual servers can co-reside on the same computer. In most cases, performance is not affected and each web site behaves as if it were being served by a dedicated server. However, if too many virtual servers reside on the same computer, or if one virtual server starts utilizing an excessive amount of resources, applications such as Web pages, for example, will be delivered more slowly.
In addition to maintaining efficient allocation of resources, providing multi-user access involves other considerations as well, including security, avoiding file corruption and maximizing system efficiency. Accordingly, it is desirable to provide a file system that provides multi-user access but avoids the danger of file corruption, provides security, allows scalability and facilitates the efficient use of limited system resources.