1. Field of the Invention
This invention relates generally to storing and modifying data in file systems.
2. Description of the Related Art
Computer systems are often coupled to one or more networks in a given environment. These computer systems may need data storage or computing power beyond each computer system's individual capacity. As demand grows for additional computing power and storage obtained by sharing the resources of each computer system, cluster technology is an increasing focus of research and development. One important shared resource is data storage.
A cluster may be constructed from a plurality of computer systems coupled to a network. The computer systems that are included in the cluster are referred to as nodes, and the network that couples the nodes is often termed a cluster interconnect. However, merely coupling a plurality of computers to a network does not constitute a cluster. Each node of the cluster must run clustering software to unify each node's otherwise independent operation. Unifying otherwise independent operation makes it possible to increase the computing power and the amount of data storage available.
In computer systems, the storage and retrieval of data typically involves a file system associated with the operating system. A file system may include a collection of management structures that impose a logical and/or systematized structure upon a storage medium.
A cluster file system is a form of file system that may allow a plurality of nodes of a cluster to simultaneously access the same storage medium, such as a storage area network (SAN). Typically, one or more server nodes access the storage medium directly. Server nodes using a cluster file system may provide access to the storage medium as a shared data storage to client nodes. Each client node using the cluster file system may view the shared data storage as a local resource.
A cluster file system is dynamic in function and may include data structures in the shared data storage as well as in other memory mediums associated with the servers and clients. A data structure including user data may be considered a file (or regular file). The file may store the user data in a space of the shared data storage. Other data structures may organize internal data, referred to as metadata, of the cluster file system. Metadata may include information about files, file identity, allocated space, and/or de-allocated space of the cluster file system.
Nodes often include a local cache of the metadata or a local status of the metadata. Typically, one of the server nodes handles metadata updates and is responsible for coordinating accesses by the various nodes such that the integrity and accuracy of the metadata (including local caches of the metadata) and/or local statuses of the metadata are maintained. Since the nodes share only the shared data storage and not a local memory medium, a significant amount of communication between the nodes may take place to correctly synchronize updates to the user data, metadata, and/or local statuses of the metadata in the cluster file system. Such updates may be required when various cluster file system operations are performed, including creating files, allocating space to files, de-allocating space from files, and/or deleting files.
For example, a software program may issue a request to create a file in the cluster file system. The creation of a file in the cluster file system may require communication and/or synchronization of metadata and/or metadata updates between the nodes. After the file is created, the software program may subsequently issue a command to store user data in the file. The cluster file system may respond by allocating a space in the shared data storage to the file. However, such space allocation by the cluster file system may require additional communication and/or synchronization of metadata and/or metadata updates between the nodes. After space has been allocated to the file, the software program may store the desired user data in the file. The multiple communication and synchronization operations of the cluster file system between nodes as described in this example may add undesirable latency and limit performance.
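The sequence in the foregoing example can be illustrated with a toy model that counts synchronization messages. This is only a sketch under stated assumptions: the class `ToyCluster`, its methods, the cost of one message per peer node for each metadata update, and the 4096-byte block size are all hypothetical and are not taken from any particular cluster file system.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical model: a metadata server plus peer nodes caching metadata."""
    peer_count: int
    sync_messages: int = 0
    files: dict = field(default_factory=dict)  # file name -> allocated blocks

    def _synchronize(self) -> None:
        # Assumption: each metadata update costs one message per peer node.
        self.sync_messages += self.peer_count

    def create(self, name: str) -> None:
        # Creating a file updates metadata, so the nodes must synchronize.
        self.files[name] = 0
        self._synchronize()

    def allocate(self, name: str, blocks: int) -> None:
        # Allocating space is a second, separate metadata update.
        self.files[name] += blocks
        self._synchronize()

    def write(self, name: str, data: bytes, block_size: int = 4096) -> None:
        # Allocate any missing space first; storing the bytes is not modeled.
        needed = -(-len(data) // block_size)  # ceiling division
        if needed > self.files[name]:
            self.allocate(name, needed - self.files[name])


cluster = ToyCluster(peer_count=3)
cluster.create("report.txt")               # first round of synchronization
cluster.write("report.txt", b"x" * 10000)  # allocation triggers a second round
print(cluster.sync_messages)               # 6: two rounds of three messages
```

In this model, creating the file and then writing to it costs two separate rounds of synchronization, which corresponds to the added latency described above.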
Likewise, a software program may overwrite a file by first truncating it while opening the file. Common examples are file editors that write a file after changes are made and compilers that write object files as they are generated. Thus, when the cluster file system opens the file, the space associated with the file is de-allocated. Similar to the foregoing, this de-allocation of space may require communication and/or synchronization between the various nodes. Following this, when the software program stores data in the file, the cluster file system allocates new space to the file, thus requiring additional communication and/or synchronization between the nodes. Undesirable latency may thus be introduced, and performance may be adversely affected.
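The truncate-on-open pattern can be observed from an ordinary program. The following is a minimal sketch that uses a local temporary directory as a stand-in for shared data storage; the file name `object.o` is hypothetical. It shows only that the file length drops to zero at open time and grows again on the subsequent write. In a cluster file system, each of those two changes could correspond to a separate metadata update as described above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "object.o")  # hypothetical file name

    # Create a file that already holds data.
    with open(path, "wb") as f:
        f.write(b"old contents" * 100)
    size_before = os.path.getsize(path)

    # Mode "wb" truncates the file as part of opening it, before any write.
    with open(path, "wb") as f:
        size_after_open = os.fstat(f.fileno()).st_size
        f.write(b"new contents")
    size_after_write = os.path.getsize(path)

print(size_before, size_after_open, size_after_write)  # 1200 0 12
```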