The invention relates generally to computer system storage and more particularly to mechanisms (methods and devices) for providing distributed computer system storage having proxy backup/stand-in capability.
It is common for organizations to employ large numbers of computers for tasks such as data storage. Typically, some or all of an organization's computers are interconnected to form a network in which two or more computer systems are capable of exchanging information. With the adoption of computer network technology came the desire for increased storage capacity. Increased storage capacity, in turn, led to a need to distribute file systems across networked computers. In general, distribution of file systems is done by software applications that keep track of files stored across a network. One goal of distributing file systems is to allow a user/application of one computer (or node) in a computer network to access data or an application stored on another node in the computer network. Another goal of distributing file systems is to make this access transparent with respect to the stored object's physical location.
FIG. 1 shows a computer system employing distributed file system technology in accordance with the prior art. As shown, node-A 100 and node-B 102 are interconnected by communication link 104. Illustrative nodes include specialized or general purpose workstations and personal computers. An illustrative communication link employs coaxial or twisted pair cable and the Transmission Control Protocol (TCP). Nodes A and B each execute a local version of a distributed file system, 106 and 108 respectively. Each distributed file system manages the storage of objects to/from a storage unit (e.g., 110 and 112), each of which may include one or more storage devices. Illustrative storage devices include magnetic disks (fixed, floppy, and removable) and optical media such as CD-ROM disks.
One well-known distributed file system is the Network File System (NFS®) from Sun Microsystems, Incorporated of Palo Alto, Calif. In NFS, a server node may make its file system (in part or in whole) shareable through a process known as “exporting.” A client node may gain access to an exported file system through a process known as “mounting” (also referred to as importing). Exporting entails specifying those file systems, or parts thereof, that are to be made available to other nodes (typically through NFS map files). Mounting adds exported file systems to the file structure of a client node at a specified location. Together, the processes of exporting and mounting define the file system namespace.
For example, consider FIG. 2 in which node 200 has local file system 202 including directories X, Y, and Z, and node 204 has local file system 206 including directories α, β, and γ. If node 204 exports, and node 200 imports, file system 206 (often referred to as cross-mounting), node 200 may have combined system namespace 208. From directory structure 208, a user/application on node 200 may access any data object in remote directories α, β, and γ as if α, β, and γ were local directories such as X or Y.
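The FIG. 2 arrangement can be illustrated with a minimal sketch. The `Node` class, its method names, and the ASCII directory names `alpha`, `beta`, and `gamma` (standing in for the three remote directories) are hypothetical and chosen only for illustration; this is not the NFS interface, merely a model of how exporting and mounting might combine two local file systems into one namespace.

```python
class Node:
    """Hypothetical model of a node with a local file system and mounted remote directories."""

    def __init__(self, name, local_dirs):
        self.name = name
        self.local_dirs = set(local_dirs)  # directories stored on this node
        self.exported = set()              # directories offered to other nodes
        self.mounts = {}                   # directory name -> node that stores it

    def export(self, directory):
        # Only a directory that is local to this node can be exported.
        if directory not in self.local_dirs:
            raise ValueError(f"{directory} is not local to {self.name}")
        self.exported.add(directory)

    def mount(self, server, directory):
        # A client may mount only what the server has exported.
        if directory not in server.exported:
            raise PermissionError(f"{server.name} does not export {directory}")
        self.mounts[directory] = server

    def namespace(self):
        # Combined namespace: local directories plus mounted remote ones.
        return sorted(self.local_dirs | set(self.mounts))

    def resolve(self, directory):
        # Name of the node whose file system resolves a reference to the directory.
        if directory in self.local_dirs:
            return self.name
        return self.mounts[directory].name


node_200 = Node("node 200", ["X", "Y", "Z"])
node_204 = Node("node 204", ["alpha", "beta", "gamma"])
for d in ("alpha", "beta", "gamma"):
    node_204.export(d)            # one export step per directory
    node_200.mount(node_204, d)   # one mount step per directory

print(node_200.namespace())
print(node_200.resolve("alpha"))
```

In this sketch, references to a remote directory are resolved by the node that stores it, and each shared directory requires its own export and mount step.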
One significant feature of distributed storage such as that illustrated in FIG. 2 is that all references to an object stored in directory α by a user at node 200 (i.e., through combined file system namespace 208) are resolved by the file system local to and executing on node 204. That is, the translation of an object's reference to the physical location of that object is performed by the file system executing on node 204. Another significant feature of current distributed file systems such as NFS® is that the processes of exporting and importing must be performed for each new directory to be shared. Yet another significant feature of current distributed file systems is that shared storage (e.g., mount points α, β, and γ) appears as discrete volumes or nodes in the file system namespace. In other words, an exported file system (or part thereof) appears as a discrete object of storage in the namespace of each importing node. Thus, system namespace is fragmented across multiple storage nodes. To export a single directory from a node to all other nodes in a computer network, not only must the exporting node's map of objects (or its equivalent) be updated to specify the directory being exported, but every node wanting to import that directory must have its map of objects updated. This may happen frequently as, for example, when additional storage is added via a new storage node being attached to the network, and requires significant administrative overhead for each such occurrence.
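The administrative overhead described above can be made concrete with a short worked example. The function name and the counting model are assumptions made for illustration: one map update on the exporting node plus one on each importing node, for each shared directory, with every other node in the network importing it.

```python
def map_updates_to_share(directory_count, node_count):
    """Map updates needed to share directories from one node with all other nodes.

    Assumed model (not stated in the text as a formula): each directory costs
    one update on the exporting node and one on each of the other nodes.
    """
    if node_count < 2:
        raise ValueError("sharing requires at least two nodes")
    importing_nodes = node_count - 1
    return directory_count * (1 + importing_nodes)


# Sharing one directory across a 10-node network.
single = map_updates_to_share(1, 10)
# A new storage node bringing three directories into the same network.
new_storage_node = map_updates_to_share(3, 10)
print(single, new_storage_node)
```

Under this model the number of map updates grows with the product of shared directories and nodes, which is the overhead a unified namespace would aim to avoid.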
Thus, it would be beneficial to provide a distributed storage mechanism that reduces administrative overhead associated with sharing memory and unifies the shared system namespace.
In one embodiment the invention provides a method to manage storage of an object in a computer system having a first and a second storage management process, wherein the stored object includes a data portion, a metadata portion and a fault tolerance data portion. The method includes receiving a memory access request from a client process, routing the memory access request to the first storage management process, determining that the first storage management process has failed, routing the memory access request to the second storage management process (the second storage management process having access to the fault tolerance data portion), receiving a result from the second storage management process, and returning at least a portion of the result to the client process. The method may also include reconstructing at least a portion of the metadata portion, identifying the fault tolerance data portion based on the reconstructed portion of the metadata portion, modifying the fault tolerance data portion in accordance with the memory access request, and storing the modified fault tolerance data. Additionally, a record (journal) of the changes made to the fault tolerance data portion may be maintained by the second storage management process and transmitted to the first storage management process when it becomes operational. Methods in accordance with the invention may be stored in any medium that is readable and executable by a computer system.
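The routing and journaling steps of this method can be sketched as follows. The class names, the `route` function, and the use of `ConnectionError` to signal failure are hypothetical. The sketch also assumes, purely for simplicity, that the fault tolerance data is a plain copy of the object data (the text does not specify its form), and it omits the metadata reconstruction step.

```python
class PrimaryManager:
    """Hypothetical first storage management process; holds the data portion."""

    def __init__(self, data):
        self.operational = True
        self.data = dict(data)

    def handle(self, op, key, value=None):
        if not self.operational:
            raise ConnectionError("first storage management process has failed")
        if op == "write":
            self.data[key] = value
        return self.data.get(key)

    def replay(self, journal):
        # Apply changes recorded by the proxy while this process was down.
        for key, value in journal:
            self.data[key] = value


class ProxyManager:
    """Hypothetical second storage management process; holds fault tolerance data."""

    def __init__(self, fault_tolerance_data):
        # Assumption: the fault tolerance data is modeled as a simple copy.
        self.ft_data = dict(fault_tolerance_data)
        self.journal = []

    def handle(self, op, key, value=None):
        if op == "write":
            self.ft_data[key] = value          # modify and store fault tolerance data
            self.journal.append((key, value))  # record the change
        return self.ft_data.get(key)

    def hand_back(self, primary):
        # Transmit the journal once the first process is operational again.
        primary.replay(self.journal)
        self.journal.clear()


def route(primary, proxy, op, key, value=None):
    """Route a request to the primary; fall back to the proxy if it has failed."""
    try:
        return primary.handle(op, key, value)
    except ConnectionError:
        return proxy.handle(op, key, value)


primary = PrimaryManager({"obj": "v1"})
proxy = ProxyManager({"obj": "v1"})

primary.operational = False                              # simulate a failure
written = route(primary, proxy, "write", "obj", "v2")    # served by the proxy
read_back = route(primary, proxy, "read", "obj")

primary.operational = True                               # primary recovers
proxy.hand_back(primary)
print(written, read_back, primary.data["obj"])
```

The client sees the same interface whether the first or the second process serves the request, and the journal lets the recovered process catch up without replaying the client's requests.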