Conventionally, files stored in a given file server must be retrieved from the same file server. In a massively scalable system with a very large number of file servers, whenever a given file server runs out of space or runs out of processing resources, a portion of the file data and metadata must be explicitly migrated to another file server and the remote nodes must be explicitly reconfigured to observe this change.
Looking first at FIG. 1, a conventional implementation of Network Attached Storage (NAS) 100 is illustrated. In NAS 100, network protocol clients such as, without limitation, a Network File System (NFS) client 102, a Common Internet File System (CIFS) client 104, a Hypertext Transfer Protocol (HTTP) client 106, and a File Transfer Protocol (FTP) client 108 are connected through an access network 110 to a plurality of file servers 112a, 112b, and 112c. Each file server 112 is connected to a dedicated storage array 114, and each storage array 114 services a dedicated disk 116. That is, file server 112a is connected to a storage array 114a, which in turn is connected to a disk 116a. In an alternate embodiment, a network administrator may reconfigure the network such that file server 112a is connected to storage array 114b, file server 112b is connected to storage array 114c, and file server 112c is connected to storage array 114a. The characteristic of this architecture is that any reconfiguration of the network requires the intervention of the network administrator.
Looking now at FIG. 2, a conventional Storage-Area Network (SAN) 200 is illustrated. In SAN 200, network protocol clients such as, without limitation, a Network File System (NFS) client 202, a Common Internet File System (CIFS) client 204, a Hypertext Transfer Protocol (HTTP) client 206, and a File Transfer Protocol (FTP) client 208 are connected through an access network 210 to a plurality of file servers 212a, 212b, and 212c. Each file server 212 communicates with a storage array using a block level protocol, and each file server 212 is assigned to one or more disk volumes 216. For example and without limitation, file server 212a can be assigned to a disk volume 216a1, file server 212b can be assigned to disk volumes 216a2 and 216c1, file server 212c can be assigned to all of 216b, and disk volume 216c2 can be an unassigned, spare disk volume available for later assignment. Although the file servers of a SAN can be fully connected to all the disk volumes, that is, a file server could access any disk volume on the storage-area network, each file server can use only the disk volumes assigned to it and must not directly use disk volumes assigned to other file servers. The characteristic of this architecture is that the disk resources are assigned logically to a file server rather than physically. However, once resources are assigned, another file server cannot use those resources until a formal reassignment occurs. No effort has been made to extend the conventional approach to file servers, dedicated “filers” and hierarchical mass storage systems in a manner that is distinctively different from existing cluster-based file storage solutions exploiting Storage Area Networks (SAN).
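By way of illustration only, the logical assignment rule described for SAN 200 may be sketched as follows. This is a minimal model under stated assumptions, not an implementation of any actual SAN product: the class name SanAssignments, its methods, and the exception type are hypothetical, and the volume and server labels simply reuse the reference numerals of FIG. 2.

```python
class VolumeAssignmentError(Exception):
    """Raised when a volume is used or assigned against the assignment rule."""


class SanAssignments:
    """Hypothetical model of logical disk-volume assignment in a SAN."""

    def __init__(self, volumes):
        # Every volume starts as an unassigned spare (owner is None).
        self._owner = {volume: None for volume in volumes}

    def assign(self, volume, server):
        # A volume that is already assigned must first be released;
        # this models the "formal reassignment" requirement.
        if self._owner[volume] is not None:
            raise VolumeAssignmentError(
                f"{volume} is already assigned to {self._owner[volume]}")
        self._owner[volume] = server

    def release(self, volume):
        # Return the volume to the pool of spares.
        self._owner[volume] = None

    def check_access(self, server, volume):
        # All servers can physically reach all volumes, but a server
        # may use only the volumes logically assigned to it.
        if self._owner[volume] != server:
            raise VolumeAssignmentError(
                f"{server} may not use {volume}")
        return True

    def spares(self):
        return sorted(v for v, owner in self._owner.items() if owner is None)


# Assignments taken from the example given for FIG. 2.
san = SanAssignments(["216a1", "216a2", "216b", "216c1", "216c2"])
san.assign("216a1", "212a")
san.assign("216a2", "212b")
san.assign("216c1", "212b")
san.assign("216b", "212c")
```

In this sketch, moving disk volume 216a1 from file server 212a to another file server requires an explicit release followed by a new assign, which mirrors the formal reassignment step of the conventional architecture.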
In these traditional approaches to a file storage system built of multiple file servers, each file server “owns” a part of a global file system (i.e., a part of the file system namespace and the metadata of all the files belonging to this part of the namespace). Thus, a file stored on a given file server can be accessed later only through this particular file server. Although, in the case of hierarchical storage systems, the file servers may share a physical file data repository (e.g., a tape or optical disk jukebox), a file can be accessed (in a read-write mode) only through the file server that keeps the file's entry in the file system namespace and metadata (file attributes).
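By way of illustration only, the ownership constraint described above may be sketched as a lookup from namespace prefixes to file servers. The class PartitionedNamespace, its methods, and the directory prefixes are hypothetical and chosen purely for this example; the server labels reuse the reference numerals of FIG. 1.

```python
import posixpath


class PartitionedNamespace:
    """Hypothetical model of a global namespace split among file servers."""

    def __init__(self, owners):
        # owners maps a namespace prefix to the file server that owns it.
        self._owners = dict(owners)

    def owner_of(self, path):
        # Resolve the owning server by longest matching prefix.
        path = posixpath.normpath(path)
        best = None
        for prefix in self._owners:
            inside = path == prefix or path.startswith(prefix.rstrip("/") + "/")
            if inside and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise KeyError(path)
        return self._owners[best]

    def open_via(self, server, path):
        # A file is reachable only through the server that holds its
        # namespace entry and metadata; any other server is refused.
        owner = self.owner_of(path)
        if server != owner:
            raise PermissionError(
                f"{path} must be accessed through {owner}, not {server}")
        return owner


# Assumed partitioning; the prefixes are illustrative only.
namespace = PartitionedNamespace({
    "/home": "112a",
    "/projects": "112b",
    "/archive": "112c",
})
```

Under this model, a request for a file under /home that arrives at file server 112b is refused even if both servers share the same physical repository, which is the limitation the conventional approaches impose.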
SAN-based cluster file systems, on the other hand, may enable sharing of block-oriented devices between cluster nodes. However, this functionality depends on specific support built into the storage devices, such as SCSI locks. Thus, a SAN-based cluster file system solution is limited by its dependency on additional functionality being built into the storage device.