§1.1 Field of the Invention
The present invention concerns computer storage and file systems. More specifically, the present invention concerns techniques for managing and using a distributed storage system.
§1.2 Related Art
Data generated by, and for use by, computers is stored in file systems. The design of file systems has evolved in the last two decades, basically from a server-centric model (which can be thought of as a local file system), to a storage-centric model (which can be thought of as a networked file system).
Stand-alone personal computers exemplify a server-centric model: storage has resided on the personal computer itself, initially using hard disk storage, and more recently, optical storage. As local area networks (“LANs”) became popular, networked computers could store and share data on a so-called file server on the LAN. Storage associated with a given file server is commonly referred to as server attached storage (“SAS”). Storage could be increased by adding disk space to a file server. Unfortunately, however, SASs are only expandable internally; there is no transparent data sharing between file servers. Further, with SASs, throughput is limited by the speed of the fixed number of busses internal to the file server. Accordingly, SASs also exemplify a server-centric model.
As networks became more common, and as network speed and reliability increased, network attached storage (“NAS”) has become popular. NASs are easy to install and each NAS, individually, is relatively easy to maintain. In a NAS, a file system on the server is accessible from a client via a network file system protocol like NFS or CIFS. Network file systems like NFS and CIFS are layered protocols that allow a client to request a particular file from a pre-designated server. The client's operating system translates a file access request to the NFS or CIFS format and forwards it to the server. The server processes the request and in turn translates it to a local file system call that accesses the information on magnetic disks or other storage media. The disadvantage of this technology is that a file system cannot expand beyond the limits of a single NAS machine. Consequently, administering and maintaining more than a few NAS units, and therefore more than a few file systems, becomes difficult. Thus, in this regard, NASs can be thought of as a server-centric file system model.
Storage area networks (“SANs”) (and clustered file systems) exemplify a storage-centric file system model. SANs provide a simple technology for managing a cluster or group of disk-storage units, effectively pooling such units. SANs use a front-end system, which can be a NAS or a traditional server. SANs (i) are easy to expand, (ii) permit centralized management and administration of the pool of disk storage units, and (iii) allow the pool of disk storage units to be shared among a set of front-end server systems. Moreover, SANs enable various data protection/availability functions, such as multi-unit mirroring with failover. Unfortunately, however, SANs are expensive. Although they permit space to be shared among front-end server systems, they do not permit multiple SAN environments to use the same file system. Thus, although SANs pool storage, they basically behave as a server-centric file system. That is, a SAN behaves like a fancy disk drive (e.g., one with advanced data protection and availability functions) on a system. Finally, various incompatible versions of SANs have emerged.
The article, T. E. Anderson et al., “Serverless Network File Systems,” Proc. 15th ACM Symposium on Operating System Principles, pp. 109-126 (1995) (hereafter referred to as “the Berkeley paper”) discusses a data-centric distributed file system. In the system, manager maps, which map a file to a manager for controlling the file, are globally managed and maintained. Unfortunately, the present inventors believe that maintaining and storing a map having an entry for every file could limit scalability of the system as the number of files becomes large.
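The scalability concern stated above can be illustrated with a minimal sketch. It assumes, purely for illustration, a globally maintained map holding one entry per file; the class `GlobalManagerMap`, the helper `build_map`, the round-robin assignment, and the `/data/file{i}` paths are all hypothetical and are not taken from the Berkeley paper or from any actual implementation.

```python
from typing import Dict


class GlobalManagerMap:
    """Hypothetical per-file map: one entry for every file in the system."""

    def __init__(self, num_managers: int) -> None:
        self.num_managers = num_managers
        self._entries: Dict[str, int] = {}

    def register(self, path: str) -> int:
        """Record which manager controls `path` and return its index."""
        if path in self._entries:
            return self._entries[path]
        # Round-robin assignment is an assumption made only for this sketch.
        manager = len(self._entries) % self.num_managers
        self._entries[path] = manager
        return manager

    def manager_for(self, path: str) -> int:
        """Look up the manager that controls `path`."""
        return self._entries[path]

    def entry_count(self) -> int:
        """Number of entries that must be stored and kept consistent."""
        return len(self._entries)


def build_map(num_files: int, num_managers: int = 4) -> GlobalManagerMap:
    """Populate a map with `num_files` hypothetical file paths."""
    gmap = GlobalManagerMap(num_managers)
    for i in range(num_files):
        gmap.register(f"/data/file{i}")
    return gmap
```

In this sketch the number of map entries always equals the number of files, so the state that must be stored and maintained globally grows linearly with the file count.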
§1.3 Unmet Needs
In view of the foregoing disadvantages of known storage technologies, such as the server-centric and storage-centric models described above, there is a need for a new storage technology that (i) permits storage capacity to be added easily (as is the case with NASs), (ii) permits file systems to be expanded beyond a given unit (as is the case with SANs), (iii) is easy to administer and manage, (iv) permits data sharing, and (v) is able to perform effectively with very large storage capacity and client loads.