The invention pertains to the field of data storage.
In the field of data storage, so-called “distributed” file systems are known in which storage resources on file servers are made available to remote host computers via a communications network such as a TCP/IP network. Well-known examples of such file systems include Distributed File System (DFS) and Network File System (NFS). Storage resources in the form of volumes or directories are made available to host computers by a “mount” operation that creates an association between a local identifier (such as the well-known drive letter identifiers of Windows® operating systems) and a network identifier (typically a pairing of a host name and a volume or directory name). Host references to the local identifier are forwarded to a local client component of the distributed file system, which engages in a specialized protocol with the file server to perform the requested storage operation on the remote storage resource.
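By way of illustration only, the association created by a mount operation can be sketched as a small lookup table. The class and method names (`MountTable`, `mount`, `resolve`), the host name `fileserver1`, and the directory `/exports/projects` are hypothetical and are not taken from DFS, NFS, or any particular operating system; the sketch assumes a drive-letter style local identifier.

```python
from typing import NamedTuple


class NetworkId(NamedTuple):
    """Network identifier: a pairing of a host name and a directory name."""
    host: str
    directory: str


class MountTable:
    """Hypothetical table associating local identifiers with network identifiers."""

    def __init__(self):
        self._mounts = {}

    def mount(self, local_id, host, directory):
        # The "mount" operation: associate a local identifier (e.g. "Z")
        # with a (host, directory) pair on a remote file server.
        self._mounts[local_id] = NetworkId(host, directory)

    def resolve(self, local_path):
        # Split "Z:\reports\q1.txt" into the local identifier and the remainder.
        local_id, sep, rest = local_path.partition(":")
        if not sep or local_id not in self._mounts:
            raise KeyError(f"no mount for {local_path!r}")
        target = self._mounts[local_id]
        # A real client would now speak the file system protocol to target.host;
        # here we only return where the request would be sent.
        return target.host, target.directory + rest.replace("\\", "/")


table = MountTable()
table.mount("Z", "fileserver1", "/exports/projects")
host, remote_path = table.resolve("Z:\\reports\\q1.txt")
```

In this sketch the host name remains part of every resolved reference, which is the property of traditional distributed file systems that the single namespace discussed next is intended to remove.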
More recently, “global” distributed file systems have been created which are characterized by a single “namespace” for all storage resources in a network. In traditional distributed file systems, it is necessary to identify the host computer on which the storage resource resides, and thus the namespace includes both host identifiers and volume/directory/file identifiers. From the perspective of many application programs that utilize network storage, the need to identify a physical host computer as part of accessing data may be undesirable. Global file systems address this drawback of traditional distributed file systems by creating a single namespace for all resources in a given network, so that it is unnecessary to explicitly identify host computers as part of accessing stored data.
Global file systems have suffered performance issues arising from centralized management of so-called “metadata”, which is the data that identifies where all the user data is stored in the network and the mapping of the single namespace to identifiers of host computers and storage devices on which the user data is stored. Thus a further recent development in the field has been a “segmented” distributed file system in which the metadata itself is managed in a distributed rather than a centralized fashion. In one commercially available segmented file system, the totality of a single virtual storage space is divided into numbered segments, in a manner somewhat akin to the use of area codes in the telephone system. Also like the telephone system, storage requests are routed among servers based on locally stored subsets of the metadata. Thus if a request is directed to segment 205, for example, and first encounters a server that is responsible for segments 100-199, that server consults its local metadata to identify another server to which the request should be routed. The other server may be the server responsible for the requested segment, or it may be a server that is along a path to the responsible server.
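The routing behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any commercial product: the `Server` class, the `route` function, the server names A, B, and C, the segment ranges assigned to B and C, and the hop limit are all hypothetical. Each server holds only a local subset of the metadata, namely the segments it is responsible for and a next hop for some other segment ranges.

```python
class Server:
    """Hypothetical server holding a local subset of the routing metadata."""

    def __init__(self, name, owned, routes):
        self.name = name
        self.owned = owned      # range of segment numbers this server is responsible for
        self.routes = routes    # list of (segment range, name of next-hop server)

    def next_hop(self, segment):
        # None means this server is itself responsible for the segment.
        if segment in self.owned:
            return None
        for segment_range, hop in self.routes:
            if segment in segment_range:
                return hop
        raise LookupError(f"{self.name} has no route for segment {segment}")


def route(servers, start, segment, max_hops=16):
    """Follow next hops from `start` until the responsible server is reached."""
    path = [start]
    current = servers[start]
    for _ in range(max_hops):
        hop = current.next_hop(segment)
        if hop is None:
            return path
        path.append(hop)
        current = servers[hop]
    raise RuntimeError("hop limit exceeded; routing metadata may contain a loop")


servers = {
    # A is responsible for segments 100-199 and knows only that 200-399 lie toward B.
    "A": Server("A", range(100, 200), [(range(200, 400), "B")]),
    "B": Server("B", range(200, 300), [(range(100, 200), "A"), (range(300, 400), "C")]),
    "C": Server("C", range(300, 400), [(range(100, 300), "B")]),
}

direct = route(servers, "A", 205)     # B is the responsible server
via_path = route(servers, "A", 350)   # B is merely along the path to C
```

The second call shows the case in which the server consulted first forwards the request to a server that is not itself responsible for the segment but lies along a path to the responsible server.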
Distributed file systems in general provide relatively good scaling as demand for storage resources in a network grows. However, the management of the namespace and access to the metadata can become difficult in larger networks. Segmented file systems provide even better scaling by their use of a single namespace and distributed metadata management. Storage resources in any of various forms, ranging from high-end integrated cached disk arrays (ICDAs) through mid-range systems to low-end systems having only a few disk drives and fitting within a single shelf of an equipment rack, can easily be added to a system by relatively simple configuration operations.
There is also a movement within the field of data storage toward so-called information lifecycle management or ILM, which involves classifying data according to its use and then assigning the data to one of several different types of storage devices to achieve a desired cost/performance objective. As an example, an ILM system may classify data according to how frequently it is accessed, how delay-sensitive the associated application is, the need for protection in the form of redundancy, etc. The classification of data can change over its lifetime, and as its classification changes the ILM system should automatically move the data to a more appropriate form of storage. Data that has been recently created and that is in high demand, for example, may be deployed on a relatively expensive high-performance storage system, whereas the same data at a later time experiencing much less frequent access may be moved to a mid-range or even low-end storage system. Archival storage can be employed to store data that has reached the end of its production lifetime. ILM systems allow system users to create various policies for how data is to be treated based on its classification.
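A policy of the kind just described might be sketched as follows. The `DataProfile` fields, the tier names, and the numeric thresholds are hypothetical values chosen only for illustration; an actual ILM system would let users define such policies and could weigh further attributes such as the need for redundancy.

```python
from dataclasses import dataclass


@dataclass
class DataProfile:
    """Hypothetical classification inputs for one piece of data."""
    accesses_per_day: float
    delay_sensitive: bool
    in_production: bool = True   # False once the data reaches end of production life


def choose_tier(profile, hot_threshold=100.0, warm_threshold=1.0):
    """Map a classification to a storage tier (thresholds are assumed values)."""
    if not profile.in_production:
        return "archive"
    if profile.delay_sensitive or profile.accesses_per_day >= hot_threshold:
        return "high-end"
    if profile.accesses_per_day >= warm_threshold:
        return "mid-range"
    return "low-end"


# The same data re-classified over its lifetime moves between tiers.
new_data = choose_tier(DataProfile(accesses_per_day=500, delay_sensitive=False))
aging_data = choose_tier(DataProfile(accesses_per_day=5, delay_sensitive=False))
retired_data = choose_tier(DataProfile(accesses_per_day=0, delay_sensitive=False,
                                       in_production=False))
```

Re-evaluating `choose_tier` whenever the classification inputs change gives the tier to which the data should be moved next.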