The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Distributed File System (DFS) is a mechanism in the MICROSOFT WINDOWS® family of operating systems that provides a uniform naming convention and mapping for collections of servers, shares, directories, and files. As referred to herein, a share is a shared directory that is made accessible, or exported, by one entity in a network for other entities in the network. A directory may include zero or more directory entries, which directory entries may be subdirectories or files.
DFS organizes distributed file resources, such as exported shares and directories, into a logical hierarchy that is also referred to as a DFS logical name space. The DFS logical name space provides a mapping between a logical pathname and the physical location that actually stores the file or directory associated with that logical pathname. The same files and directories may be replicated and stored in multiple different and/or separate physical locations. The replicated files and directories are called replicas. The DFS mechanism in itself does not provide for synchronizing the different replicas of the same files and directories. Instead, a separate content synchronization mechanism, such as the File Replication Service (FRS) provided in the MICROSOFT WINDOWS® family of operating systems, must be used to synchronize the contents of the different replicas.
The DFS logical name space is maintained in a DFS root node. A DFS root is a local share that serves as the starting point and host to other shares. Any shared data resource, such as a file or a directory, can be published in the DFS root node as a share listed under the DFS root. The files and directories published in a DFS root node may be hosted by any network entity that is accessible by other network entities over a network. The entries in the DFS root node include, but are not limited to, directories from a local network host, directories exported by any network server or client, and directories accessible through any network client software, such as, for example, a Common Internet File System (CIFS) client and a Server Message Block (SMB) client.
The mapping of file servers in the DFS logical name space may be dynamically changed by the DFS server. For example, the logical path “\\DFSroot\Projects” may be mapped in the DFS logical name space to physical location “\\server1\ProjA”. The DFS server managing the DFS logical name space may then transparently change this physical location to “\\server2\ProjB”. This technique may be used in products that provide file virtualization that allows dynamic and frequent changes to the physical location of files between file servers, while at the same time keeping the DFS logical paths unchanged.
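The remapping described above can be illustrated with a minimal sketch. It is a toy in-memory model, not the DFS implementation: the class `DfsNamespace` and its methods `publish`, `retarget`, and `targets` are hypothetical names introduced only for illustration, and the case-insensitive lookup is an assumption.

```python
class DfsNamespace:
    """Toy model of a DFS logical name space (illustrative only)."""

    def __init__(self):
        # logical path (lower-cased) -> list of physical target locations
        self._links = {}

    def publish(self, logical_path, target):
        """Register a physical target under a logical path."""
        self._links.setdefault(logical_path.lower(), []).append(target)

    def retarget(self, logical_path, old_target, new_target):
        """Swap one physical target for another; the logical path is unchanged."""
        targets = self._links[logical_path.lower()]
        targets[targets.index(old_target)] = new_target

    def targets(self, logical_path):
        """Return a copy of the physical targets for a logical path."""
        return list(self._links.get(logical_path.lower(), []))


ns = DfsNamespace()
ns.publish(r"\\DFSroot\Projects", r"\\server1\ProjA")

# The DFS server moves the data; clients keep using the same logical path.
ns.retarget(r"\\DFSroot\Projects", r"\\server1\ProjA", r"\\server2\ProjB")
print(ns.targets(r"\\DFSroot\Projects"))
```

In this sketch the logical path acts as a stable key, so only the table held by the DFS server changes when data moves between file servers.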
A particular file or directory published in a DFS root node as a logical share may be associated with one or more target replicas, where each of the replicas is a copy of the particular file or directory and is physically stored in a directory exported by a server. The DFS root node keeps information that associates the logical name of each replica with its corresponding physical location (e.g. a server and a directory). Since the DFS root node maps a logical representation into physical storage, the physical location of files and directories becomes transparent to users and applications.
For example, suppose that a client wishes to access a file that is located in a share or directory on the network. The client sends a request to the DFS root node to resolve the path to the file. The DFS root node analyzes the path requested by the client, and sends back a list of one or more DFS referrals. Each DFS referral in the list specifies the actual server name, share name and/or directory where the actual file or a replica of the file resides. The client then selects one of the servers specified in the DFS referrals, and proceeds to access the file.
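The referral exchange described above might be sketched as follows. This is a simplified model under stated assumptions: the `Referral` type, the `resolve` function, the `ROOT_TABLE` contents (including the second server, `server3`), and the longest-prefix matching rule are all hypothetical and are not taken from the DFS protocol itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Referral:
    """One DFS referral: a server and share holding the file or a replica."""
    server: str
    share: str

    def unc(self, remainder=""):
        """Build the physical path for this referral plus the unresolved remainder."""
        base = f"\\\\{self.server}\\{self.share}"
        return base + ("\\" + remainder if remainder else "")


# Hypothetical root-node table: logical path (lower-cased) -> referrals.
ROOT_TABLE = {
    r"\\dfsroot\projects": [
        Referral("server1", "ProjA"),
        Referral("server3", "ProjA"),  # assumed second replica
    ],
}


def resolve(root_table, requested_path):
    """Return (referrals, remainder) for the longest matching logical prefix."""
    parts = requested_path.strip("\\").split("\\")
    for end in range(len(parts), 0, -1):
        prefix = "\\\\" + "\\".join(parts[:end])
        referrals = root_table.get(prefix.lower())
        if referrals:
            return referrals, "\\".join(parts[end:])
    return [], ""


# Client side: ask the root node, then pick one of the returned servers.
referrals, rest = resolve(ROOT_TABLE, r"\\DFSroot\Projects\specs\design.doc")
chosen = referrals[0]  # selection policy is up to the client; first entry here
print(chosen.unc(rest))
```

The client here simply takes the first referral; a real client could instead choose among the referrals by other criteria, such as network proximity.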
The DFS mechanism has several disadvantages. One disadvantage is that DFS itself does not provide for synchronizing the content of replicated files and directories. DFS must rely on additional protocols and services to synchronize the replicas registered in a DFS root node. This may present serious data integrity and data consistency problems when heavily modified local replicas must be synchronized with heavily accessed remote replicas over low-bandwidth and/or high-latency communication links.
Another disadvantage is that an administrator must manually register each file or directory in the DFS root node. For example, in an organization of even a modest size that has multiple servers and that shares numerous files and directories among its users, the time and costs required for the manual registration of these files and directories in a DFS root node may be significant. Further, since a DFS namespace is managed manually, adding and replacing servers as part of ongoing maintenance may result in stale entries in the DFS referral lists, in new servers not being integrated properly, and in similar problems.
In another example, suppose that an organization has multiple geographically dispersed branch locations. Further suppose that the main branch of the organization has multiple master file servers, which store shared files and directories that are registered in one or more DFS root nodes. In order to provide the users at each branch with local access to the shared files and directories, the organization may want to add one or more new file servers in each of its remote branches to store replicas of the shared files and directories. In order to make use of DFS, each replica of each shared file or directory must be manually registered in a DFS root for each of the new file servers in each branch. However, the time and costs required for the manual registration of each replica in each DFS root node grow multiplicatively with the number of files, directories, replicas, branches, and file servers, and can quickly reach prohibitive levels. Moreover, in order to provide better sharing of information, an organization may want to introduce a “caching” solution, which provides for caching files at “caching” file servers in each branch. Integrating such a caching solution with an existing DFS deployment to accelerate access to the file servers may be a key requirement of the organization. However, current approaches to integrating caching solutions with DFS require adding the “caching” file servers to the existing DFS roots manually, thus complicating the already complex task of introducing the caching solution in the first place.
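To give a rough sense of scale for the branch example above, the following back-of-the-envelope sketch counts manual registrations. It assumes, purely for illustration, that every shared item is replicated on every new branch file server and that each replica is registered once per DFS root; the function name `manual_registrations` and the sample figures are hypothetical.

```python
def manual_registrations(shared_items, branches, servers_per_branch, roots=1):
    """Count manual DFS registrations under the stated assumptions.

    Each shared item gets one replica per new branch file server, and each
    replica is registered once in each DFS root.
    """
    return shared_items * branches * servers_per_branch * roots


# 500 shared directories, 20 branches, 2 new file servers per branch, 1 root
print(manual_registrations(500, 20, 2))  # prints 20000
```

Even with these modest sample figures, the count runs into the tens of thousands of manual steps, and adding a branch or a file server multiplies it further.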
Based on the foregoing, there is a clear need for a technique for providing transparent access to files and directories based on information in DFS root nodes without the need to manually register each and every replica of the files and directories in the DFS root nodes.