A Distributed File System (Dfs) is a network server component that locates and manages data on a network. Dfs may be used for uniting files on different computers into a single name space, thus allowing a user to build a single, hierarchical view of multiple file servers and file server shares on a network. In the context of a server computer or set of server computers, Dfs can be likened to a file system for hard disks in a personal computer system. For instance, just as a file system provides uniform named access to collections of sectors on disks, Dfs may provide a uniform naming convention and mapping for collections of servers, shares, and files. Thus, Dfs may organize file servers and their shares into a logical hierarchy which enables a large enterprise to manage and use its information resources more efficiently.
Furthermore, Dfs is not limited to a single file protocol and can support the mapping of servers, shares, and files, regardless of the file client being used, provided that the client supports the native server and share. Dfs may also provide name transparency to disparate server volumes and shares. Through Dfs, an administrator can build a single hierarchical file system whose contents are distributed throughout an organization's wide area network (WAN).
In the past, with the Universal Naming Convention (UNC), a user or application was required to specify the physical server and share in order to access file information. For example, a user or application had to specify \\Server\Share\Path\Filename. Even though UNCs can be used directly, a UNC is typically mapped to a drive letter, such as x:, which, in turn, may be mapped to \\Server\Share. From that point, a user was required to navigate beyond the redirected drive mapping to the data he or she wished to access. For example, the user had to enter copy x:\Path\More_path\. . . \Filename to reach a particular file.
As networks grow in size and as enterprises begin to use existing storage—both internally and externally—for purposes such as intranets, the mapping of a single drive letter to individual shares scales rather poorly. Further, although users can use UNC names directly, these users can be overwhelmed by the number of places where data may be stored.
Dfs solves these problems by permitting the linking of servers and shares into a simpler and more easily navigable name space. A Dfs volume permits shares to be hierarchically connected to other shares. Since Dfs maps the physical storage into a logical representation, the net benefit is that the physical location of any number of files becomes transparent to users and applications.
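The mapping from a logical name space to physical storage described above can be illustrated with a minimal sketch. The link table, the server and share names, and the function resolve are hypothetical and chosen only for illustration; the sketch assumes a simple longest-prefix match and is not a description of how any particular Dfs implementation resolves paths.

```python
# Hypothetical link table: logical Dfs path prefix -> physical UNC share.
# Every name below is invented for illustration.
DFS_LINKS = {
    r"\\Corp\Public\Docs": r"\\ServerA\Documents",
    r"\\Corp\Public\Tools": r"\\ServerB\Utilities",
}


def resolve(logical_path, links=DFS_LINKS):
    """Translate a logical path into a physical UNC path.

    Uses the longest link prefix that covers the path (case-insensitive),
    and raises LookupError when no link covers it.
    """
    lowered = logical_path.lower()
    best = None
    for prefix in links:
        p = prefix.lower()
        # A prefix matches only on a whole path component boundary.
        if lowered == p or lowered.startswith(p + "\\"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise LookupError(f"no Dfs link covers {logical_path!r}")
    # Keep the remainder of the path and swap in the physical share.
    return links[best] + logical_path[len(best):]
```

Under these assumptions, a user or application refers only to the logical path, and the physical server and share may change without the logical path changing.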
Furthermore, as a network grows to the level of a global network, several copies of the same file or files may be located in several different locations within the network to help reduce the costs (in terms of network time, network load, etc.) associated with retrieving a file from the network. For example, users of a large network located near a first server location will typically use a copy of a file on a server nearest to them (e.g., users in Seattle may be closest to a server named Redmond that is located near Seattle). Similarly, users of a large network located near a second server location will typically use a copy of a file on a different server nearest to them (e.g., users in Thailand may be closest to a server named Bangkok located in Bangkok). Thus, the site-cost (i.e., a scalar number which is a pseudo-arbitrary indication of a number of network parameters, including the distance between client and server, the degrees of server separation, and other physical network parameters) of retrieving a file may be minimized by accessing the nearest server having the requested file or files.
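As a brief illustration of minimizing site-cost, the following sketch selects the server with the lowest cost for a given client site. The cost values, the table layout, and the function nearest_server are assumptions made for this example; only the server names Redmond and Bangkok and the client locations come from the discussion above.

```python
# Hypothetical site-cost table: SITE_COST[client_site][server] -> scalar cost.
# The numbers are arbitrary; only their relative order matters here.
SITE_COST = {
    "Seattle": {"Redmond": 1, "Bangkok": 40},
    "Thailand": {"Redmond": 40, "Bangkok": 1},
}


def nearest_server(client_site, costs=SITE_COST):
    """Return the server with the lowest site-cost for the given client site.

    When two servers have the same cost, the one listed first is returned.
    """
    per_server = costs[client_site]
    return min(per_server, key=per_server.get)
```

With these assumed costs, a client in Seattle would be directed to Redmond and a client in Thailand to Bangkok.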
When a user wishes to retrieve a file from a Dfs, the client computer from which the user is requesting the file determines how to go about retrieving the requested file. A client computer may issue a referral request to obtain one or more locations for the requested file or files. A referral may be a relative path between the requesting client computer and a server computer in which the requested file or files may be found. A client computer may request the file or files known to be unavailable locally, and a determination may be made as to how many different locations may provide a copy of the requested file. Typically, there may be hundreds or even thousands of targets (i.e., the relative path to the file) indicating locations that may provide the requested file. As such, a referral response, which is returned to the client computer in response to the referral request, typically includes a list of targets corresponding to servers and/or shares having the requested file.
In the past, however, the referral response returned to the client computer may have listed the identified targets in a random order or, in some cases, by site-cost. Each target in the referral response did not necessarily bear any relationship to a target that immediately preceded it or immediately followed it. As a result, the client computer may have simply started at the top of the randomly-ordered list of targets and attempted to establish a connection with each successive target on the list until one responded with connectivity.
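The prior behavior described above can be sketched as follows. The function names build_referral and connect_first_available and the is_reachable callback are hypothetical, introduced only for illustration; the sketch assumes the referral is a plain list of target names and that reachability can be tested per target.

```python
import random


def build_referral(targets, rng=random):
    """Return a copy of the targets in random order.

    rng is any object with a shuffle method (the random module by default).
    """
    shuffled = list(targets)
    rng.shuffle(shuffled)
    return shuffled


def connect_first_available(referral, is_reachable):
    """Walk the referral from the top and return the first reachable target.

    Returns None if no target in the list responds.
    """
    for target in referral:
        if is_reachable(target):
            return target
    return None
```

Because the order is random, nothing in this sketch favors a nearby target over a distant one; whichever reachable target happens to appear first is selected.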
A problem with this randomness, however, is the fact that the first available target may, in fact, be literally located on the other side of the world. Thus, the site-cost of communicating with this first-available target may be rather high and undesirable in the long-term.
However, preserving continuity of a connection to a target is somewhat important. This is known as “sticking” or “stickiness.” Thus, once the first-available target that is able to fulfill the file request of the client computer is located, all future referrals and requests are typically also routed to that target unless the user of the client computer specifically requests a new referral. Therefore, the possibly high site-cost connection to the first-available target may remain indefinitely, causing additional network traffic and increasing overall network cost.
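Stickiness as described above can be illustrated with a minimal client-side sketch. The class StickyClient, its parameters, and the refresh flag are hypothetical names used only for this example; the sketch assumes that a referral is a list of target names and that a new referral is obtained only on the first request or when explicitly requested.

```python
class StickyClient:
    """Remembers the first reachable target and keeps using it."""

    def __init__(self, request_referral, is_reachable):
        # request_referral: callable returning a list of targets (assumed).
        # is_reachable: callable taking a target and returning True or False.
        self._request_referral = request_referral
        self._is_reachable = is_reachable
        self._target = None

    def target(self, refresh=False):
        """Return the sticky target, asking for a referral only when needed."""
        if self._target is None or refresh:
            self._target = None
            for candidate in self._request_referral():
                if self._is_reachable(candidate):
                    self._target = candidate
                    break
        return self._target
```

In this sketch, if the first reachable target in the initial referral happens to be a distant one, every later call to target() continues to return it, regardless of its site-cost, until a new referral is explicitly requested.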
Maintaining an inefficient referral between a client computer and a server computer in order to preserve continuity may thus result in high site-cost communication sessions. What is needed is a way to preserve continuity of referral connections while reducing the site-cost of the referral connection.