The present invention relates generally to improving the availability of information and resources despite impairments of networks or servers. Many organizations have multiple offices or locations, and multiple projects active simultaneously. Collections of servers interconnected by data networks allow distributed organizations to support multiple distinct but cooperating locations, sharing their project information via these servers. In a networked file system, for example, files used by applications in one location might be stored in another location. The same dependence on remote servers exists for other kinds of servers and services, such as e-mail, computation, multimedia, video conferencing, database querying, and office collaboration, in which the servers may be handling data such as web pages, text, database tables, images, video, audio, dynamic computations, applications, and services.
In a multi-location organization, a common arrangement is for each project to be assigned to a single location. However, such an assignment does not mean that the project is worked on only by people in that location. Rather, some persons working in other locations are also expected to contribute to that project. Typically, this arrangement is implemented by maintaining a file server at each location. Each location's file server contains the files related to every project assigned to that location. In general, any file or group of files can have a logical “home” in a single location, meaning that the file or group of files is stored at that location's file server. In addition, it is common to have a system such as Microsoft DFS, which enables a mapping from a logical name for a group of files to a server or group of servers storing that group of files. Additionally, file sharing systems enable users at a given location to access files stored by file servers at other locations.
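The mapping from a logical name to the servers storing a group of files can be illustrated with a minimal sketch. It is not the Microsoft DFS interface; the table contents, the server names, and the `resolve` function are hypothetical, and longest-prefix matching is an assumed lookup rule chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical namespace table: logical prefix -> servers storing that group of files.
NAMESPACE: Dict[str, List[str]] = {
    "/projects/alpha": ["fs-london"],
    "/projects/beta": ["fs-tokyo", "fs-tokyo-backup"],
}


def resolve(logical_path: str) -> Tuple[List[str], str]:
    """Return (servers, path relative to the matched prefix).

    Uses longest-prefix matching on whole path components, so
    "/projects/alphabet" does not match "/projects/alpha".
    """
    best = None
    for prefix in NAMESPACE:
        if logical_path == prefix or logical_path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise KeyError(f"no mapping for {logical_path}")
    return NAMESPACE[best], logical_path[len(best):].lstrip("/")


servers, relative = resolve("/projects/beta/spec.txt")
print(servers, relative)
```

In this sketch a client only ever names the logical path; which location's file server is the "home" of the files is decided entirely by the table.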
With ideal, well-behaved communication networks and file servers, users at each location can contribute to the organization's work on any project. Each user can access their local file server or a file server at a remote location at any time, to read, write, or update files. In this ideal arrangement, there only needs to be a single copy of each file that is read or written by the various users sharing information via that file. The current state of each file is completely and accurately represented by the information in the single file copy.
Unfortunately, real communication networks interconnecting locations are often less reliable, more expensive, and/or have less bandwidth than the local-area networks connecting users to their local file servers. In addition, the reliability and availability of each location's file server may vary greatly. For example, some locations may have unreliable power or network connections. As another example, in a globally distributed organization, downtime required for preventive maintenance in one location's time zone may coincide with prime working hours in a remote location. These network and server problems are referred to generally as network impairments. During network impairments, users may continue to have access to data stored on their local file server, but remote users will have no access to this data.
One approach to overcoming network impairments is to replicate data on file servers at different locations. The replicated data may include multiple copies of files, groups of files, or parts of files. This data replication offers the opportunity for access to the replicated data at multiple locations despite network impairments. There have been many proposed systems for replicating data. However, all of these approaches have significant limitations and are often complicated to configure, to manage, and to use.
File caching systems can be used to replicate data from remote file servers. However, file caching systems often require modifications to work with existing applications and servers. A naming system, such as Microsoft DFS, can introduce a level of indirection that avoids requiring modification of clients or servers. Unfortunately, configuration of naming systems for such purposes is complicated and error-prone. Additionally, the failure of the naming system is an additional cause of network impairments.
Traditionally, some file caching systems do not allow for modification of replicated data. Instead, all modifications must be made to a single “master” version of the data. This ensures that the replicated data is consistent. Other file caching systems allow for modification of local copies of data, rather than a master copy, by introducing complex file leasing and locking controls. Examples of such systems are Cisco Systems WAFS and Tacit Networks IShared. In such systems, a user “leases” access to a copy of the data for a limited period. During this period, the user can modify this copy of the data without restriction. During the lease period, all other copies of the data on other file servers are “locked,” so that no other users can modify their copies of the data. Once the user's lease expires, the other copies of the data are updated to reflect any changes made by the user. The downsides of these systems include the added complexity and overhead of managing the leases and locks on data and the need to modify servers and applications to handle locked files properly. Additionally, network impairments can interfere with accessing and/or modifying data. Some systems allow multiple copies of data to be modified simultaneously, especially in the presence of network impairments; however, such an arrangement leads to additional complexity and potential errors when the network impairment ends and multiple differing copies of nominally identical data must be reconciled.
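The lease-and-lock behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the mechanism of any named product: the `LeaseCoordinator` class, its method names, and the single central coordinator are all hypothetical, and time is passed in explicitly as `now` so that the example is deterministic.

```python
from typing import Dict, Iterable, Optional


class LeaseCoordinator:
    """Hypothetical coordinator for one file replicated at several locations."""

    def __init__(self, locations: Iterable[str], initial: str = "") -> None:
        self.copies: Dict[str, str] = {loc: initial for loc in locations}
        self.holder: Optional[str] = None  # location currently holding the lease
        self.expires_at: float = 0.0

    def _expire(self, now: float) -> None:
        # When the lease runs out, update every copy from the holder's copy
        # and release the lease.
        if self.holder is not None and now >= self.expires_at:
            latest = self.copies[self.holder]
            for loc in self.copies:
                self.copies[loc] = latest
            self.holder = None

    def acquire(self, location: str, now: float, duration: float) -> bool:
        """Grant (or extend) a lease unless another location holds one."""
        self._expire(now)
        if self.holder not in (None, location):
            return False
        self.holder = location
        self.expires_at = now + duration
        return True

    def write(self, location: str, data: str, now: float) -> None:
        """Modify the local copy; copies without the lease are locked."""
        self._expire(now)
        if self.holder != location:
            raise PermissionError(f"copy at {location} is locked")
        self.copies[location] = data

    def read(self, location: str, now: float) -> str:
        self._expire(now)
        return self.copies[location]


coordinator = LeaseCoordinator(["london", "tokyo"], initial="v1")
coordinator.acquire("london", now=0, duration=10)
coordinator.write("london", "v2", now=1)
print(coordinator.read("tokyo", now=5))   # lease still active: other copy unchanged
print(coordinator.read("tokyo", now=10))  # lease expired: other copy updated
```

Even in this small sketch, every read and write must consult shared lease state, which suggests why such schemes add management overhead and remain exposed to network impairments between locations.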
Another approach to improving access to data is using pre-positioning content distribution systems, such as the service provided by Akamai or the Cisco ECDN or ACNS products. These systems allow the files to be moved out to multiple edge servers where they can be served efficiently. These systems also support forms of redirection based on DNS or HTTP so as to spread requests to multiple servers and tolerate a variety of server and network failures. However, these systems typically allow only read access to the replicated data and cannot support any kind of modification to the files that are distributed. Thus, they are unsuitable for collaboration applications in which multiple users create, read, and modify data.
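The redirection idea, spreading read requests across edge servers while skipping failed ones, can be sketched in a few lines. The `Redirector` class, the server names, and the round-robin policy are assumptions made for illustration; they do not describe how any particular content distribution product selects a server.

```python
from typing import Dict, Iterable


class Redirector:
    """Hypothetical round-robin redirector over read-only edge servers."""

    def __init__(self, edge_servers: Iterable[str]) -> None:
        self.healthy: Dict[str, bool] = {s: True for s in edge_servers}
        self._next = 0  # index at which the next search starts

    def mark(self, server: str, is_healthy: bool) -> None:
        """Record the result of an (assumed) external health check."""
        self.healthy[server] = is_healthy

    def pick(self) -> str:
        """Return the next healthy server in rotation, skipping failed ones."""
        servers = list(self.healthy)
        for i in range(len(servers)):
            candidate = servers[(self._next + i) % len(servers)]
            if self.healthy[candidate]:
                self._next = (self._next + i + 1) % len(servers)
                return candidate
        raise RuntimeError("no healthy edge server")


redirector = Redirector(["edge-a", "edge-b", "edge-c"])
redirector.mark("edge-b", False)
print([redirector.pick() for _ in range(4)])
```

The sketch only routes reads; there is no path by which a modification made at one edge server would reach the others, which matches the read-only limitation noted above.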
The effect of network impairments on data sharing arrangements is exacerbated by the tendency to move many file servers to a small number of data centers. This consolidation reduces the cost and complexity of managing the file servers, but increases the system's vulnerability to network impairments.
It is therefore desirable to have a data distribution system and method that replicates data efficiently and allows data to be accessed during network impairments with minimal disruption to users. It is further desirable that the system be simple to configure and manage. It is also desirable for the system to integrate with applications and servers without requiring modifications.