Replication, known as "mirroring" in Internet parlance, is a technique that is used to address a scalability problem of popular Internet sites. As a popular site experiences a high rate of requests for objects stored at the site, the site can become overburdened and slow to respond, or even crash. As used herein, the term "object" refers to a piece of information. An object is embodied as a "replica," e.g., a file that is stored at a host, or a program (e.g., executable) and associated files that produce a piece of information. Examples of replicas include a page at a web site, a graphic file, an audio file, a cgi-bin file, etc. A request for an object is answered by sending a copy of a replica to the requester.
To solve the scalability problem, replicas of the requested objects can be stored at several locations throughout the network, thereby spreading the load of sending copies of the replicas of requested objects to requesting users.
It is important to properly decide where to store the replicas, and how to allocate requests for objects among the sites at which the replicas are stored. Often, these two problems are related in that a placement strategy will have important implications for the request allocation strategy, and vice versa.
Certain known replication (mirroring) techniques are implemented manually by system administrators, who monitor the demand for information on their sites and decide what data should be replicated and where the replicas should be stored. This task becomes daunting when the number of objects that can be requested and possible storage sites for replicas of such objects become large. Such a situation can arise, for example, in networks that are used to provide hosting services. Generally, a hosting service maintains and providing access to objects belonging to third-party information providers. For example, a hosting service may provide the infrastructure numerous web sites whose content is provided by third parties.
As the scale of a hosting system increases (i.e., as the number of objects and hosting servers on which replicas of the objects are stored becomes larger), the decision space for replica placement increases. A brute-force, worst case design becomes prohibitively expensive, and the problem of mirroring becomes too large and complex to be effectively handled manually by system administrators. Without appropriate new technology, system administration related to replica placement may become a factor limiting the scale to which hosting platforms may efficiently increase. This new technology must be able to automatically and dynamically replicate Internet objects in response to changing demand.
Some known protocols allocate requests among hosts that store mirrored objects by collecting load reports from the hosts and weighing host loads into a network-topology-based request distribution scheme. This approach, implemented in the Local Director made by CISCO Systems of California, is not well suited for dynamic replication on a global scale. This is because the request re-direction subsystem is highly distributed, forcing each host to send its load report to a large number of redirecting servers. This disadvantageously increases network traffic and can function poorly if the load reports are delayed in reaching all of the request redirectors. Further, request distribution for a given object becomes dependent on the popularity of many other objects that are co-located at the same host. This renders request distribution effectively non-deterministic and unpredictable, greatly complicating autonomous replica placement decisions.
Other known commercial products offer transparent load balancing among multiple Internet sites. See CISCO Distributed Director White Paper, &lt;http://www.cisco.com/warp/public/734/distdir/dd_wp.htm&gt;; IBM Interactive Network Dispatcher,&lt;htttp://www.ics.raleigh.ibm.com/netdispatch/&gt;; Web Challenger White paper, WindDance Network Corporation,&lt;http://www.winddancenet.com/newhitepaper.html&gt;, 1997. These products differ in the network level where the redirection of requests to physical replicas occur: CISCO's Distributed Director performs re-direction at the DNS level. A similar idea is used in E. Katz, M. Butler, and R. McGrath, A Scalable Web Server: The NCSA Prototype, Computer Networks and ISDN Systems, 27, pp. 155-164, September 1994, May 1994. The IBM Net Dispatcher and CISCO's Local Director redirect requests at the front-end router level, while Winddance's Web Challenger does so at the application level using redirection features of the HyperText Transfer Protocol (HTTP). None of these products offer dynamic replication or migration of replicas.
Existing protocols for performance-motivated dynamic replication rely on assumptions that are unrealistic in the Internet context. Wolfson et al propose a ADR protocol that dynamically replicates objects to minimize communication costs due to reads and writes. O. Wolfson, A. Jajodia, and Y. Huang, An Adaptive Data Replication Algorithm, ACM Transactions on Database Systems (TODS), Vol. 22(4), June 1997, pp. 255-314. Most Internet objects are rarely written. Recent trace studies (e.g., S. Manly and M. Seltzer, Web Facts and Fantasy, in USENIX Symp. on Internet Technologies and Systems, pp. 125-134, 1997) consistently show that 90% of requests are to static objects, and many of the remaining objects are dynamically generated responses to read-only queries. Therefore, minimizing communication costs due to reads and writes is not a suitable cost metric for the Internet. In addition, the Wolfson protocol imposes logical tree structures on hosting servers and requires that requests travel along the edges of these trees. Because of a mismatch between the logical and physical topology of the Internet, and especially because each node on the way must interpret the request to collect statistics (which requires in practice a separate TCP connection between each pair of nodes), this would result in impractically high delays in request propagation.
Heddaya and Mirdad's WebWave dynamic replication protocol was proposed specifically for the World Wide Web on the Internet. A. Heddaya and S. Mirdad, WebWave: Globally Load Balanced Fully Distributed Caching of Hot Published Documents, in Proc. 17th IEEE Intl. Conf. on Distributed Computing Systems, May 1997. However, it burdens the Internet routers with the task of maintaining replica locations for Web objects and intercepting and interpreting requests for Web objects. It also assumes that each request arrives in a single packet. As the authors note, this protocol cannot be deployed in today's networks.
Algorithmically, both ADR and WebWave decide on replica placement based on the assumption that requests are always serviced by the closest replica. Therefore, neither protocol allows load sharing when a server is overloaded with requests from its local geographical area. Objects are replicated only between neighbor servers, which would result in high delays and overheads for creating distant replicas, a common case for mirroring on the Internet.
Also, ADR requires replica sets to be contiguous, making it expensive to maintain replicas in distant corners of a global network even if internal replicas maintain only control information.
The works of Bestavros (A. Bestavros, Demand-based Document Dissemination to Reduce Traffic and Balance Load in Distributed Information Systems, in Proc. of the IEEE Symp. on Parallel and Distr. Processing, pp. 338-345, 1995) and Bestavros and Cunha (A. Bestavros and C. Cunha, Server-initiated Document Dissemination for the WWW, Bulletin of the Computer Society technical Committee on Data Engineering, pp. 3-11. Vol. 19, No. 3, September 1996) appear to be the predecessors of WebWave. A. Bestavros, Demand-based Document Dissemination to Reduce Traffic and Balance Load in Distributed Information Systems, in Proc. of the IEEE Symp. on Parallel and Distr. Processing, pp. 338-345, 1995 proposes to reduce network traffic within an intranet by caching organization's popular objects close to the intranet's entry point. In a very large scale system, there would be many such entry points. Such a system would address the problems of choosing entry points at which to place object replicas and allocating requests to those replicas. These questions are not considered in A. Bestavros, Demand-based Document Dissemination to Reduce Traffic and Balance Load in Distributed Information Systems, in Proc. of the IEEE Symp. on Parallel and Distr. Processing, pp. 338-345, 1995. In A. Bestavros and C. Cunha, Server-Initiated Document Dissemination for the WWW, Bulletin of the Computer Society Technical Committee on Data Engineering, pp. 3-11, Vol. 19, No. 3, September 1996, Bestavros and Cunha discuss the benefits of replicating popular objects from the host server up the request tree, but no methods for doing so are described.
Baentsch et al (M. Baentsch, L. Baum, G. Molter. S. Rothkugel, and P. Sturm. Enhancing the Web's Infrastructure: From Caching to Replication, IEEE Internet Computing, Vol 1, No. 2, pp. 18-27, March/April, 1997) propose an infrastructure for performing replication on the Web, without describing methods for deciding on replica sets. Also, the infrastructure assumes gradual learning of the replica set by clients, which may hurt the responsiveness of the system. Gwertzman and Seltzer (J. Gwertzman and M. Seltzer. The Case for Geographical Push-Caching, Proc. Of the HotOS Workshop, 1994. Also available at &lt;ftp://das-ftp.harvard.edu/techreports/tr-34-94.ps.gz&gt; motivate the need for geographical proximity-based object replication. They propose to base replication decisions on the geographical distance (in miles) between clients and servers. This measure may not correctly reflect communication costs for fetching an object, since the network topology often does not correspond to the geographical distances.
The problem of placing objects in the proximity of requesting clients has also been addressed in research on file allocation (see .PHI. Kure, Optimization of File Migration in Distributed Systems, Ph.D Dissertation, University of California (Berkeley), 1988. Also available as Technical Report UCB/CSD 88/413, Computer Science Division (ECCS), University of California (Berkeley), April 1988 for an early survey; and B. Awerbuch, Y. Bartal, and A. Fiat. Competitive Distributed File Allocation, In Proc. Of the 25th ACM Symposium on Theory of Computing, pp. 39-50, 1992; B. Awerbuch, Y. Bartal, and A. Fiat, Distributed Paging for General Networks, In Proc. of the 7th ACM-SIAM Symposium on Discrete Algorthms, pp. 574-583, January, 1996; and Y. Bartal, A. Fiat, and Y. Rabani, Competitive Algorithms for Distributed Data Management, in Proc. Of the 24th ACM Symposium on Theory of Computing, pp. 39-50, 1992 for more recent work). Early work in this area assumes a central point where decisions on object placement are made by solving an integer programming optimization problem. Even when the search space is heuristically pruned, the scale of our application would make such approaches impractical. Also, this approach requires the decision-making point to have complete information on network topology, server loads, and demand patterns.
More recently, the problem of obtaining distributed solutions for file allocation has been addressed. See B. Awerbuch, Y. Bartal, and A. Fiat, Competitive Distributed File Allocation, in Proc. Of the 25th ACM Symposium on Theory of Computing, pp. 164-173, May, 1993; B. Awerbuch, Y. Bartal, and A. Fiat, Distributed Paging for General Networks, in Proc. Of the 7th ACM-SIAM Symposium on Discrete Algorithms, pp. 574-583, January, 1996; Y. Bartal, A. Fiat, and Y. Rabani, Competitive Algorithms for Distributed Data Management, in Proc. Of the 24th ACM symposium on Theory of Computing, pp. 39-50, 1992. In B. Awerbuch, Y. Bartal, and A. Fiat, Distributed Paging for General Networks in Proc. Of the 7th ACM-SIAM Symposium on Discrete Algorithms, pp. 574-583, January, 1996, Awerbuch, Bartal, and Fiat design a distributed file allocation protocol and use the framework of competitive analysis (see D. Sleator and R. Tarjan. Amortized Efficiency of List Update and Paging Rules, Communications of the ACM, 28(2): 202-208, 1995) to show that their protocol is nearly optimal in terms of total communication cost and storage capacity of the nodes. However, they do not address the issue of load balancing among different servers. Moreover, while their work is significant from a theoretical standpoint, several issues concerning implementation of their protocol over the Internet are not addressed.