As ever-increasing numbers of computers are networked together on the Internet, the usefulness and importance of peer-to-peer (P2P) network applications and distributed databases have become evident.
A peer-to-peer network is generally thought of as a self-managed network of computers in which there is no single server or controller responsible for maintaining the network. A number of different architectures are available for creating peer-to-peer networks and applications. One such architecture is an overlay network. In general, overlay networks provide a level of indirection over traditional networking addresses such as Internet Protocol (IP) addresses. An important benefit of using an overlay network is that routing decisions can be made by application software.
FIG. 1A illustrates a typical overlay network. The computers 10 that belong to the overlay network route messages between each other using the underlying network medium 11. While the underlying network medium has the information and capability to route messages directly between specific computers, overlay networks typically maintain only partial routing information and rely on successive forwarding through intermediate nodes to deliver a message to its final intended destination.
One common use for overlay networks is in building distributed hash tables. Each computer name is run through a hashing algorithm (e.g., an MD5 hash) to generate a GUID (globally unique identifier) that serves as the node's ID. Each member of the overlay network stores a part of the distributed hash table. When a request or update for a document is sent from a node on the overlay network, the originating node hashes the requested document's filename and then searches its routing table entries for the node whose ID is closest to the document's hash. The request is then forwarded to this closest intermediate node. The intermediate node follows the same process, comparing the document's hash with its own routing table entries, and forwarding continues until the request reaches the node whose ID is closest to the document's hash. The overlay network maintains enough information in its routing tables to be able to tell when a node's ID is closer to a document's hash than any other node's ID. That closest node is then responsible for storing the document and responding to queries for it.
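The forwarding process described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the routing scheme of any particular overlay: the `Node` class, the circular `distance` metric, the `build_ring` bootstrap (each node knows only its two neighbours in ID order), and the host and file names are all hypothetical.

```python
import hashlib

ID_BITS = 128                 # MD5 produces a 128-bit digest
ID_SPACE = 1 << ID_BITS


def make_id(name: str) -> int:
    """Hash a computer name or document filename into the identifier space."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


def distance(a: int, b: int) -> int:
    """Circular distance between two identifiers (an assumed closeness metric)."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


class Node:
    def __init__(self, name: str):
        self.name = name
        self.node_id = make_id(name)
        self.routing_table = []   # partial view: only some of the other nodes
        self.documents = {}       # this node's share of the hash table

    def next_hop(self, key: int) -> "Node":
        """Return the known node strictly closer to key, or self if none is."""
        best = self
        for candidate in self.routing_table:
            if distance(candidate.node_id, key) < distance(best.node_id, key):
                best = candidate
        return best


def route(origin: Node, filename: str) -> list:
    """Forward hop by hop until no routing table entry is closer; return the path."""
    key = make_id(filename)
    path = [origin]
    while True:
        nxt = path[-1].next_hop(key)
        if nxt is path[-1]:
            return path
        path.append(nxt)


def build_ring(names):
    """Hypothetical bootstrap: each node knows only its two neighbours in ID order."""
    nodes = sorted((Node(n) for n in names), key=lambda n: n.node_id)
    for i, node in enumerate(nodes):
        node.routing_table = [nodes[i - 1], nodes[(i + 1) % len(nodes)]]
    return nodes


nodes = build_ring([f"host{i}.example.org" for i in range(8)])
path = route(nodes[0], "report.txt")
path[-1].documents["report.txt"] = b"contents"   # the closest node stores the document
print(" -> ".join(n.name for n in path))
```

Because every hop must strictly reduce the distance to the document's hash, the loop always terminates. With only two neighbours per node the path can be long; real overlays add longer-range routing table entries to shorten it.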
Current examples of overlay network types for peer-to-peer networks include Tapestry, developed at the University of California at Berkeley by Ben Y. Zhao et al.; Chord, developed at the Massachusetts Institute of Technology; and Pastry, developed by Microsoft Research and Rice University. Tapestry, Chord and Pastry are toolkits for building distributed systems.
Tapestry provides a peer-to-peer, wide-area decentralized routing and location network infrastructure. It is an overlay network that sits at the application layer (on top of an operating system). When deployed on separate machines in the network, Tapestry allows any node to route messages to any other node in the network, given a location- and network-independent name. Furthermore, any node in a Tapestry network can advertise or “publish” location information about objects it possesses, in such a manner that applications on other Tapestry nodes can find these objects easily and efficiently, given the object name. Tapestry forms individual machines into a true peer-to-peer network, without any points of centralization that might become points of failure or attack.
Pastry is a generic, scalable and efficient substrate for peer-to-peer applications. Pastry nodes form a decentralized, self-organizing and fault-tolerant overlay network within the Internet. Pastry provides efficient request routing, deterministic object location, and load balancing in an application-independent manner. Furthermore, Pastry provides mechanisms that support and facilitate application-specific object replication, caching, and fault recovery.
MIT's Chord project relates to scalable, robust distributed systems using peer-to-peer ideas. Chord is based on a distributed hash lookup primitive. Chord is decentralized and symmetric, and can find data using only O(log N) messages, where N is the number of nodes in the system. Other, similar overlay systems exist in addition to these, for example CAN, Kademlia and Viceroy, and new overlay designs appear frequently.
Existing systems such as Tapestry, Pastry and Chord typically depend on characteristics of hashing, although in slightly different ways. These characteristics include uniformly distributed identifiers, arithmetic in the identifier space, and fixed-length identifiers. Both Chord and Pastry depend on uniformly distributed identifiers for efficient operation. Chord depends on arithmetic in the identifier space to decide on its ‘fingers.’ Finally, Pastry depends on fixed-length identifiers in order to guarantee fixed-depth routing tables.
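As an illustration of the identifier-space arithmetic that Chord's fingers rely on, the sketch below follows the finger definition given in the Chord paper (reference [1] below): in an m-bit identifier space, the i-th finger of node n is the first node at or after (n + 2^i) mod 2^m on the ring. The ring size m = 6 and the node identifiers are illustrative values, and the function names are hypothetical.

```python
from bisect import bisect_left


def successor(sorted_ids, point):
    """First node ID at or after point, wrapping around the ring."""
    i = bisect_left(sorted_ids, point)
    return sorted_ids[i % len(sorted_ids)]


def finger_table(node_id, sorted_ids, m):
    """Finger i points at the successor of (node_id + 2**i) mod 2**m."""
    space = 1 << m
    return [successor(sorted_ids, (node_id + (1 << i)) % space) for i in range(m)]


# Illustrative 6-bit ring (identifiers 0..63) with ten nodes.
ring = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
print(finger_table(8, ring, 6))   # → [14, 14, 14, 21, 32, 42]
```

The targets are spaced at powers of two, which is why the scheme needs identifiers that support addition modulo the size of the identifier space and why each node keeps at most m fingers.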
The use of hashing is clearly also integral in implementing distributed hash tables. The primary benefit of hashing is the uniform distribution of data among nodes. This feature is often touted as ‘load-balancing,’ but it is only one simple aspect of a load-balancing design. Many overlay networks based on hashing lack locality features that are important for certain peer-to-peer applications. See, for example, Pete Keleher, Bobby Bhattacharjee, and Bujor Silaghi, “Are Virtualized Overlay Networks Too Much of a Good Thing?” (IPTPS 2002). Two such features that are useful for peer-to-peer systems, but that are difficult to implement in hash-based overlay networks, are content locality and path locality.
Content locality refers to the ability to store a data item on a specific node. In a coarser form, content locality is the ability to store a data item on any one of a specific set of nodes. It is not unusual for enterprises such as corporations or government agencies to implement complex network security measures to prevent sensitive documents from being distributed outside the entity's network. Thus, these enterprises are unlikely to use peer-to-peer applications that do not provide control over where particular documents are stored. For example, XYZ Corporation may want to ensure that certain documents are only stored on computers belonging to the xyz.com domain.
Path locality refers to the ability to guarantee that the routing path between any two nodes in a particular region of the network does not leave that region. The region may be a building, an administrative domain, etc. Using the example above, XYZ Corporation may desire to restrict sensitive messages from being routed outside the xyz.com domain. Using path locality, a message from UserA (usera@xyz.com) to UserB (userb@xyz.com) could be restricted such that it is only routed across computers in the xyz.com domain. This may be of particular importance if some of the other domains on the overlay network belong to competitors of XYZ Corporation.
Current hash-based systems do not inherently support content locality or path locality. Indeed, their whole purpose is to uniformly diffuse load across all machines of a system. Thus, the pervasive use of hashing in those systems may actually reduce or prevent control over where data is stored and how traffic is routed.
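The two locality properties can be written as simple checks, and the effect of hashing on them can be observed directly. The sketch below is illustrative only: the host names (including the rival.com domain), the suffix-based `in_domain` test, and the function names are assumptions introduced for this example.

```python
import hashlib


def make_id(name: str) -> int:
    """Hash a host name into the identifier space (MD5, as in the example above)."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


def in_domain(host: str, domain: str) -> bool:
    """Assumed membership test: the host name ends with the domain suffix."""
    return host == domain or host.endswith("." + domain)


def has_content_locality(storing_host: str, domain: str) -> bool:
    """Content locality: the item is stored on a node inside the domain."""
    return in_domain(storing_host, domain)


def has_path_locality(path, domain: str) -> bool:
    """Path locality: every hop on the routing path stays inside the domain."""
    return all(in_domain(hop, domain) for hop in path)


hosts = ["a.xyz.com", "b.xyz.com", "c.xyz.com",
         "a.rival.com", "b.rival.com", "c.rival.com"]

# Ordering hosts by hashed ID ignores their domains, so neighbours in the
# identifier space are generally unrelated to neighbours in the name space.
ring_order = sorted(hosts, key=make_id)
print(ring_order)

print(has_path_locality(["a.xyz.com", "b.xyz.com"], "xyz.com"))                 # True
print(has_path_locality(["a.xyz.com", "a.rival.com", "b.xyz.com"], "xyz.com"))  # False
```

In a hash-based overlay, both the node that stores a document and the intermediate hops are chosen by closeness in the hashed order, so neither check can be guaranteed to pass.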
Thus, an improved system and method for creating overlay networks is needed. In particular, an overlay network capable of providing content locality is desired. An overlay network that is capable of providing path locality is also desirable. Furthermore, an overlay network that can provide the content locality and path locality features while retaining the routing performance of existing overlay networks is desirable.
The following references may provide further useful background information for the convenience of the reader:
[1] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan, “Chord: A scalable peer-to-peer lookup service for Internet applications,” Proc. ACM SIGCOMM'01, San Diego, Calif., August 2001.
[2] A. Rowstron and P. Druschel, “Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems,” IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), Heidelberg, Germany, pages 329-350, November 2001.
[3] Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott Shenker, “A Scalable Content-Addressable Network,” Proceedings of ACM SIGCOMM, San Diego, Calif., pp. 161-172, August 2001.
[4] Ben Y. Zhao, John D. Kubiatowicz, and Anthony D. Joseph, “Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and Routing,” U. C. Berkeley Technical Report.
[5] W. Pugh, “Skip Lists: A Probabilistic Alternative to Balanced Trees,” Communications of the ACM, vol. 33, no. 6, June 1990, pp. 668-676.
[6] W. Pugh, “A Skip List Cookbook,” Technical Report CS-TR-2286.1, University of Maryland, 1989.
[7] J. I. Munro, T. Papadakis and R. Sedgewick, “Deterministic skip lists,” Proc. 3rd Annual ACM-SIAM Symposium on Discrete Algorithms, pages 367-375, 1992.
[8] P. Bozanis and Y. Manolopoulos, “DSL: Accommodating Skip Lists in the SDDS Model,” Proceedings 3rd Workshop on Distributed Data and Structures (WDAS'2000), L'Aquila, 2000.