Data replication is a widely used technique for storing copies of original data at multiple locations in order to improve data availability and overall system performance. Data replication is used in a variety of information technology systems, including content delivery networks, peer-to-peer storage and exchange, application-level multicast, and distributed data backup, among others.
In many such systems, the original data located at one node need to be replicated to a set of other nodes. All these nodes are usually organized in a tree hierarchy, and the node having the original data is the root of the hierarchy. The data are replicated over each link of the hierarchy periodically. For example, the root node periodically sends its data to its child nodes, each of which periodically sends the data received from its parent to its children. In such a way, data are refreshed throughout all nodes in the hierarchy.
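The hierarchical replication described above can be illustrated with a minimal sketch. The `Node` class and the `all_nodes` and `replicate_round` functions are hypothetical names introduced only for illustration, and the sketch assumes that all links are refreshed in lockstep, once per period, with each parent sending the copy it held at the start of that period. Under these assumptions, new data at the root reaches a node at depth d only after d periods.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A node in the replication hierarchy (hypothetical model)."""
    name: str
    data: Optional[str] = None          # this node's current copy
    children: List["Node"] = field(default_factory=list)


def all_nodes(root: Node) -> List[Node]:
    """Return every node in the hierarchy, starting from the root."""
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(node.children)
    return found


def replicate_round(root: Node) -> None:
    """Simulate one replication period: every parent sends the copy it
    held at the start of the period to each of its children."""
    snapshot = [(node, node.data) for node in all_nodes(root)]
    for parent, value in snapshot:
        for child in parent.children:
            child.data = value


# Hierarchy: root -> a -> b, and root -> c
b = Node("b")
a = Node("a", children=[b])
c = Node("c")
root = Node("root", data="v1", children=[a, c])

replicate_round(root)   # a and c now hold "v1"; b is still empty
replicate_round(root)   # b now holds "v1"
```

In this simplified model, the delay before a node sees new data grows with its depth in the hierarchy, which is one reason data consistency across the hierarchy requires attention.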
One information technology system where data replication is used and data consistency needs to be maintained is a resource discovery system. In order to execute applications or services in a computing system such as a networked computing system, resources within the computing system and also external to the computing system need to be allocated among these applications. These resources include computing hardware resources, e.g. central processing unit (CPU) resources; storage capacity, such as hard drive size and memory size in physical machines; and data collectors or data sensors. The available resources can include both static and dynamic resources. For example, the memory size or network adaptor speed of a given machine is usually fixed, whereas the available memory or bandwidth changes frequently over time.
In order to allocate resources among a variety of contemporaneous resource demands, a repository of available resources needs to be created and maintained. Creation and maintenance of this repository includes discovering resources that are available for allocation. Resource discovery can locate remote resources subject to the specified requirements of a given resource demand and is widely used in many distributed computing systems for a variety of applications. For example, in grid computing, machines or nodes that possess the required CPU and memory resources to run an application are discovered or identified, and then the application is deployed on those identified machines.
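The grid computing example above can be sketched as a simple lookup against a repository of resource records. The `ResourceRecord` fields and the `discover` function are hypothetical and serve only to illustrate matching a resource demand against stored records; they do not represent any particular resource discovery system.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ResourceRecord:
    """One repository entry describing a machine (hypothetical schema)."""
    node: str
    cpu_cores: int        # static resource: fixed for a given machine
    free_memory_mb: int   # dynamic resource: changes frequently over time


def discover(repository: List[ResourceRecord],
             min_cpu_cores: int,
             min_free_memory_mb: int) -> List[str]:
    """Return the names of nodes whose records satisfy the demand."""
    return [
        record.node
        for record in repository
        if record.cpu_cores >= min_cpu_cores
        and record.free_memory_mb >= min_free_memory_mb
    ]


repository = [
    ResourceRecord("node-1", cpu_cores=4, free_memory_mb=2048),
    ResourceRecord("node-2", cpu_cores=8, free_memory_mb=512),
    ResourceRecord("node-3", cpu_cores=16, free_memory_mb=8192),
]

# Machines able to run an application needing 4 cores and 1 GB of memory
candidates = discover(repository, min_cpu_cores=4, min_free_memory_mb=1024)
```

Because `free_memory_mb` is a dynamic value, the result is only as accurate as the most recent update of the repository.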
A variety of approaches to resource discovery have been proposed. These proposed approaches include the domain name system (DNS) as described in P. Mockapetris & K. J. Dunlap, Development of the Domain Name System, Proceedings of SIGCOMM'88, Stanford, Calif., pp. 123-133 (1988), the lightweight directory access protocol (LDAP/X.500) as described in M. Wahl, T. Howes & S. Kille, RFC 2251-Lightweight Directory Access Protocol (v3), December (1997), ITU-T, Recommendation X.500, January (2001) and D. W. Chadwick, Understanding X.500—The Directory (1996), and the java naming and directory interface (JNDI) as described in Sun Microsystems, Java Naming and Directory Interface—JNDI Documentation, http://java.sun.com/products/jndi/docs.html. All of these systems provide directory service to discover resources; however, these previous attempts at resource discovery were arranged mostly for static resources or resources that change quite slowly, for example host name to internet protocol (IP) address mapping. Support for dynamic resources that vary frequently, for example on the scale of tens of minutes or less, using these systems is very limited. In addition, these systems assume the space or universe of available resources is globally organized into a pre-defined tree hierarchy that is managed in a delegated manner. That is, each organization agrees on such a hierarchy and “owns” a portion of the tree.
Global organization and management of resources, however, may not exist. In addition, global organization introduces complexity and restrictions into the allocation of resources. For example, it can be difficult to pre-define the resource hierarchy if new types of resources are to be added in the future. For administrative and trust reasons, autonomous systems may have different perceptions of how resources should be organized. Systems that already employ different resource discovery services need to collaborate on common tasks, but it is very difficult to change the individual, legacy resource discovery services.
One scalable wide-area resource discovery tool (SWORD) is described by David Oppenheimer, Jeannie Albrecht, David Patterson, and Amin Vahdat in Distributed Resource Discovery on PlanetLab with SWORD, First Workshop on Real, Large Distributed Systems (WORLDS '04), December 2004. This resource discovery service was created for PlanetLab as described by Larry Peterson, Tom Anderson, David Culler, and Timothy Roscoe in A Blueprint for Introducing Disruptive Technology into the Internet, July 2002. The resource discovery tool employed by SWORD utilizes a distributed hash table (DHT) based peer-to-peer network to support multi-dimensional range queries on dynamic resources. One disadvantage of using a peer-to-peer network is that the management of the system is challenging. Peer-to-peer networks are arranged to allow high autonomy of individual nodes, which makes control and management of the system, especially centralized control and management, quite difficult to facilitate. In addition, the resource discovery tool in SWORD requires that each individual autonomous system export its complete resource records to the peer-to-peer network. This can become a problem due to trust issues, because individual autonomous systems may not be willing to expose their original records to the outside world.
Creation and maintenance of the repositories of available resources consume overhead, i.e. system resources. The more maintenance that is performed, the greater the system overhead. This maintenance includes providing reliable and consistent data that reflect the most recent and accurate information about system resources. Frequent updates, however, consume more system resources. Current methods for maintaining the repositories of available system resources do not balance the desire for the most up-to-date data against the desire to minimize system overhead. Therefore, systems and methods are needed that provide for the creation and maintenance of repositories of resources for the purposes of allocating these resources among a variety of resource demands such that an acceptable level of freshness is provided in the repositories while the consumption of system overhead is optimized.
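The tension between freshness and overhead can be made concrete with a rough back-of-the-envelope sketch. The functions below are hypothetical and rest on simplifying assumptions that are not taken from any particular system: every link of a replication hierarchy is refreshed once per period, all links use the same period, transmission delay is negligible, and the timers on different links are not synchronized, so that each hop can add up to one full period of delay.

```python
def messages_per_hour(num_links: int, period_s: float) -> float:
    """Update messages sent per hour when every link is refreshed
    once every period_s seconds (assumed uniform period)."""
    return num_links * 3600.0 / period_s


def worst_case_staleness_s(depth: int, period_s: float) -> float:
    """Upper bound on how old the data at a node of the given depth can
    be, assuming each hop adds at most one full period of delay."""
    return depth * period_s


# Example: a 7-node binary hierarchy has 6 links and a maximum depth of 2.
frequent = (messages_per_hour(6, 60), worst_case_staleness_s(2, 60))
infrequent = (messages_per_hour(6, 600), worst_case_staleness_s(2, 600))
```

Under these assumptions, refreshing every 60 seconds costs 360 messages per hour and bounds staleness at 120 seconds, whereas refreshing every 600 seconds costs 36 messages per hour but allows data up to 1200 seconds old. Reducing one quantity increases the other, which is the balance the preceding paragraph calls for.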