The present invention generally relates to a method and system for data replication and replica distribution, and more particularly to a method and system for generation of bounded-length globally unique replica identifiers.
Conventional systems for replicating data on multiple computing devices and synchronizing the replicas require a way to identify each replica uniquely. For example, Parker et al., "Detection of Mutual Inconsistency in Distributed Systems", IEEE Transactions on Software Engineering SE-9, No. 3 (May 1983), pp. 240-247, describes the use of version vectors, which map replica identifiers to readings of logical clocks maintained at each replica, for tracking the relationships among different versions of data in a distributed system.
Petersen et al., "Flexible Update Propagation for Weakly Consistent Replication", SIGOPS '97: Proceedings of the Sixteenth ACM Symposium on Operating Systems Principles, Oct. 5-8, 1997, Saint-Malo, France, pp. 288-301, describes replica creation in the Bayou system. In this system, each replica of a database, other than the original one, is created as a copy of some existing replica. Thus, each replica, other than the original one, can be thought of as a child of a particular parent replica. (This approach precludes scenarios in which two independently created collections of data are merged by synchronizing them with each other, so that they become replicas of each other, each holding the union of their previous contents.) Since the device holding the child replica must contact the device holding the parent replica to obtain its initial contents, the child replica can also obtain its identifier from the device holding the parent replica. The identifier for the child replica, a two-part identifier, includes the identifier of the parent replica and an integer uniquely distinguishing the new child replica from all other children of that parent.
U.S. Pat. No. 5,884,322 to Sidhu et al. describes a system in which a parent allocates a set of unique replica identifiers to a child. This set consists of the replica identifier for the child itself and the replica identifiers that can potentially be generated for the descendants of the child. The child, in turn, allocates subsets of this set to its own children. However, Sidhu et al. does not stipulate the use of fixed-length identifiers, a specific mechanism for the distributed generation of unique identifiers, or a strategy for partitioning the available identifiers based on an entity's generation number. Similarly, U.S. Pat. No. 5,522,077 to Cuthbert et al. describes a system in which a client obtains a range of contiguous unique-identifier values from a central server, and may later return unused values to the server for reallocation.
U.S. Pat. No. 5,414,841 to Bingham et al. describes another centralized system in which unique identifiers can vary in length. Similarly, the replica identifiers described by Petersen et al. (supra) are not fixed length. Since each replica's identifier includes, as a component, the identifier of its parent replica, the length of a replica identifier grows with each generation of replicas. (If a tree is constructed to describe parent-child relationships, a generation corresponds to a level of the tree.) This is acceptable in an environment where a few replicas serve as parents of many children, so that there are few generations. However, in an environment in which many child replicas are themselves parents (so that a tree describing parent-child relationships has many levels), replica identifiers and the data structures that incorporate them, such as version vectors, can quickly grow unacceptably large.
Bingham et al. (supra), Cuthbert et al. (supra), and the Cedar software version management system (U.S. Pat. No. 4,558,413 to Schmidt and Lampson) rely on central generation to ensure uniqueness of identifiers. Similarly, U.S. Pat. No. 5,640,608 to Dockter et al. describes the operation of a central tag server that generates tag values in ascending order, hands off blocks of unique tag values to clients in shared volatile storage, and keeps a nonvolatile record of the highest tag value generated so far, ensuring that duplicate tag values will not be generated upon recovery from a failure that destroys the contents of volatile storage. U.S. Pat. No. 5,304,992 to Harashima describes the distribution of centrally generated unique identifiers over a common communication line.
However, there are several advantages to assigning unique replica identifiers without coordinating through a central server:
The elements of the distributed system may have a peer relationship, in which no one device is designated as a server, and there may be no direct connection between an arbitrary pair of devices.
Reliance on coordination with a particular device (e.g., a central server or the like) means that a new replica cannot be created when that device is unavailable (e.g., because either the particular device or the network has failed).
Reliance on coordination through a central server precludes creating new replicas within mobile work groups (groups of colleagues traveling together whose devices are accessible to each other but not to a fixed network).
Channeling all requests for new replica identifiers through a central server can create a system bottleneck.
However, distributing (i.e., decentralizing) the assignment of replica identifiers to various devices raises the problem of ensuring that the generated identifiers are unique (i.e., that two devices unable to contact each other do not generate the same identifier). Pouzin and Zimmerman, "A Tutorial on Protocols", Proceedings of the IEEE 66, No. 11 (November 1978), pp. 1346-1370, discusses the distributed generation of unique identifiers by hierarchical concatenation of a unique domain identifier (identifying, for example, a host computer) with an identifier that is unique within that domain. Richard W. Watson, "Identifiers (Naming) in Distributed Systems", in Lampson and Siegert, eds., Distributed Systems - Architecture and Implementation, an Advanced Course, Springer-Verlag, New York, 1981, classifies hierarchical concatenation as a special case of an identification scheme built around a sequence of contexts, in which an identifier for one context is mapped by that context into an identifier meaningful in the next context. Watson describes a distributed architecture in which globally unique identifiers consist of a server process address and a local name unique among the names generated by a server process. There may be many server processes generating unique identifiers, each having a unique server address assumed to be already known to the interprocess communication service that delivers a request for a new unique identifier to the server process.
Many inventions are concerned with the distributed generation of unique identifiers by hierarchical concatenation. U.S. Pat. No. 5,117,351 to Miller discloses a system for distributed generation of unique identifiers. Each unique identifier consists of a node identifier uniquely identifying a node of the distributed system, a monotonically increasing per-node time stamp (based on the node's system clock, but incremented when necessary to avoid duplicate use of the same time stamp), a random number used to reduce the probability of generating a duplicate identifier in case the system clock is reset, and a version number identifying the version of the system that generated the identifier, to prevent collisions with identifiers that may be generated by different mechanisms in the future. U.S. Pat. No. 4,792,921 to Corwin is similar, supporting distributed generation of identifiers containing a node identifier (in this case, a telephone number associated with the node) and a time stamp based on a real-time clock that is posited to tick at least once between the generation of any two identifiers. U.S. Pat. No. 5,815,710 to Martin et al. describes the generation of unique identifiers consisting of a network address such as an IP address, a process identifier, a time stamp, and a counter. U.S. Pat. No. 5,732,282 to Provino et al. describes the use of device-driver registries with globally unique identifiers to avoid the need to preassign globally unique identifiers for each registered device driver.
Each of these hierarchical approaches relies on preexisting unique identifiers (node identifiers or network addresses) that form part of the generated unique identifier. However, the generation of globally unique device identifiers is itself problematic. The use of Ethernet host identifiers as global device identifiers is applicable only to devices that contain Ethernet cards. Furthermore, because of privacy concerns (e.g., see Markoff, J., "Growing Compatibility Issue: Computers and User Privacy", New York Times, Mar. 3, 1999), there is widespread public resistance to "burning" (permanently imprinting or associating) a unique identifier into each computing device as it is manufactured and transmitting that identifier in network interactions. IP addresses are inadequate as unique device identifiers because the same IP address may refer to any of several different devices, or to different devices at different times. Indeed, some devices are dynamically assigned new IP addresses each time they log on to the network, or at periodic intervals.
Another approach is to generate a relatively long random number (i.e., one consisting of many bits) for use as a replica identifier. If the random number is sufficiently long and truly random, the chance of the same identifier being generated twice will be minuscule. For example, the probability of two randomly generated 64-bit numbers being identical is less than one in 16 billion billion (arguably less than the probability of a distributed system failing because of a meteor colliding with the earth). However, this solution presumes that the random-number generator has an equal probability of producing each possible 64-bit number. It is difficult to guarantee that a random-number generator will have this property. Thus, the chance of generating duplicate identifiers may, in practice, be much larger. Moreover, the probability of interest is not the probability that two randomly generated numbers are distinct, but that all members of a set of randomly generated numbers are distinct from each other. This probability decreases rapidly as the size of the set grows (a phenomenon known as the "birthday paradox" because the probability of all people in a room having distinct birthdays decreases rapidly as the number of people in the room increases, falling below 50% once 23 people are in the room). The distributed generation of duplicate random numbers is not easily detected. Indeed, it is likely to "break" the distributed system in a way that may be difficult to repair. However small the probability of generating duplicate replica identifiers, a system whose proper function relies on luck rather than on mathematically provable guarantees is a disconcerting solution.
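The birthday effect described above can be checked numerically. The following Python sketch is illustrative only and is not part of any cited system; it assumes an ideal generator that draws each value uniformly and independently, and the function names are hypothetical. The second function uses the standard approximation 1 - exp(-k(k-1)/(2n)) for the probability of at least one duplicate among k draws from n values.

```python
import math


def prob_all_distinct(k: int, n: int) -> float:
    """Probability that k values drawn uniformly and independently
    from n equally likely possibilities are all distinct."""
    p = 1.0
    for i in range(k):
        p *= (n - i) / n
    return p


def approx_duplicate_prob(k: int, bits: int) -> float:
    """Approximate probability of at least one duplicate among k
    uniformly random identifiers of the given bit length."""
    n = 2.0 ** bits
    # expm1 keeps precision when the exponent is very close to zero.
    return -math.expm1(-k * (k - 1) / (2.0 * n))


if __name__ == "__main__":
    # Birthdays: the all-distinct probability drops below 50% at 23 people.
    print(prob_all_distinct(22, 365), prob_all_distinct(23, 365))
    # 64-bit identifiers: a single pair versus about four billion identifiers.
    print(approx_duplicate_prob(2, 64), approx_duplicate_prob(2 ** 32, 64))
```

Under these assumptions a single pair of 64-bit identifiers collides with negligible probability, but a population of 2^32 identifiers already has a collision probability of roughly 39%. A generator that is not uniform can only make matters worse.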
In view of the foregoing and other problems, disadvantages, and drawbacks of the conventional methods, it is an object of the present invention to generate replica identifiers whose lengths are bounded by a fixed, predetermined number of bits, and to ensure the uniqueness of these identifiers without coordinating with a central entity.
In a first aspect, a method for uniquely identifying one of a plurality of replicas of data includes creating a second replica from a first replica, and computing a replica identifier of a fixed length for the second replica by applying a uniform transformation to a replica identifier of a fixed length for the parent. The first replica is a parent replica of the second replica, and the uniform transformation is dependent upon a record of the parent replica describing a set of replica identifiers previously assigned to replicas of the parent.
In a first exemplary method, the present invention generates b-bit replica identifiers in a compact manner, in the sense that each of the 2^b possible bit combinations is a well-formed replica identifier.
First, the initial replica is identified by the integer 0, and its first child is identified by the integer 1. If a replica is identified by some integer m greater than 0, its first child is identified by the integer 2m. If, on the other hand, a replica already has one or more children, the last of which is identified by the integer n, the next child of that replica is identified by the integer 2n+1. Replica creation will fail if the new replica's identifier exceeds 2^b - 1.
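The numbering rules just stated can be sketched in a few lines of Python. This is a minimal illustration rather than the claimed implementation. The class name Replica, its attributes, and the use of OverflowError to signal failed replica creation are assumptions made for the example. The only per-replica record kept is the identifier of the most recently created child.

```python
from typing import Optional


class Replica:
    """Hypothetical holder of a b-bit replica identifier."""

    def __init__(self, ident: int, bits: int) -> None:
        self.ident = ident
        self.bits = bits
        # Identifier of the last child created so far, if any.
        self.last_child: Optional[int] = None

    def next_child_id(self) -> int:
        if self.last_child is not None:
            candidate = 2 * self.last_child + 1   # next child: 2n+1
        elif self.ident == 0:
            candidate = 1                          # first child of the initial replica
        else:
            candidate = 2 * self.ident             # first child of replica m > 0: 2m
        if candidate > 2 ** self.bits - 1:
            raise OverflowError("replica identifier would exceed 2^b - 1")
        return candidate

    def create_child(self) -> "Replica":
        child_id = self.next_child_id()
        self.last_child = child_id   # updated only when creation succeeds
        return Replica(child_id, self.bits)


if __name__ == "__main__":
    original = Replica(0, bits=4)
    first = original.create_child()
    print(first.ident, original.create_child().ident)            # 1 3
    print(first.create_child().ident, first.create_child().ident)  # 2 5
```

With b = 4, the children of the initial replica receive 1, 3, 7, and 15, and the children of replica 1 receive 2, 5, and 11. No communication between replicas is needed to keep these identifiers distinct.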
This exemplary method of the present invention combines several conventional approaches in a unique and unobvious way. Specifically, in describing the Heapsort algorithm and its relationship to the tree-based Tournament Sort algorithm of Iverson (A Programming Language, John Wiley and Sons, New York, 1962), Williams, "Algorithm 232: Heapsort", Communications of the ACM 7, No. 6 (June 1964), pp. 347-348, implicitly exploits a compact numbering scheme for nodes of a binary tree, in which the root is numbered 1 and a node numbered n has a left child numbered 2n and a right child numbered 2n+1.
Section 2.3.2 of Knuth, The Art of Computer Programming, Volume 1: Fundamental Algorithms, 2nd ed., Addison-Wesley, Reading, Mass., 1973, describes a natural correspondence between binary trees and forests of arbitrary trees. This correspondence, in effect, equates the left-child link of a binary-tree node with a link to the first child of a node in the arbitrary tree, and the right-child link of a binary-tree node with a link from a node in the arbitrary tree to its next sibling.
The exemplary method of the invention combines conventional approaches in a novel way because the replica identifier associated with a node in the replica tree is the Williams (supra) numbering of the corresponding node in the naturally corresponding binary tree of Knuth (supra).
In a second exemplary method, the invention imposes a bound on the number of generations, and generation-specific bounds on the number of children for a replica in a given generation. A replica identifier is divided into a plurality of fields, each corresponding to one generation number. The number of fields is fixed, as is the number of bits in the field corresponding to a given generation number. For a replica with a given generation number, each field of its identifier corresponding to a higher generation number contains a special reserved value, such as zero. The identifier for a child replica is derived from the identifier of its parent replica by replacing the reserved value in the field corresponding to the child's generation number with some other value, distinct from the values used in this field for other children of the same parent. Replica creation fails if the generation number for the new replica would exceed the predetermined maximum generation number, or if all unreserved values for the field corresponding to the new replica's generation number have already been assigned to other children of the new replica's parent.
This encoding leaves some of the possible bit combinations unused. Any bit pattern in which one field contains the reserved value, but another field, corresponding to a higher generation number, does not, cannot be a replica identifier.
Typically, an early-generation replica will reside on a server and have many copies made of it, but a late-generation replica will reside on a personal device and have few copies made of it. The invention can take advantage of this pattern by using wider fields for earlier generations and narrower fields for later generations. Many combinations of field widths are possible. For example, a 64-bit replica identifier might have five fields of (in increasing generation order) 28, 16, 10, 7, and 3 bits. Since one value in each field is reserved, a field of w bits allows up to 2^w - 1 children. Then, the original replica could have up to 268,435,455 first-generation children, each first-generation replica could have up to 65,535 second-generation children, each second-generation replica could have up to 1,023 third-generation children, each third-generation replica could have up to 127 fourth-generation children, and each fourth-generation replica could have up to 7 fifth-generation children.
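The field-based encoding of the second exemplary method can be sketched as follows. This Python fragment is a sketch under stated assumptions, not the claimed implementation: zero is taken as the reserved value, the first-generation field is placed in the most significant bits, children of a parent are assigned the values 1, 2, 3, and so on in order of creation, and all function names are hypothetical. Because one value per field is reserved, a field of w bits admits at most 2^w - 1 children.

```python
from typing import List, Sequence


def field_shift(widths: Sequence[int], generation: int) -> int:
    """Bit offset of the field for a 1-based generation number,
    with the first generation in the most significant bits (assumed)."""
    return sum(widths[generation:])


def generation_of(ident: int, widths: Sequence[int]) -> int:
    """Generation number of a replica: the count of leading fields
    holding a value other than the reserved value zero."""
    generation = 0
    for gen in range(1, len(widths) + 1):
        mask = (1 << widths[gen - 1]) - 1
        if (ident >> field_shift(widths, gen)) & mask == 0:
            break
        generation = gen
    return generation


def max_children(widths: Sequence[int]) -> List[int]:
    """Per-generation child limits; one value per field is reserved."""
    return [(1 << w) - 1 for w in widths]


def child_id(parent_id: int, children_so_far: int, widths: Sequence[int]) -> int:
    """Identifier for the next child of parent_id, given how many
    children that parent has already created."""
    gen = generation_of(parent_id, widths) + 1
    if gen > len(widths):
        raise OverflowError("maximum generation number exceeded")
    value = children_so_far + 1          # assumed: sequential, never zero
    if value > (1 << widths[gen - 1]) - 1:
        raise OverflowError("no unreserved values left for this generation")
    return parent_id | (value << field_shift(widths, gen))


if __name__ == "__main__":
    widths = (3, 2, 1)                   # a small hypothetical 6-bit layout
    first = child_id(0, 0, widths)       # first child of the original replica
    grandchild = child_id(first, 0, widths)
    print(first, grandchild, max_children(widths))
```

The small layout in the usage example is chosen only to keep the numbers readable. Passing a wider tuple of field widths yields the corresponding limits from max_children without other changes.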
The present invention is applicable not only to replicated databases, but to any system in which it is necessary to identify replicas. Such systems include, for example, the distribution of intellectual property such as music or statistical reports over the Internet. In such contexts, a bound on the number of generations of replicas, and the number of replicas created at each generation, can be considered an important feature rather than a limitation.
The above-discussed transformations from parent replica identifiers to child replica identifiers are invertible. That is, it is possible to determine the replica identifier of a parent from the replica identifier of a child. Thus, a given replica's "pedigree" (i.e., the sequence of copies made, starting from the original replica, culminating in the creation of the given replica) can be determined from the given replica's replica identifier. Knowledge of a replica's pedigree can be helpful in determining the origin of unauthorized copies. Thus, a replica identifier makes an ideal "watermark".
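For the first exemplary method, one way to carry out this inversion is sketched below in Python. The procedure is derived here from the numbering rules given earlier (first child of the initial replica is 1, first child of m is 2m, next child after n is 2n+1) and is offered as an illustration. The function names are hypothetical. An odd identifier greater than 1 denotes a later sibling, so the sketch steps back through earlier siblings until it reaches a first child, then halves that value to obtain the parent.

```python
from typing import List


def parent_id(ident: int) -> int:
    """Identifier of the parent of a replica numbered by the first
    exemplary method."""
    if ident <= 0:
        raise ValueError("the initial replica (identifier 0) has no parent")
    n = ident
    # Undo each "next child" step (2n+1) to reach the first sibling.
    while n % 2 == 1:
        n = (n - 1) // 2
    # n is now 0 (a child of the initial replica) or an even first child 2m.
    return n // 2


def pedigree(ident: int) -> List[int]:
    """Identifiers from the initial replica down to the given replica."""
    chain = [ident]
    while chain[-1] != 0:
        chain.append(parent_id(chain[-1]))
    chain.reverse()
    return chain


if __name__ == "__main__":
    print(pedigree(9))    # [0, 1, 2, 9]
    print(pedigree(13))   # [0, 3, 13]
```

For example, replica 9 is the second child of replica 2, which is the first child of replica 1, which is the first child of the initial replica.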
Thus, with the unique and unobvious features of the invention, the above-mentioned Petersen et al. approach of presuming that replicas are created as children of parent replicas, and deriving a child replica's identifier from the identifier of its parent replica, is performed while restricting replica identifiers to a fixed length and while providing a unique identifier for each replica made without central coordination.