This application is a continuation of U.S. patent application Ser. No. 11/232,438, entitled “Provisional Authority in a Distributed Database,” filed on Sep. 21, 2005 and issued as U.S. Pat. No. 8,250,030 B2 on Aug. 21, 2012, which is incorporated herein by reference in its entirety.
FIG. 1A is a diagram illustrating a centralized distributed database system 100. System 100 is shown to include master 102 and members 103, including member A 104, member B 106, and member C 108. Reads can be performed at any node. For example, each node maintains a read-only cache. Writes must be performed through master 102. Write requests are sent to master 102, and the database in master 102 is updated. The data is replicated to the members by propagating the changed data (e.g., changed columns and/or rows) to each of members 103. Each member receives the data and places it in its cache (or local version of the database). This approach can be bandwidth intensive when a large amount of data needs to be propagated. For example, if 5 million records with “infoblox.com” need to be changed to “infoblox.xyz.com,” those 5 million changed records would need to be shipped to each of members 103. A centralized database is therefore difficult to scale. All changed data is transmitted, which can consume substantial bandwidth and can overwhelm the system. All writes must go through the master, which increases latency, particularly in a highly distributed system. In addition, the master can become a bottleneck.
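By way of illustration only, the centralized arrangement of FIG. 1A can be sketched as follows. The class names (Master, Member), the rows_shipped counter, and the five-record data set are hypothetical and are not taken from any actual implementation. The sketch assumes an in-memory dictionary stands in for the database and that every changed row is shipped in full to every member.

```python
class Member:
    """A member node holding a read-only cache of the database."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # local version of the database

    def read(self, key):
        # Reads can be served locally at any node.
        return self.cache.get(key)

    def apply_changes(self, changed_rows):
        # Called only by the master during replication.
        self.cache.update(changed_rows)


class Master:
    """The single node through which all writes must pass."""

    def __init__(self, members):
        self.db = {}
        self.members = members
        self.rows_shipped = 0  # hypothetical counter for replication traffic

    def write(self, changed_rows):
        # Update the master database, then propagate every changed row
        # to every member.
        self.db.update(changed_rows)
        for member in self.members:
            member.apply_changes(changed_rows)
            self.rows_shipped += len(changed_rows)


members = [Member("A"), Member("B"), Member("C")]
master = Master(members)

# Initial load of five records (standing in for 5 million).
master.write({f"host{i}": f"host{i}.infoblox.com" for i in range(5)})

# Renaming the domain changes every record, so every record is shipped again.
renamed = {
    key: value.replace("infoblox.com", "infoblox.xyz.com")
    for key, value in master.db.items()
}
master.write(renamed)
```

In this sketch the replication traffic grows with the number of changed rows multiplied by the number of members, even though the change itself is a single rename, which is the bandwidth cost described above.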
FIG. 1B is a diagram illustrating a partitioned distributed database system 120. System 120 is partitioned into three portions (122, 130, 140), each with a local master and members. Each master has full write authority for its partition. However, writes to different partitions are generally not coordinated. For example, a host name that already exists in partition 122 may be added to partition 140, resulting in inconsistent data between the two partitions. Some form of coordination among the partitions is needed if such inconsistencies are to be avoided. If a single overall master is selected, that master could become a bottleneck since it would need to approve all transactions. It would be desirable to have a faster and more scalable distributed database.
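By way of illustration only, the uncoordinated writes of FIG. 1B can be sketched as follows. The PartitionMaster class, the host name, and the addresses are hypothetical. The sketch assumes each local master checks for duplicates only within its own partition.

```python
class PartitionMaster:
    """A local master with full write authority over one partition."""

    def __init__(self, name):
        self.name = name
        self.hosts = {}

    def add_host(self, host_name, address):
        # The uniqueness check sees only this partition's data.
        if host_name in self.hosts:
            return False
        self.hosts[host_name] = address
        return True


p122 = PartitionMaster("122")
p140 = PartitionMaster("140")

# Both writes succeed because neither master consults the other.
ok_122 = p122.add_host("www.example.com", "10.0.0.1")
ok_140 = p140.add_host("www.example.com", "10.0.9.9")

# Comparing the partitions afterwards reveals the inconsistency.
conflicts = {
    host
    for host in p122.hosts.keys() & p140.hosts.keys()
    if p122.hosts[host] != p140.hosts[host]
}
```

Each local check passes, yet the same host name ends up mapped to two different addresses. Routing both writes through one overall master would catch the duplicate, at the cost of the bottleneck noted above.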
In addition, a member can comprise a high availability (HA) pair, that is, an active node and a passive (or redundant) node, where the passive node serves as a backup to the active node in case of failure. Currently, data is not reliably consistent between the active and passive nodes. Thus, if one node fails, data can be lost. It would therefore also be desirable to have a more reliable distributed database.