The present invention relates generally to computing systems that maintain a number of replicated databases, and in particular to a method for comparing the databases quickly and efficiently.
Of the many approaches to fault tolerant computing available today, one seems likely to be around for some time. That approach is to provide a computing environment comprising multiple processor units so that if one processor unit fails, another is available to take over. One example of this approach can be found in U.S. Pat. No. 4,817,091, which teaches a multiple processor system in which the tasks of a processor unit that is detected as having failed are taken over by a backup processor unit (or processor units).
This multiple processor system, with the advent of a novel communication network (described in U.S. Pat. No. 5,574,849), has been extended to a multiple processing system in which groups of processor units are communicatively interconnected to form a "cluster." Each group (sometimes referred to as a "node") of processor units forms a distributed processing system that provides multiple processing power and some modicum of fault tolerance in that the load of a failed processor unit can be taken up by the other processor units of the group or node. The cluster arrangement, in turn, provides additional fault tolerance by providing backup nodes of processor units should one of the nodes fail.
In such a clustered environment, as well as in other environments, each node must be provided with information concerning the cluster (e.g., the location of processor units, peripheral units, etc.), its use, its users, and the like. This information is often kept in a database of one sort or another, and the amount of it can be quite large. This leads to problems when the databases of each node need to be checked, such as when a periodic check needs to be made to ensure the integrity of the database and the information it contains, or to ensure that changes to the database were made correctly. Such checks can be very time-consuming and tend to impose a significant burden on system resources, particularly if they are required frequently. If the checks require communication between two nodes across a communication path, the amount of communication can be significant and can create a bottleneck.
Thus, it can be seen that a way to check the integrity of databases in a quick, efficient, and trusted manner would benefit the overall performance of a multiple processor system using replicated databases of information. Resources needed elsewhere would then be tied up only for the short time during which the check is conducted.