The present invention relates in general to an optimistic, robust, distributed data base system, and in particular to a system for merging virtual partitions in a distributed data base system.
The purpose of a database system is to reliably store data and to process valid, authorized read and write requests on that data. A database system provides a uniform interface for accessing data by executing transactions. A transaction is a unit of work comprising a sequence of steps required to achieve some goal. Each transaction consists of an initiation phase; a read, compute, and write phase; and a commit phase. In the initiation phase, the transaction is authenticated and assigned some priority for being executed. In the read, compute, and write phase, the transaction carries out the requested action. The commit phase installs all of the results written by the transaction into the database, making them visible to other transactions.
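The three phases can be illustrated with a minimal sketch. It is not taken from any system described here: the `Database` class, its `run` method, and the `deposit` transaction are hypothetical names, the store is a single in-memory dictionary, and priority assignment is omitted.

```python
class Database:
    """Toy in-memory store (hypothetical); only committed values are visible."""

    def __init__(self):
        self.committed = {}

    def run(self, user, authorized_users, work):
        # Initiation phase: authenticate the request (priority is omitted).
        if user not in authorized_users:
            raise PermissionError(f"{user} is not authorized")

        # Read, compute, and write phase: writes go to a private buffer,
        # so other transactions cannot see them yet.
        pending = {}

        def read(key):
            return pending.get(key, self.committed.get(key))

        def write(key, value):
            pending[key] = value

        work(read, write)  # an exception here discards the pending writes

        # Commit phase: install all results written by the transaction.
        self.committed.update(pending)


def deposit(read, write):
    # Example transaction: add 10 to a balance that starts at 0.
    write("balance", (read("balance") or 0) + 10)


db = Database()
db.run("alice", {"alice"}, deposit)
db.run("alice", {"alice"}, deposit)
print(db.committed)  # {'balance': 20}
```

Buffering writes until the commit phase is one simple way to keep uncommitted results invisible; a real system would also have to make the commit step atomic and durable.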
The purpose of a "distributed" database system is to provide global database service to users at a collection of sites. A distributed database system is a coalition of sites that share access to a set of data. Each site maintains directory information about the data so that each site has knowledge of all data stored by the database system. Copies of the data itself are distributed among the sites. The sites are connected by a communication network consisting of some physical communication medium and a communication protocol for controlling communication over that medium. This basic communication protocol is part of the network services of an operating system.
A distributed database must, under normal operating conditions, maintain internal and mutual consistency of the data stored by the system. Internal consistency is maintained if all database updates performed by committed transactions are reflected in the database, none of the updates performed by uncommitted transactions are reflected in the database, and no transaction is allowed to read a data value written by another transaction that is not yet committed. A distributed database system maintains mutual consistency of the database if, for any given data item, all replicated copies of that data item are the same. Mutual consistency is defined to mean that "all copies converge to the same state and would be identical should update activity cease".
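The mutual consistency condition can be stated as a simple check over the replicated copies once update activity has ceased. The following is a sketch only; the function name `mutually_consistent` and the representation of each site as a dictionary of the items it stores are assumptions made for illustration.

```python
def mutually_consistent(replicas):
    """Return True if every copy of each data item holds the same value.

    replicas: mapping of site name -> {item: value} for the items that
    site stores. A site need not hold a copy of every item.
    """
    seen = {}
    for copies in replicas.values():
        for item, value in copies.items():
            if item in seen and seen[item] != value:
                return False  # two copies of the same item disagree
            seen.setdefault(item, value)
    return True


quiescent = {
    "site1": {"x": 1, "y": 2},
    "site2": {"x": 1},
    "site3": {"y": 2},
}
diverged = {
    "site1": {"x": 1},
    "site2": {"x": 7},
}
print(mutually_consistent(quiescent))  # True
print(mutually_consistent(diverged))   # False
```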
A "robust" distributed database system is one that processes transactions and preserves internal and mutual consistency even when system components have failed. There are a number of types of component failures. A "site crash" is a machine failure due to loss of power, operating system deadlock or panic, processor malfunction, or human intervention to restart a site. A "network partition" occurs when two or more groups of sites are running but are unable to communicate. "Media failure" occurs when a storage device fails while reading or writing data, rendering some portion of the stored data unreadable. "Software failure" occurs when the internal consistency or the mutual consistency of the database has been compromised due to an error in the implementation of the database system or due to the occurrence of failure types not managed by the protocol implemented by the software.
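A network partition can be pictured as the running sites falling into groups, each of which can communicate internally but not with the others. The sketch below computes those groups as the connected components of the surviving links. The function name `communicating_groups` and the representation of links as pairs of site names are assumptions for illustration, not part of any described protocol.

```python
def communicating_groups(sites, links):
    """Group running sites into sets whose members can still reach one another.

    sites: iterable of running site names.
    links: iterable of (a, b) pairs for working two-way connections.
    """
    neighbours = {s: set() for s in sites}
    for a, b in links:
        if a in neighbours and b in neighbours:  # ignore links to crashed sites
            neighbours[a].add(b)
            neighbours[b].add(a)

    groups, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:  # depth-first walk over working links
            site = stack.pop()
            if site not in group:
                group.add(site)
                stack.extend(neighbours[site] - group)
        seen |= group
        groups.append(group)
    return groups


# The link between B and C has failed, leaving two groups of sites.
groups = communicating_groups(["A", "B", "C", "D"], [("A", "B"), ("C", "D")])
print(len(groups))  # 2
```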
Preserving internal consistency in the event of failures requires some mechanism for undoing results written by uncommitted transactions and ensuring the results written by committed transactions are reflected in the database. Preserving mutual consistency in the event of failures in the distributed environment requires that all sites acquire knowledge of new data items and new values for existing data items in as timely a manner as possible.
A "robust" system should provide users with continuous service while maintaining internal consistency and mutual consistency of the database. The traditional approach to building a robust system has been to extend failure-free protocols for maintaining internal consistency and mutual consistency. However, degrees of robustness can be gained only at considerable cost in time and storage space. In addition, a system can be robust only in the event of failures that it can detect and that it was designed to anticipate.
A robust distributed data base system is described in the paper "Robustness in Distributed Hypothetical Databases" by D. Ecklund, published 1985 in Proceedings of The Nineteenth Hawaii International Conference on System Sciences, and incorporated herein by reference. The system described in this paper establishes "virtual partitions" after system component failure prevents communication between various sites. Each virtual partition is a collection of sites that have access to a copy of the data base and which can still communicate with each other. The sites in each virtual partition continue to have read and write access to their copy of the data base even though they cannot communicate their changes to other sites in the system. The described system can support robust optimistic access control because under normal processing conditions the system ignores the possibility of update conflicts until they actually occur. If a failure partitions the network, the system may unknowingly process conflicting updates in separate partitions. In some sense the system ignores these conflicts until the failure is repaired and the partitions are brought back together. When the partitions are merged, the conflicting updates surface but are managed by implicitly deriving alternate versions. The general philosophy is that the results of conflicting updates will be managed by allowing one result to prevail over another.
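The optimistic approach can be illustrated with a deliberately simplified sketch. It is not the algorithm of the cited paper: the function `merge_partitions`, the flat item-to-value dictionaries, and the rule that the first partition prevails are all assumptions chosen for illustration, and deletions are not modelled.

```python
def merge_partitions(base, part_a, part_b):
    """Merge two partitions' copies that diverged from a common base state.

    Returns (merged, alternates). Where both partitions changed an item
    differently, partition A's value prevails (an arbitrary rule assumed
    here) and B's value is retained as an alternate version.
    """
    merged, alternates = {}, {}
    for item in sorted(set(base) | set(part_a) | set(part_b)):
        original = base.get(item)
        a = part_a.get(item, original)
        b = part_b.get(item, original)
        if a == b or b == original:
            merged[item] = a          # no conflict: same value, or only A changed
        elif a == original:
            merged[item] = b          # only B changed
        else:
            merged[item] = a          # conflicting updates: A prevails
            alternates[item] = [b]    # B's result kept as an alternate version
    return merged, alternates


base = {"x": 1, "y": 1, "z": 1}
part_a = {"x": 2, "y": 1, "z": 5}   # updated x and z while partitioned
part_b = {"x": 1, "y": 3, "z": 6}   # updated y and z while partitioned
merged, alternates = merge_partitions(base, part_a, part_b)
print(merged)      # {'x': 2, 'y': 3, 'z': 5}
print(alternates)  # {'z': [6]}
```

Neither partition is blocked while the failure lasts; the conflict on `z` is noticed only at merge time, and the losing update survives as an alternate version instead of being discarded.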
However, the system described in the above-mentioned paper is unsuitable for merging partitions in a hierarchical data base system wherein data objects are grouped into "configurations" that themselves have version histories. While the system can resolve conflicting updates of low-level data objects, it is unable to resolve conflicts between updates of configurations so as to provide a consistent version history of each configuration.