Modern computer data storage applications rely on the storage of high volumes of data in a redundant, fault-tolerant manner. For example, archiving of document images requires such storage.
To this end, databases that are distributed and allow for redundancy are known. Typically, the database is hosted on multiple physical computers and is either replicated or mirrored. In this way, multiple instances of the data, or even of the entire database, may be maintained. In the event that one instance of the database fails or is lost, another instance may be accessed.
One known database architecture designed for the storage of large amounts of data integrates multiple autonomous database systems into a single database, referred to as a federated database. In this way, conventional smaller databases using readily available software and hardware may be arranged to co-operate and be combined to form a single, larger logical database. Federated databases are described, for example, in McLeod and Heimbigner (1985), “A Federated Architecture for Information Management”, ACM Transactions on Information Systems, Vol. 3, Issue 3: 253-278; Sheth and Larson (1990), “Federated Database Systems for Managing Distributed, Heterogeneous, and Autonomous Databases”, ACM Computing Surveys, Vol. 22, No. 3: 183-236; and Barclay, T., Gray, J., and Chong, W. (2004), “TerraServer Bricks - A High Availability Cluster Alternative”, Microsoft Research Technical Report MSR-TR-2004-107.
As data is replicated across multiple instances of the database, maintaining coherency between the instances and ensuring that only up-to-date data is used present challenges. These challenges become more pronounced in a federated database as the number of autonomous database systems increases.
Accordingly, there remains a need for methods and software for maintaining data consistency in a replicated database system formed from one or more federated databases.