Data mirroring is a technique wherein data is copied from a first location to one or more secondary locations contemporaneously with the storing of the data at the first location. The data copied from the first location to the one or more secondary locations is an exact copy of the data stored at the first location. Consequently, data mirroring is useful both for providing a backup of the mirrored data and for recovering data in a timely manner after a disaster. Data mirroring may be performed regardless of whether the location to which data is copied is geographically close to or distant from the location being mirrored.
FIG. 1 is a block diagram illustrating a system 100 employing a first approach to data mirroring, wherein data stored at site A is being mirrored to site B. File server 130 synchronously replicates data stored in database 140 to database 142. Each time file server 130 processes a transaction issued by database server 120 that makes a change to a data block in database 140, file server 130 transmits a message reflecting the change to file server 132. Upon receiving the message, file server 132 updates data stored in database 142 to reflect the change made to database 140. Database 142 may be updated using a variety of techniques, such as by performing the same transaction on database 142 as was performed on database 140 or by updating non-volatile memory at database 142 to reflect the current state of data stored at database 140.
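The message flow described above can be illustrated with a minimal sketch. The class FileServer, its methods, and the in-memory dictionary standing in for a database are hypothetical names introduced only for illustration; the sketch assumes the first update technique mentioned above (applying the same change at the secondary) and ignores networking and failures.

```python
class FileServer:
    """Hypothetical stand-in for a file server fronting one database."""

    def __init__(self, name, mirror=None):
        self.name = name
        self.blocks = {}      # block id -> contents; stands in for the database
        self.mirror = mirror  # secondary file server, or None if not mirrored

    def apply(self, block_id, data):
        # Update the local database to reflect a change.
        self.blocks[block_id] = data

    def write(self, block_id, data):
        # Apply the change locally, then synchronously forward a message
        # reflecting the change to the secondary before returning.
        self.apply(block_id, data)
        if self.mirror is not None:
            self.mirror.apply(block_id, data)


# Assumed wiring that mimics FIG. 1: site A mirrors to site B.
site_b = FileServer("file server 132")
site_a = FileServer("file server 130", mirror=site_b)
site_a.write(7, b"new contents")
```

Because the write does not return until the secondary has applied the change, both copies hold the same data for block 7 once the call completes.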
Clients, such as client 110 and client 112, may issue I/O requests to a database server to read or write data in a database. To ensure the consistency of databases 140 and 142, all clients in system 100 issue all I/O requests through database server 120 at site A, thus guaranteeing that all clients will have the same view of the data being mirrored, regardless of the site with which the client is associated.
The approach for data mirroring illustrated in FIG. 1 has several problems. First, clients not associated with site A, such as client 112, may encounter a performance penalty because those clients must transmit their I/O requests to a database server at a different site. Since all I/O requests from a client are routed through a single database server, which may be geographically distant from the requesting client, clients that are located remotely may encounter a significant transmission delay associated with each I/O request. Further, the single database server will act as a bottleneck for all I/O requests from clients in system 100.
Second, if site A becomes inoperable, e.g., file server 130 crashes or becomes unavailable, then database server 120 and all clients in system 100 connecting to database server 120 will encounter a temporary loss of service until a backup system, such as site B, that replaces the failed system of site A becomes operational.
Third, in the event that file server 130 cannot replicate a write operation to file server 132, perhaps due to the communications link between file server 130 and file server 132 becoming inoperable, then care must be taken in determining whether database 140 or database 142 should be used as a backup system to recover from the encountered problem, as databases 140 and 142 are no longer synchronized with each other since one or more write operations could not be replicated. A change made to a database will be lost if a database is chosen as a backup system and the chosen database does not reflect all write operations that have been performed on any database in the system.
FIG. 2 is a block diagram illustrating a second approach for data mirroring. As FIG. 2 depicts, each database stored at each site is partitioned into two or more partitions. For example, database 240 has partitions A and B′, and database 242 has partitions A′ and B. Data stored in partition A in database 240 is mirrored to partition A′ in database 242, and data stored in partition B in database 242 is mirrored to partition B′ in database 240. Database 240 is considered the primary site for partition A and database 242 is considered the primary site for partition B.
Requests from clients to write or read data may be performed locally (i.e., the client issuing the request and the database servicing the request are both in the same site) if and only if the request involves only data stored in a partition for which that site is the primary site. For example, if client 210 issues a write or read request to a data block in partition A, then the request may be performed locally at database 240. However, if client 210 issues a write or read request to a data block in partition B, then database server 220 would route that request to file server 232 so the request can be performed at database 242. Partitioning data in this manner helps reduce the performance delay of processing a transaction against data in partitions where the primary site is the local site, although this technique does not reduce the performance delay of processing a transaction against data in partitions where the primary site is a remote site.
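The routing rule described above can be sketched as a lookup from partition to primary database. The table PRIMARY_DATABASE and the function route are hypothetical names introduced for illustration; only the partition and database labels are taken from FIG. 2.

```python
# Assumed mapping from each partition to the database that is its primary site.
PRIMARY_DATABASE = {"A": "database 240", "B": "database 242"}


def route(local_database, partition):
    """Return (target database, whether the request is served locally)."""
    target = PRIMARY_DATABASE[partition]
    return target, target == local_database


# Client 210 is co-located with database 240 in this sketch.
local_request = route("database 240", "A")   # served at the local site
remote_request = route("database 240", "B")  # must be forwarded to the remote site
```

A request against partition B from the site of database 240 always resolves to database 242, which is why the partitioning does not reduce delay for partitions whose primary site is remote.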
However, this approach is problematic if data cannot be replicated between sites or if a particular site becomes inoperable. When data cannot be replicated from a partition on a first site (the primary site) to a corresponding partition on a second site (the secondary site), the database at the primary site is not notified that the replication was not successful. As a result, partitions storing replicated data at the secondary site may grow stale and outdated. Thereafter, if the primary site becomes inoperable, then a partition storing replicated data at the secondary site cannot be used to recover from the inoperability of the primary site because the data stored therein is outdated. Use of the outdated data would violate database consistency principles.
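The failure mode described above can be illustrated with a minimal sketch. The class Partition, the function write, and the link_up flag are hypothetical constructs introduced for illustration; the sketch assumes that a failed replication is simply dropped without the primary being notified.

```python
class Partition:
    """Hypothetical in-memory partition with a count of applied writes."""

    def __init__(self):
        self.blocks = {}
        self.applied_writes = 0


def write(primary, secondary, block_id, data, link_up):
    # The primary always applies the write.
    primary.blocks[block_id] = data
    primary.applied_writes += 1
    if link_up:
        # Replication succeeds only while the link is operable.
        secondary.blocks[block_id] = data
        secondary.applied_writes += 1
    # When the link is down, nothing is reported back to the primary,
    # so it continues to accept writes that the secondary never sees.


primary, secondary = Partition(), Partition()
write(primary, secondary, 1, "v1", link_up=True)
write(primary, secondary, 2, "v2", link_up=False)  # silently not replicated
is_stale = secondary.applied_writes < primary.applied_writes
```

After the second write, the secondary copy lacks block 2, so recovering from it after a failure of the primary would lose that change.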
Accordingly, there is an unaddressed need in the art to mirror data while avoiding the problems associated with the approaches described above.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.