In computer systems, it may be desirable to store multiple copies of data at different storage locations for security and/or availability purposes. This type of data storage technology is hereinafter generally referred to as “data replication,” and will be discussed in greater detail below. For example, in a data replication system including a source data storage location, such as a source disk, and one or more clone data storage locations, such as one or more clone disks, data stored on the source may be replicated to each of the clones in a variety of ways.
One conventional method for replicating data in a computer system includes a process referred to as “data mirroring.” Data mirroring includes copying data located on a source disk to one or more clone disks so that in the event of a failure of the source disk, a current version of the data may be accessed by reading the data from any one of the available clone disks.
Another conventional data replication method includes mirroring data between a source and a clone until a user-defined point in time, hereinafter referred to as “a fracture” or “fracturing a clone,” after which data is no longer mirrored to the fractured clone. Fracturing a clone (i.e., ceasing to mirror data from a source to a clone at a point in time) may be performed, for example, for backup purposes, where source data that has been replicated to a fractured clone is said to be “backed-up.” Backed-up data may be used, for example, to restore the data that existed on a source disk at the time of a fracture. For example, in a data replication system including a source disk and seven clone disks, a different clone may be fractured each day of the week so that, at any point in time, seven days' worth of data is backed-up. Therefore, in the event of failure of the source disk, an operator may restore to the source disk a snapshot-copy of the data as it existed on any day during the week preceding the backup restore.
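By way of illustration only, the seven-clone rotation described above may be sketched as follows. The sketch assumes that each disk can be modeled as an in-memory mapping from extent number to data; the names `ReplicaSet`, `write`, and `fracture` are hypothetical and do not appear in any particular product.

```python
class ReplicaSet:
    """Hypothetical model: one source and several clones, each a dict of extents."""

    def __init__(self, num_clones):
        self.source = {}
        self.clones = [dict() for _ in range(num_clones)]
        self.fractured = [False] * num_clones

    def write(self, extent, data):
        # Write to the source, then mirror to every clone not yet fractured.
        self.source[extent] = data
        for index, clone in enumerate(self.clones):
            if not self.fractured[index]:
                clone[extent] = data

    def fracture(self, index):
        # After a fracture, the clone no longer receives mirrored writes.
        self.fractured[index] = True


# Assumed scenario: one write per day to extent 0, one clone fractured per day.
replicas = ReplicaSet(num_clones=7)
for day in range(7):
    replicas.write(0, f"day-{day}")
    replicas.fracture(day)

# Each clone now preserves the data as it existed on its day of fracture.
restored = dict(replicas.clones[2])  # e.g., restore the state as of day 2
```

In this sketch, a clone that has been fractured is never re-synchronized; a rotation that continues beyond one week would need to resume mirroring to a clone before fracturing it again.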
It is appreciated, however, that in addition to ceasing the mirroring of data from a source to a clone at the time of a clone fracture, it may be desirable to record the locations of any changes made to the source disk after a fracture so that, in the event of a backup restore process, only the minimum amount of data necessary to achieve accurate restoration of the source data is restored. For example, in the system described above, the source and clone disks may each include multiple data extents or other logical units of data storage. Therefore, upon a fracture and a subsequent change to a small minority of extents on the source disk, a log may record which extents were changed after the fracture. This log may be implemented, for example, by storing a binary value for each extent in a bitmap. By maintaining a record of changes made to the source, when a backup restore process is initiated, the record may be examined to determine the extents changed subsequent to the fracture so that only those extents are copied back to the source. Alternatively, in an implementation that does not include a record of changes made to the source as described above, the entire contents of the clone disk are copied to the source during a backup restore process, including many extents that are identical to those already on the source. Because such a system introduces large and unnecessary inefficiencies during a backup restore process, many conventional data replication technologies employ some form of log as described above.
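A minimal sketch of such a change log is given below, assuming one bit per extent and disks modeled as in-memory lists indexed by extent number. The names `ChangeLog`, `write_after_fracture`, and `restore` are hypothetical and are used for illustration only.

```python
class ChangeLog:
    """Hypothetical bitmap: a set bit means the extent changed after the fracture."""

    def __init__(self, num_extents):
        self.num_extents = num_extents
        self.bits = bytearray((num_extents + 7) // 8)  # one bit per extent

    def mark(self, extent):
        self.bits[extent // 8] |= 1 << (extent % 8)

    def is_changed(self, extent):
        return bool(self.bits[extent // 8] & (1 << (extent % 8)))

    def changed_extents(self):
        return [e for e in range(self.num_extents) if self.is_changed(e)]


def write_after_fracture(source, log, extent, data):
    # Record the change in the log, then update the source extent.
    log.mark(extent)
    source[extent] = data


def restore(source, clone, log):
    # Copy back only the extents the log marks as changed; return the count.
    changed = log.changed_extents()
    for extent in changed:
        source[extent] = clone[extent]
    return len(changed)


# Assumed scenario: 16 extents, 2 of which change after the fracture.
source = [f"extent-{i}" for i in range(16)]
clone = list(source)  # contents of the clone at the time of the fracture
log = ChangeLog(num_extents=16)
write_after_fracture(source, log, 3, "new-3")
write_after_fracture(source, log, 9, "new-9")
copied = restore(source, clone, log)  # copies 2 extents rather than all 16
```

Without the log, the restore in this scenario would copy all 16 extents even though 14 of them are already identical on the source and the clone.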
For purposes of the remaining discussion, it is assumed that a data replication system including a source and multiple clones is capable of being fractured and that a log of the changed extents is maintained. Furthermore, it is assumed that a plurality of write requests may be received simultaneously, where a write request is a request to record data to the source, which is replicated to one or more clones. It is further appreciated that the word “simultaneous” in this context includes near simultaneity (i.e. a burst of requests within a short time) as all operations executing in a computing environment are inherently executed in a sequential order at the physical level. One conventional method for processing multiple write requests includes imposing a sequential order upon the write requests by, for example, placing the write requests in a queue and executing each write request in the queue sequentially. In such an implementation, the execution of each write request may include writing data to an extent located on the source, as indicated by the write request, and subsequently, writing the data to the corresponding extents on each of the clones in sequence (i.e., serially). Thus, in a data replication system including x data storage devices (a source and x−1 clones) and y write requests, x*y logical operations are required in order to record the data indicated by y write requests to x data storage devices.
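The serial approach described above may be sketched as follows, assuming each storage device is modeled as an in-memory mapping and each write request as an (extent, data) pair. The function name `execute_serially` and the values chosen for x and y are illustrative assumptions.

```python
from collections import deque


def execute_serially(requests, devices):
    """Execute queued write requests one at a time, one device at a time.

    devices[0] is treated as the source and the remaining entries as clones.
    Returns the number of individual write operations performed.
    """
    queue = deque(requests)
    operations = 0
    while queue:
        extent, data = queue.popleft()
        for device in devices:  # source first, then each clone in sequence
            device[extent] = data
            operations += 1
    return operations


x = 4  # one source and x - 1 = 3 clones
y = 5  # number of write requests
devices = [dict() for _ in range(x)]
requests = [(i, f"data-{i}") for i in range(y)]
operations = execute_serially(requests, devices)  # x * y = 20 operations
```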
Another conventional method for executing multiple write requests also includes queuing the write requests. However, data indicated by each write request in the queue may be written to the source and replicated to the clones in parallel, thereby reducing the number of logical operations necessary to record the data indicated by the write requests to multiple data storage devices. Continuing the example discussed previously, wherein y write requests are received and directed toward data stored on x data storage devices, by writing to the source and its clones in parallel, only y operations are required in order to record to x data storage devices because data may be recorded simultaneously to x data storage devices for each write request.
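The parallel approach may be sketched in a similar manner, here using a thread pool to issue the writes for a single request to all devices at once. This is an illustrative sketch only: the names `write_extent` and `execute_in_parallel` are hypothetical, the devices are in-memory mappings, and threads in Python merely stand in for hardware that is able to write to several disks concurrently.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def write_extent(device, extent, data):
    device[extent] = data


def execute_in_parallel(requests, devices):
    """Execute queued write requests in order, writing to all devices at once.

    Returns the number of parallel write steps, one per request.
    """
    queue = deque(requests)
    steps = 0
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        while queue:
            extent, data = queue.popleft()
            futures = [pool.submit(write_extent, device, extent, data)
                       for device in devices]
            for future in futures:
                future.result()  # wait until every device has been written
            steps += 1
    return steps


x = 4  # one source and x - 1 = 3 clones
y = 5  # number of write requests
devices = [dict() for _ in range(x)]
requests = [(i, f"data-{i}") for i in range(y)]
steps = execute_in_parallel(requests, devices)  # y = 5 steps
```

The requests are still taken from the queue in order; only the writes belonging to a single request are issued concurrently, which is why the count falls from x*y to y.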
One problem associated with conventional data replication systems is that performance in executing multiple write requests to an array of multiple data storage devices degrades as the number of simultaneously received write requests increases. In other words, the performance of conventional data replication systems does not scale with the number of write requests received, and such systems can therefore become a performance bottleneck in environments where a large number of write requests are received.
Another problem associated with conventional data replication systems is that recording tracking information regarding the extents to be changed by a given set of write requests may not be initiated until the data associated with the previous set of write requests has been written to the source and its clones. In other words, there is a synchronous relationship between the writing of data to the source and its clones for a first set of write requests and the logging of tracking information associated with a second set of write requests. This relationship creates a performance bottleneck in the processing and saving of tracking information for multiple write requests to multiple data storage devices.
Accordingly, a need exists for methods, systems, and computer program products for improving the performance of data replication systems that include multiple data storage devices and that receive multiple write requests.