1. Field of the Invention
This invention relates to computer storage systems and, more particularly, to data mirroring and striping.
2. Description of the Related Art
In most cases, computer systems require data storage in one form or another. One type of computer system is a stand-alone system such as, for example, a single workstation running applications and storing data to files on a single disk drive or multiple disk drives that are directly connected to it. In such an environment, the workstation may use a local file system.
Frequently, however, computer systems are deployed in a networked environment. In the networked environment, one or more client computer systems running user applications may be connected to one or more file servers which provide networked access to files used by the applications. Such a networked environment is referred to as a distributed file system.
An important feature of distributed file systems is high reliability of the file system. More particularly, it is important that the file system be as immune as possible to any system failures (crashes, power failures, etc.). If a system failure occurs, a less reliable file system may experience file corruption (e.g., if the failure occurred while the file was in the process of being updated). Repairing file corruption may be a complex and time-consuming process, and may result in the loss of data. The lost data may result in a subsequent loss of productivity for the user who updated the file, and the loss may even be permanent if the data cannot be easily recreated by the user.
In addition to file system reliability, the access speed of the network data storage system is also important. Data mirroring is a well-known method for obtaining storage system reliability, and data striping is a well-known method for increasing system performance. Both of these methods are described in the literature pertaining to redundant arrays of inexpensive disks (RAID) architectures. Although mirroring does provide high reliability and striping does provide high performance, there are data coherency issues that must be addressed.
A problem may arise when two or more client computers are accessing the same mirrored device. It is possible that the writes to the mirror get processed out of order, thereby possibly causing inconsistencies between the stored data and the mirrored data. Likewise, when striping data across a disk array, it is possible for data writes from different clients to become interleaved, thereby possibly causing inconsistent data.
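The ordering problem described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each copy of the data is a dictionary keyed by block number and that two clients write to the same block; the names `apply_writes`, `write_a`, and `write_b` are hypothetical and do not come from the disclosure.

```python
# Hypothetical illustration: each storage device is modelled as a plain
# dictionary that maps a block number to the bytes last written there.

def apply_writes(writes):
    """Apply (block, data) writes in arrival order; return final contents."""
    device = {}
    for block, data in writes:
        device[block] = data  # a later write overwrites an earlier one
    return device

# Two clients update the same block at about the same time.
write_a = (0, b"client A")
write_b = (0, b"client B")

# The two copies happen to receive the writes in different orders.
primary = apply_writes([write_a, write_b])
mirror = apply_writes([write_b, write_a])

# The copies now disagree about the contents of block 0.
mirrors_agree = primary == mirror
```

In this sketch each copy ends up holding a different client's data for block 0, which is the kind of inconsistency a coherency mechanism is intended to prevent.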
One solution to the above problems is to use a technique known as locking. Locking generally refers to allowing access to data by only one client at a time. In many applications, locking works. However, locking may complicate data recovery. Locking may also add system access time overhead due to extra messages being sent across the network. Therefore, a data coherency solution other than locking is desirable.
Various embodiments of a data storage system for synchronizing mirrored and striped data writes are disclosed. In one embodiment, the data storage system includes a client computer system coupled to a first data storage device and a second data storage device and configured to transmit a first data write request. The first storage device is configured to transmit a sequence number to the client computer system in response to receiving the first data write request. The client computer system is further configured to transmit a second data write request including the sequence number to the second storage device.
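The exchange in this embodiment can be sketched as follows. This is only an illustrative model under stated assumptions: the class and function names (`FirstDevice`, `SecondDevice`, `mirrored_write`) are hypothetical, sequence numbers are assumed to start at zero and increase by one per request, and the first device is assumed to store the data it receives.

```python
from dataclasses import dataclass, field

@dataclass
class FirstDevice:
    """Hypothetical first storage device that hands out sequence numbers."""
    next_seq: int = 0                       # assumption: numbering starts at 0
    blocks: dict = field(default_factory=dict)

    def write(self, offset, data):
        seq = self.next_seq                 # number assigned to this request
        self.next_seq += 1
        self.blocks[offset] = data          # assumption: data is stored here too
        return seq                          # returned to the client

@dataclass
class SecondDevice:
    """Hypothetical second storage device; here it only logs what arrives."""
    received: list = field(default_factory=list)

    def write(self, offset, data, seq):
        self.received.append((seq, offset, data))

def mirrored_write(first, second, offset, data):
    """Client side: write to the first device, then forward the sequence number."""
    seq = first.write(offset, data)         # first data write request
    second.write(offset, data, seq)         # second request carries the number
    return seq

first, second = FirstDevice(), SecondDevice()
seq_one = mirrored_write(first, second, 0, b"alpha")
seq_two = mirrored_write(first, second, 8, b"beta")
```

Because every second write request carries the number issued by the first device, the second device has enough information to tell in which order the first device saw the requests.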
In one particular implementation, the second data storage device may include a counter and is configured to compare a current counter value from the counter to the sequence number. If the current counter value is equal to the sequence number, the second storage device stores the data bytes corresponding to the second data write request and increments the counter. If the received sequence number is smaller than the current counter value, the data write request is out of sequence and the second storage device may discard it. If the received sequence number is larger than the current counter value, the data write request is considered out of sequence and premature. In that case, the second storage device may store the data bytes, and store the data byte range and sequence number of the premature data write request in a record.
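The three-way comparison described above might be sketched as follows. The class name `SequencedStore`, the outcome strings, and the tuple layout of the record are assumptions made for illustration; how the record of premature writes is used afterwards is not described in this section, so the sketch only collects it.

```python
class SequencedStore:
    """Hypothetical model of the second storage device with a counter."""

    def __init__(self):
        self.counter = 0        # assumption: counter starts at 0
        self.blocks = {}        # offset -> stored data bytes
        self.premature = []     # record entries: (offset, length, sequence number)

    def write(self, offset, data, seq):
        if seq == self.counter:
            # In sequence: store the data bytes and increment the counter.
            self.blocks[offset] = data
            self.counter += 1
            return "stored"
        if seq < self.counter:
            # Out of sequence (stale): the request may be discarded.
            return "discarded"
        # Out of sequence and premature: store the data bytes and record
        # the byte range and sequence number of the request.
        self.blocks[offset] = data
        self.premature.append((offset, len(data), seq))
        return "stored_premature"
```

For example, on a fresh `SequencedStore`, a write with sequence number 0 is stored and advances the counter to 1, a second write that also carries sequence number 0 is then discarded, and a write carrying sequence number 3 is stored and recorded as premature without advancing the counter.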
In other implementations, the second storage device may be configured to transmit the current counter value to the client computer system in response to storing the data bytes corresponding to the second data write request.