1. Field of the Invention
This invention relates generally to storage management, and more particularly to detecting and repairing inconsistencies among mirrored data images in distributed shared storage environments.
2. Description of the Related Art
Modern distributed shared storage environments may include multiple storage objects connected via an interconnection network. The interconnection network provides the infrastructure to connect the various elements of a distributed shared storage environment. Within the storage environment, file system abstractions may be built on top of logical volumes that may be distributed across multiple storage devices. As the number of logical volumes and file system abstractions grows, the complexity of the entire storage environment grows dramatically.
To limit bottlenecks and the resulting restriction on data throughput, distributed shared storage environments may separate the actual storage of data from the management of that data. Storage architectures that employ this technique may be referred to as out-of-band or asymmetric systems. A metadata server (MDS) generally supplies data management and control functions including, among others, file system mapping, mirror synchronization, client authentication, and access privileges. An MDS can provide this metadata to other devices, processes, and applications. The data itself can be stored on various storage devices attached to the network, but not necessarily connected directly to the MDS. Such storage devices provide data storage functions without having to handle metadata and file system management.
Applications, or clients, initially contact the MDS to request access to a specific file or dataset. The MDS may, after authenticating the client node and applying whatever data access policies are relevant, provide the requesting client node with information (metadata) identifying which storage device contains that particular dataset, together with an access token to present to the storage device. Client nodes may then communicate directly with the storage device, presenting access tokens when reading or writing data. The access token, or capability, generally describes the access rights of the client, and may, through the use of digital signatures, provide proof that the access token was generated by the MDS and has not been modified.
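The token exchange described above can be illustrated with a minimal sketch. It makes several assumptions that are not part of the description: it substitutes a keyed message authentication code (HMAC-SHA256) for a true digital signature, it assumes the MDS and the storage device share a secret key, and the names `SHARED_KEY`, `issue_token`, and `verify_token`, along with the token layout, are hypothetical.

```python
import hashlib
import hmac
import json

# Assumption: a secret known to both the MDS and the storage device.
# A real deployment might use asymmetric signatures instead of a shared key.
SHARED_KEY = b"example-key-known-to-mds-and-storage-device"


def issue_token(client_id, object_id, rights, key=SHARED_KEY):
    """MDS side: describe the client's access rights and attach a keyed MAC."""
    claims = {"client": client_id, "object": object_id, "rights": list(rights)}
    # Serialize deterministically so both sides compute the MAC over the same bytes.
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    mac = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"claims": claims, "mac": mac}


def verify_token(token, requested_right, key=SHARED_KEY):
    """Storage device side: recompute the MAC, then check the requested right."""
    payload = json.dumps(token["claims"], sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison; a mismatch means the token was modified
    # or was not generated with this key.
    if not hmac.compare_digest(expected, token["mac"]):
        return False
    return requested_right in token["claims"]["rights"]
```

In this sketch a client would obtain a token with `issue_token("client-1", "obj-42", ["read"])` and present it with each request; the storage device accepts a read, rejects a write, and rejects any token whose claims were altered after issuance.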
Separating data from its associated metadata allows the actual data traffic to be routed directly to storage devices and therefore may prevent the MDS from becoming a bottleneck that limits total data throughput. This architecture may also allow the MDS to be optimized for metadata lookups that usually involve smaller reads and writes, while allowing the storage devices themselves to be optimized for larger transfers of data.
One proposed type of storage device for use in shared storage environments is the object-based storage device (OBSD). OBSDs may provide clients with access to objects, frequently called user objects, comprising a logical collection of bytes on the storage device. User objects are of variable size and provide a storage abstraction that can represent application specific structures such as files, database tables, images or other media.
Systems frequently mirror file images to ensure data integrity and consistency. Other uses for data mirroring may include backing up data, distributed load sharing, disaster recovery, minimizing the damage from Trojan horses and viruses, or point-in-time analysis and reporting. A traditional mirror synchronization strategy may involve a single host device storing copies of data until all mirrors have confirmed that the data has been committed. Another traditional strategy may involve maintaining a bitmap including a logical representation of every data block in a mirrored device, and tagging the logical representation as “dirty” for each block that is written. To check the mirrors for consistency, the bitmaps from the different mirrored devices are compared to determine whether any discrepancies are present. Yet another possible strategy may involve the individual mirrored devices communicating with each other to compare and copy data as needed to ensure data consistency.
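The bitmap strategy mentioned above can be sketched as follows. This is only an illustration under stated assumptions: one bit per data block, one bitmap per mirrored device, and hypothetical names (`DirtyBitmap`, `mark_dirty`, `differing_blocks`) that do not come from any particular system.

```python
class DirtyBitmap:
    """One bit per data block; a set bit means the block has been written."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)  # round up to whole bytes

    def _check(self, block):
        if not 0 <= block < self.num_blocks:
            raise IndexError("block %d out of range" % block)

    def mark_dirty(self, block):
        """Tag the logical representation of a block as dirty after a write."""
        self._check(block)
        self.bits[block // 8] |= 1 << (block % 8)

    def is_dirty(self, block):
        self._check(block)
        return bool(self.bits[block // 8] & (1 << (block % 8)))


def differing_blocks(a, b):
    """Return block numbers whose dirty state differs between two mirrors."""
    if a.num_blocks != b.num_blocks:
        raise ValueError("mirrors must track the same number of blocks")
    return [i for i in range(a.num_blocks) if a.is_dirty(i) != b.is_dirty(i)]
```

For example, if one mirror records writes to blocks 3 and 9 while the other records only block 3, `differing_blocks` reports block 9 as the discrepancy to examine.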