1. Field of the Invention
The present invention generally relates to data storage systems and methods, and, more particularly, to a methodology for providing fault tolerance in a distributed object-based data storage network using a hierarchy of network entities for fail-over decision-making and execution.
2. Description of Related Art
With increasing reliance on electronic means of data communication, different models to efficiently and economically store a large amount of data have been proposed. A data storage mechanism requires not only a sufficient amount of physical disk space to store data, but also various levels of fault tolerance or redundancy (depending on how critical the data is) to preserve data integrity in the event of one or more disk failures. The availability of fault tolerance is almost mandatory in modern high-end data storage systems. One group of schemes for fault-tolerant data storage includes the well-known RAID (Redundant Array of Independent Disks) levels or configurations. A number of RAID levels (e.g., RAID-1, RAID-3, RAID-4, RAID-5, etc.) are designed to provide fault tolerance and redundancy for different data storage applications, whereas RAID-0 stripes data across disks without redundancy. A data file in a RAID environment may be stored in any one of the RAID configurations depending on how critical the content of the data file is vis-à-vis how much physical disk space is affordable to provide redundancy or backup in the event of a disk failure.
Another method of fault tolerance in existing distributed data storage systems is the use of partitioning to divide the storage network into a number of “fault domains.” A fault domain is a set of network entities (e.g., storage servers, storage disks, client machines, etc.) that can be affected by a failure or involved in a recovery from the failure. In a partitioned network, a set of storage disks residing in a partition is assigned to a particular server, and only that server is involved in later fault recovery when one of its assigned disks fails. Because of the rigid partitioning, a server “owns” certain devices (e.g., storage disks) and makes the sole decision as to how to handle failures of the devices in its partition. Even when the server has a backup for failure handling, the server and its backup still make all the decisions as to how to handle fail-over between that server and other servers in other partitions in the network.
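By way of illustration only, the rigid partitioning described above may be sketched as follows. The sketch is not drawn from any particular existing system; the names Partition, build_owner_map, and recovery_participants, as well as the server and disk identifiers, are hypothetical and are introduced solely to show that each disk maps to exactly one owning server and that only that server takes part in recovery.

```python
# Hypothetical illustration of rigid partitioning into fault domains.
# All class, function, server, and disk names are assumptions.
from dataclasses import dataclass, field


@dataclass
class Partition:
    server: str                                   # the server that "owns" this partition
    disks: set = field(default_factory=set)       # disks assigned to that server


def build_owner_map(partitions):
    """Map each disk to the single server that owns it."""
    owner = {}
    for partition in partitions:
        for disk in partition.disks:
            if disk in owner:
                # Rigid partitioning: a disk belongs to exactly one partition.
                raise ValueError(f"{disk} is assigned to more than one partition")
            owner[disk] = partition.server
    return owner


def recovery_participants(owner, failed_disk):
    """Only the owning server is involved when one of its disks fails."""
    return {owner[failed_disk]}


partitions = [
    Partition("server-A", {"disk-1", "disk-2"}),
    Partition("server-B", {"disk-3", "disk-4"}),
]
owner = build_owner_map(partitions)
print(recovery_participants(owner, "disk-3"))  # prints {'server-B'}
```

In this sketch, server-A is never consulted when disk-3 fails, which reflects the limitation noted above: the fault domain is fixed by the partition assignment rather than by the current state of the network.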
In a distributed data storage system, fault domains may be very fluid because of the distributed nature of the storage system. If load balancing is employed across all system components in such a storage system, then a fault domain may end up including the entire system. However, due to network latency and communication overhead, it may not be feasible to have every network entity or system component form a single cluster. It is therefore desirable to provide fault tolerance at all levels in such a distributed data storage system without rigidly partitioning the system into fault domains. It is further desirable to handle both single-unit failures and network partitions in a unified way so as to maintain an integrated storage environment, without creating a single cluster out of the storage domain.