This invention relates generally to cluster data systems, and more particularly to new and improved cluster architectures, systems and methods for data storage and data protection which address problems associated with known traditional cluster architectures, systems and methods.
Today, data protection systems are moving from a single-controller model to a scale-out cluster system architecture because single-controller architectures cannot support the larger storage capacities now required. Traditional cluster systems are usually one-level flat systems in which all cluster nodes are at the same logical level, and the cluster membership is just a single list of all nodes in the cluster. The nodes coordinate and cooperate as a logical unit to provide cluster capabilities and features (such as availability, scalability, fault tolerance, redundancy, consistency, etc.) for applications and services. The common way for traditional cluster systems to scale is to add more nodes to the cluster. However, most traditional cluster systems support only a limited number of members (hosts/nodes) due to their architectural design and other fundamental limitations, so that expanding the number of nodes in such systems is challenging. One such challenge is the overhead required to support large numbers of nodes, including the overhead of exchanging heartbeats between nodes, ordering messages and maintaining a consistent state across nodes. The more nodes there are, the more overhead is required to maintain node membership, particularly in virtual machine clusters.
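By way of illustration only, the growth in membership overhead can be sketched under the assumption that a flat cluster uses an all-to-all heartbeat scheme, in which every node sends a heartbeat to every other node in each interval. That scheme, and the function name `heartbeat_messages_per_interval`, are assumptions made for this sketch and are not taken from any particular cluster system.

```python
def heartbeat_messages_per_interval(node_count: int) -> int:
    """Count heartbeat messages sent in one interval.

    Assumes (hypothetically) an all-to-all scheme: each of the
    node_count nodes sends one heartbeat to each of the other nodes.
    """
    if node_count < 0:
        raise ValueError("node_count must be non-negative")
    # Each node sends to (node_count - 1) peers.
    return node_count * max(node_count - 1, 0)


if __name__ == "__main__":
    # Message count grows roughly with the square of the cluster size.
    for n in (4, 16, 64, 256):
        print(n, heartbeat_messages_per_interval(n))
```

Under this assumption, doubling the number of nodes roughly quadruples the heartbeat traffic, which is one way the membership overhead described above can come to dominate in a large flat cluster.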
Another issue concerns network partitions. A network partition occurs when a cluster is divided into two or more partitions that cannot communicate with each other because of network problems. As a result, a portion of the cluster's processing and services becomes unavailable. Traditional cluster systems do not handle network partitions well. The larger a cluster becomes, the higher the likelihood of a network partition occurring.
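As a minimal sketch of the definition above, a network partition can be modeled as the set of connected components of the graph whose vertices are the cluster nodes and whose edges are the currently working network links. The function name `find_partitions` and the node names in the usage note are hypothetical and serve only to illustrate the concept.

```python
from collections import deque


def find_partitions(nodes, links):
    """Group nodes into partitions that can still communicate.

    nodes: iterable of node names.
    links: iterable of (a, b) pairs, each a working bidirectional link.
    Returns a list of sets, one set per partition.
    """
    nodes = list(nodes)
    adjacency = {node: set() for node in nodes}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    partitions = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from start.
        group = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        partitions.append(group)
    return partitions
```

For example, with nodes `n1` through `n4` and working links only between `n1` and `n2` and between `n3` and `n4`, the function returns two partitions, and any service that requires nodes from both groups is unavailable until connectivity is restored.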
Additionally, data protection systems typically have to work with multiple different types of data having storage requirements based upon the data's required availability. For instance, so-called “hot data” is data, such as newly backed up data, that is very likely to be accessed soon and that requires high sequential throughput (I/O) and rapid random access. This type of data may be referred to as “active tier” data. So-called “cold data” is data that must be retained for a long period of time and is infrequently accessed. It is referred to as “archive tier” data. Very cold data that is retained substantially permanently may be kept in cloud storage and is referred to as “cloud tier” data. Recently, another data tier has emerged for caches and for data requiring fast access under a random I/O workload. It is referred to as “SSD tier” data because it is stored in fast solid-state memory. Because of their different availability requirements, the different types of data require different types of cluster nodes having different types of hardware and software. This necessitates heterogeneous nodes and a heterogeneous cluster architecture. Today's cluster systems are not optimized to support either heterogeneous systems or multiple data tiers for different data types, nor are known systems and architectures optimized to handle node failures. Moreover, managing such a heterogeneous architecture to ensure the required data availability, accessibility and protection poses additional challenges which known cluster systems are unable to meet.
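The four tiers described above can be sketched as a simple classification, purely for illustration. The names `Tier`, `DataProfile` and `choose_tier`, the boolean attributes, and the order in which the rules are checked are all assumptions of this sketch; the text above does not prescribe how data is assigned to a tier.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    ACTIVE = "active tier"    # hot data, e.g. newly backed up data
    ARCHIVE = "archive tier"  # cold data, long retention, rarely accessed
    CLOUD = "cloud tier"      # very cold data, retained substantially permanently
    SSD = "SSD tier"          # caches and fast random I/O workloads


@dataclass(frozen=True)
class DataProfile:
    """Hypothetical description of a data set's access requirements."""
    is_cache_or_random_io: bool = False
    retained_permanently: bool = False
    infrequently_accessed: bool = False


def choose_tier(profile: DataProfile) -> Tier:
    """Map a data profile to a tier (rule order is an assumption)."""
    if profile.is_cache_or_random_io:
        return Tier.SSD
    if profile.retained_permanently:
        return Tier.CLOUD
    if profile.infrequently_accessed:
        return Tier.ARCHIVE
    # Default: data likely to be accessed soon is treated as hot data.
    return Tier.ACTIVE
```

For example, `choose_tier(DataProfile(infrequently_accessed=True))` yields `Tier.ARCHIVE`. Because each tier has different availability requirements, each value of `Tier` would in practice correspond to a different type of node, which is the source of the heterogeneity discussed above.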
It is desirable to provide new and improved heterogeneous data protection architectures, systems and methods for supporting multiple tiers of data types having different availability, retention and protection requirements that address the foregoing and other problems with known storage architectures, systems and methods, and it is to these ends that the invention is directed.