Embodiments of the disclosed subject matter generally relate to the field of storage networks and, more particularly, to storage controller replacement within cross-cluster redundancy configurations.
Computer clusters implement a form of distributed computing. A computer cluster consists of a set of nodes that are configured and communicatively coupled in a cooperative manner to share resources and in some respects operate as a single system. The components of a cluster typically include multiple server nodes and one or more cluster management nodes interconnected by a local area network (LAN), with each node running its own instance of a common operating system. Clusters are usually deployed to improve performance and availability over that of centralized computing, while typically being more cost-effective than single computers of comparable speed or availability.
A storage cluster is a type of networked computer cluster generally characterized as including multiple interconnected storage nodes. Each storage node consists of a controller coupled to a mass storage unit such as an array of storage disks or solid state drives (SSDs) on which data, sometimes referred to as “backend data,” is stored. The storage node controller performs server-like functions for optimizing access to and usage of storage resources including the stored data. The mass storage unit may be a Redundant Array of Independent Disks (RAID) that provides long-term, non-volatile data storage.
Ensuring continuous, uninterrupted access to backend data is a vital function of most storage clusters. So-called High Availability (HA) storage is often used to maintain access to backend data when a given storage node's operation is interrupted. The interruption may be due to a hardware or software failure, or due to maintenance (e.g., replacement) of a storage node. An HA configuration may define a cluster (an HA cluster) or may be a cluster configuration feature such as one or more HA pairs within an otherwise defined cluster. In either case, the basic HA storage configuration consists of at least two somewhat independent storage nodes that perform mutual backup roles under the management of system control code and related configuration settings. Put simply, when one of the nodes fails, the other immediately assumes control of the failed HA partner node's operation and storage.
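By way of illustration only, the takeover behavior of an HA pair described above may be sketched as follows. This is a minimal model under stated assumptions, not an implementation of any particular system control code: the names `StorageNode`, `HAPair`, `handle_failure`, and the storage labels `array-1` and `array-2` are hypothetical, and failure detection is assumed to happen elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """Hypothetical storage node: a controller plus the storage units it serves."""
    name: str
    healthy: bool = True
    owned_storage: set = field(default_factory=set)


@dataclass
class HAPair:
    """Two nodes that perform mutual backup roles (illustrative model only)."""
    node_a: StorageNode
    node_b: StorageNode

    def partner_of(self, node: StorageNode) -> StorageNode:
        # Each node's partner is simply the other member of the pair.
        return self.node_b if node is self.node_a else self.node_a

    def handle_failure(self, failed: StorageNode) -> StorageNode:
        """Mark a node as failed and let its healthy partner take over its storage."""
        survivor = self.partner_of(failed)
        failed.healthy = False
        if not survivor.healthy:
            raise RuntimeError("both nodes of the HA pair are down")
        # The survivor now serves its own storage and its partner's storage.
        survivor.owned_storage |= failed.owned_storage
        failed.owned_storage = set()
        return survivor


pair = HAPair(
    StorageNode("node-a", owned_storage={"array-1"}),
    StorageNode("node-b", owned_storage={"array-2"}),
)
survivor = pair.handle_failure(pair.node_a)
print(survivor.name, sorted(survivor.owned_storage))
```

In this sketch the same `handle_failure` path would serve both an unplanned failure and a planned interruption such as replacement of a storage node, since in either case the partner assumes the interrupted node's storage.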
The increasing scale of distributed data storage has raised the need to extend protection of stored data, and of uninterrupted access thereto, beyond intra-cluster backup redundancy. This need is being addressed by the growing prevalence of data redundancy across clusters. Storage redundancy across clusters, such as clusters located in data centers that may be physically separated by tens or even hundreds of kilometers, uses data replication, such as data mirroring. In this manner, the data and uninterrupted access thereto are protected against site-wide failures that may result, for example, from loss of power.
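As a rough illustration of how mirroring across clusters protects against a site-wide failure, the following sketch models two clusters as in-memory block stores. It assumes synchronous mirroring, in which a write completes only after both sites hold the data; the disclosure above does not specify the replication mode, and the names `Cluster`, `MirroredVolume`, `site-1`, and `site-2` are hypothetical.

```python
class Cluster:
    """Hypothetical storage cluster modeled as an in-memory block store."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.blocks = {}


class MirroredVolume:
    """Volume whose blocks are mirrored across two clusters (illustrative only)."""

    def __init__(self, primary, secondary):
        self.sites = (primary, secondary)

    def write(self, block_id, data):
        # Assumption: synchronous mirroring, so both sites must be reachable
        # before any copy is modified.
        if not all(site.online for site in self.sites):
            raise IOError("mirror is degraded; write not accepted in this sketch")
        for site in self.sites:
            site.blocks[block_id] = data

    def read(self, block_id):
        # Serve the read from the first site that is still online.
        for site in self.sites:
            if site.online:
                return site.blocks[block_id]
        raise IOError("no site available")


site_1 = Cluster("site-1")
site_2 = Cluster("site-2")
volume = MirroredVolume(site_1, site_2)
volume.write(7, b"backend data")

site_1.online = False          # simulate a site-wide failure, e.g. loss of power
recovered = volume.read(7)     # the mirror copy at site-2 is still readable
print(recovered)
```

Refusing writes while one site is down is a simplification chosen to keep the sketch short; a practical system would more likely continue in a degraded mode and resynchronize the mirror later.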