1. Technical Field
The present disclosure relates to storage systems and, more specifically, to high performance and availability of data in a cluster of storage systems.
2. Background Information
A storage system typically includes one or more storage devices, such as disks, into which information (i.e., data) may be entered, and from which data may be obtained, as desired. The storage system (i.e., node) may logically organize the data stored on the devices as storage containers, such as files, logical units (luns), and/or aggregates having one or more volumes that hold files and/or luns. To improve the performance and availability of the data contained in the storage containers, a plurality of nodes may be interconnected as a cluster configured to provide storage service relating to the organization of the storage containers. The cluster has the property that when one node fails, another node may service data access requests, i.e., operations, directed to the failed node's storage containers.
In such a cluster, two nodes may be interconnected as a high availability (HA) pair configured to operate as “shared nothing” until one of the nodes fails. That is, each node (i.e., the owner node) services the operations directed to its own storage containers, and services the operations directed to the storage containers of another node (i.e., the local node) only after a failure of that node. The failure triggers a takeover (TO) sequence on the surviving node (i.e., the partner node). Data availability is typically guaranteed by mirroring, to the HA partner node, the operations that the local node has serviced and logged but not yet committed (i.e., persistently stored) to its disks. Such mirroring typically occurs over a high-speed connection between non-volatile random access memory (NVRAM) hardware on both nodes.
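By way of illustration only, the mirroring arrangement described above may be sketched as follows. The sketch is a simplified model, not an actual implementation: the `Node` class, its fields, and the `make_ha_pair` helper are hypothetical names introduced here, Python lists stand in for NVRAM logs, and a dictionary stands in for the disks.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model of one node in an HA pair."""
    name: str
    local_log: list = field(default_factory=list)   # serviced and logged, not yet committed
    mirror_log: list = field(default_factory=list)  # copy of the partner's uncommitted operations
    disk: dict = field(default_factory=dict)        # stands in for persistent storage
    partner: "Node | None" = None

    def service(self, key, value):
        """Log a write locally and mirror it to the HA partner before acknowledging."""
        op = (key, value)
        self.local_log.append(op)
        if self.partner is not None:
            self.partner.mirror_log.append(op)
        return "ack"

    def commit(self):
        """Persist the logged operations, then discard both copies of the log."""
        for key, value in self.local_log:
            self.disk[key] = value
        self.local_log.clear()
        if self.partner is not None:
            self.partner.mirror_log.clear()


def make_ha_pair(name_a, name_b):
    """Create two nodes that mirror their uncommitted operations to each other."""
    a, b = Node(name_a), Node(name_b)
    a.partner, b.partner = b, a
    return a, b
```

In this model, an operation serviced by one node is present in the partner's `mirror_log` until it is committed, so the partner holds everything needed to complete the operation if the servicing node fails first.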
High performance is also typically guaranteed in such a cluster by providing an alternate channel in which the partner node acts as a proxy, redirecting a data access request issued by a client (e.g., an application) to the owner node of the storage container (e.g., aggregate) to which the request is directed. To redirect the request, the partner node typically examines the final recipient of the request (e.g., using a global data structure) and proxies the request to the appropriate owner node. The owner node subsequently processes the request and sends a response to the partner (redirector) node, which then relays the response to the requesting application.
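The redirection path described above may be sketched, purely as an example, as follows. The class names, the request and response dictionaries, and the `owner_of` mapping (standing in for the global data structure) are assumptions made for illustration and are not taken from any particular system.

```python
class OwnerNode:
    """Hypothetical owner node that services requests for its own containers."""

    def __init__(self, name):
        self.name = name

    def process(self, request):
        # Stand-in for real request processing on the owner node.
        return {"served_by": self.name, "container": request["container"], "status": "ok"}


class RedirectorNode:
    """Hypothetical partner node acting as a proxy for containers it does not own."""

    def __init__(self, name, owner_of, nodes):
        self.name = name
        self.owner_of = owner_of  # assumed global structure: container -> owner node name
        self.nodes = nodes        # assumed reachable nodes: node name -> OwnerNode

    def handle(self, request):
        # Examine the final recipient of the request, then proxy it to that owner.
        owner_name = self.owner_of[request["container"]]
        response = self.nodes[owner_name].process(request)
        # Relay the owner's response back toward the requesting application.
        return dict(response, relayed_by=self.name)


owner = OwnerNode("node-1")
redirector = RedirectorNode("node-2", {"aggr1": "node-1"}, {"node-1": owner})
reply = redirector.handle({"container": "aggr1", "op": "read"})
```

Here the client contacts `node-2`, but the request is processed by `node-1`, the owner of `aggr1`, and the response returns through the redirector.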
To initiate the TO sequence, the partner node assumes control over the disks of the local node, mounts the storage containers (e.g., volumes) of the local node, and replays the logged operations mirrored from the local node to the NVRAM of the partner node, essentially taking over the storage service provided by the local node. Such replay includes persistent storage of the logged (serviced) operations to the disks. Typically, replay of the logged operations is performed sequentially, i.e., one-by-one, by the partner node without any logical interpretation of the operations logged in its NVRAM. As a result, a substantial amount of the time associated with the TO sequence is consumed by NVRAM replay of the logged operations by the file system.
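The three steps of the TO sequence described above may be illustrated by the following minimal sketch. The `takeover` function, the `(volume, block, data)` layout of a logged operation, and the use of dictionaries for disks and volumes are all assumptions introduced for the example.

```python
def takeover(mirrored_log, local_disks, local_volumes):
    """Sketch of a TO sequence run by the partner node (all names hypothetical)."""
    # Step 1: assume control over the disks of the failed (local) node.
    owned_disks = local_disks

    # Step 2: mount the failed node's storage containers (here, volumes).
    mounted = {volume: owned_disks.setdefault(volume, {}) for volume in local_volumes}

    # Step 3: replay the mirrored operations one by one, in log order,
    # without interpreting them (no coalescing, reordering, or skipping).
    replayed = 0
    for volume, block, data in mirrored_log:
        mounted[volume][block] = data  # persist the logged operation
        replayed += 1

    return mounted, replayed


log = [("vol1", 0, b"a"), ("vol1", 0, b"b"), ("vol2", 7, b"c")]
volumes, count = takeover(log, {}, ["vol1", "vol2"])
```

In this example all three operations are replayed, even though the second write to block 0 of `vol1` overwrites the first. Because nothing is interpreted or skipped, replay work in this model grows with the length of the log, which is consistent with the observation that NVRAM replay consumes a substantial part of the TO sequence.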