A server cluster is generally a group of servers (nodes) arranged such that if any server fails, the other servers of the cluster can transparently take over the work of the failed server, that is, restart its applications and thereby continue serving clients without significant interruption. This operation is generally referred to as failover, or failover clustering.
At present, failover clustering uses a “shared-nothing” storage model, in which each storage unit (e.g., a disk or part thereof) is owned by a single node. Only that node can perform input/output (I/O) to that specific storage unit, referred to by a LUN (Logical Unit Number). A LUN exposes one or more volumes.
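By way of illustration only, the single-owner rule of the shared-nothing model may be sketched as follows. The class name, method names, and node names are hypothetical and do not correspond to any actual clustering API; the sketch assumes ownership is tracked in a simple in-memory table.

```python
class SharedNothingStorage:
    """Hypothetical model: each LUN is owned by exactly one node at a time."""

    def __init__(self):
        self._owner = {}  # LUN number -> name of the owning node

    def assign(self, lun, node):
        """Make `node` the sole owner of `lun`, replacing any previous owner."""
        self._owner[lun] = node

    def owner_of(self, lun):
        """Return the owning node, or None if the LUN is unassigned."""
        return self._owner.get(lun)

    def io(self, lun, node):
        """Permit I/O only when `node` is the current owner of `lun`."""
        if self._owner.get(lun) != node:
            raise PermissionError(f"{node} does not own LUN {lun}")
        return f"{node} performed I/O on LUN {lun}"
```

In this sketch, after `assign(0, "node1")`, a call to `io(0, "node2")` raises `PermissionError`, which mirrors the restriction that only the owning node can perform I/O to the storage unit.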
In this model, the application and disk resources are part of a common group (a unit of failover corresponding to a LUN) with an explicit dependency between the application resource and the disk resource, in order to guarantee that disks are brought online before the applications start, and are taken offline after the applications exit. As a result, clustered applications such as Microsoft® SQL Server, Microsoft® Exchange Server, and Microsoft® File Services are constrained to this I/O model, whereby all applications that need access to the same disk must run on the same cluster node.
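The ordering guarantee implied by this dependency may be illustrated with a minimal sketch. The `ResourceGroup` class and its methods are hypothetical names introduced for illustration; the sketch merely records the order of operations and assumes one disk resource per group.

```python
class ResourceGroup:
    """Hypothetical failover group: one disk resource plus dependent applications."""

    def __init__(self, disk, applications):
        self.disk = disk
        self.applications = list(applications)
        self.log = []  # ordered record of the operations performed

    def bring_online(self):
        # The disk comes online first, because the applications depend on it.
        self.log.append(f"online {self.disk}")
        for app in self.applications:
            self.log.append(f"start {app}")

    def take_offline(self):
        # Applications exit first (in reverse start order); the disk goes offline last.
        for app in reversed(self.applications):
            self.log.append(f"stop {app}")
        self.log.append(f"offline {self.disk}")
```

Calling `bring_online()` followed by `take_offline()` on a group leaves a log in which the disk entry precedes every application start and follows every application stop.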
However, an application's failover and restart operation is limited by the time taken for volume dismount and remount. Moreover, the shared-nothing model may lead to a high management cost of the attached storage, because of the relatively large number of LUNs that are required in practical usage scenarios. For example, to have somewhat more granular failover when files are stored on a SAN (storage area network), numerous LUNs need to be carved out on the SAN, because all of the applications that depend on the same LUN need to fail over at the same time; applications that reside on the same LUN cannot fail over to different nodes, because only one node has access to the LUN at a given time.
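The granularity constraint may be illustrated with a short sketch, in which moving one application necessarily moves every application on the same LUN. The function name, the dictionary layout, and the application and node names are assumptions made for illustration only.

```python
def fail_over(app_to_lun, lun_owner, app, target_node):
    """Move the LUN backing `app` to `target_node`.

    Returns the sorted names of every application that moves as a result,
    since all applications on the same LUN must follow its single owner.
    """
    lun = app_to_lun[app]
    lun_owner[lun] = target_node  # ownership transfers for the whole LUN
    return sorted(a for a, l in app_to_lun.items() if l == lun)


def luns_needed_for_independent_failover(app_to_lun):
    """Independent failover of every application requires one LUN per application."""
    return len(app_to_lun)
```

For example, with two applications placed on LUN 0 and a third on LUN 1, failing over either of the first two applications moves both of them, while the third stays where it is; allowing all three to fail over independently would require three LUNs.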