1. Field of the Invention
Embodiments of the present invention generally relate to data storage systems and, more particularly, to controlling access to a storage area network in a distributed computing environment.
2. Description of the Related Art
Computer networks often have multiple hosts that share storage on a storage area network (SAN). A SAN provides multiple benefits: it allows remote data backup and disaster recovery over a computer network, centralized administration of the data, and high availability of the data to the computer network. Sharing storage simplifies storage administration and adds flexibility, since cables and storage devices do not have to be physically moved to move storage from one host to another. Adding storage capacity to the SAN benefits each host on the computer network. Controlling access to the SAN is important because a host has the ability to overwrite or corrupt data that has been stored on the SAN by another host.
Dynamic multipathing (DMP) is a method of providing two or more hardware paths to a single storage unit, such as a disk or storage array. For example, the physical hardware can have at least two paths, such as c1t1d0 and c2t1d0, directing input/output (I/O) to the same storage unit. A volume manager such as VERITAS VOLUME MANAGER available from Veritas Corporation of Mountain View, Calif. can be used to select the paths. For example, the volume manager arbitrarily selects one of the two path names, creates a single device entry for the storage unit, and then transfers I/O data across both paths. DMP is enabled by default; the volume manager detects multiple paths using universal world wide device identifiers and manages multipath targets, such as disk arrays, which define policies for using more than one path. DMP provides greater reliability through a path failover mechanism. In the event of a loss of one connection to a storage unit, the system continues to access the critical data over the other connections until the failed path is replaced. DMP also provides greater I/O throughput by balancing the I/O load uniformly across multiple I/O paths to the storage unit.
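The failover and load-balancing behavior described above can be illustrated with a minimal sketch. The class and method names below (MultipathDevice, select_path, mark_failed) are hypothetical and are not taken from any volume manager; round-robin selection is assumed here only as one simple way of balancing I/O uniformly across paths.

```python
class MultipathDevice:
    """Toy model of one storage unit reachable over several hardware paths.

    Hypothetical illustration only; not the API of any real volume manager.
    """

    def __init__(self, paths):
        self.paths = list(paths)   # e.g. ["c1t1d0", "c2t1d0"]
        self.failed = set()        # paths currently marked as failed
        self._next = 0             # round-robin cursor into self.paths

    def mark_failed(self, path):
        """Record that a path has lost its connection."""
        self.failed.add(path)

    def mark_restored(self, path):
        """Record that a failed path has been replaced or repaired."""
        self.failed.discard(path)

    def select_path(self):
        """Return the next healthy path in round-robin order.

        Failed paths are skipped, so I/O continues over the remaining
        connections (failover). Raises IOError if no path is healthy.
        """
        for _ in range(len(self.paths)):
            path = self.paths[self._next]
            self._next = (self._next + 1) % len(self.paths)
            if path not in self.failed:
                return path
        raise IOError("no healthy path to storage unit")
```

With both paths healthy, successive calls to select_path alternate between c1t1d0 and c2t1d0; after mark_failed("c1t1d0"), every call returns c2t1d0 until the path is restored.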
DMP is a layer in the UNIX storage I/O software stack. While different platform implementations differ in detail, UNIX I/O software stacks share a common overall structure, simply because all perform the same basic functions to provide I/O services to a computer. In the conventional UNIX I/O software stack, the DMP management layer resides above the operating system SCSI driver layer.
This conventional stack does not meet all the needs introduced by today's Fibre Channel storage networks. For example, any host that is able to access a storage unit, whether by design or by error, is able to write data to the storage unit using the operating system SCSI layer drivers. To prevent hosts from making I/O requests to storage units and logical unit numbers (LUNs) that do not belong to them, administrators must take some external action unrelated to the volume manager, such as LUN masking or zoning. Moreover, even with a host's own storage units and LUNs, there is the possibility of erroneously overwriting private or public regions because of human error or as a result of sabotage.
With a distributed volume manager, additional security issues may arise. A distributed volume manager provides a data-center-wide shared disk pool, with volumes from the same disk group shared among multiple hosts. A user can implement a SAN-wide disk group and share volumes from the group among multiple hosts. This allows users to provide the right amount of storage to each server without regard for boundaries imposed by physical LUNs. With a SAN volume manager, a number of LUNs could be sliced into multiple volumes to be exported to multiple hosts.
Allocating parts of a LUN to volumes belonging to different hosts compromises security, because every host that has a volume slice on a shared LUN has access to the entire LUN through the operating system SCSI layer. As a result, any host could destroy or impair data on a shared LUN, either by accident or maliciously. An error or intrusion on one host can corrupt the data of every host whose volume shares the LUN affected by the error or intrusion. There is potential for a data-center-wide breakdown in service as well as unrecoverable data corruption.
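The exposure described above can be made concrete with a small model. The following sketch is illustrative only: the names Slice, SharedLun, raw_write, and checked_write are hypothetical, and the ownership check shown in checked_write is an assumed form of slice-level enforcement included for contrast, not a description of any particular product or of a specific solution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    """A contiguous range of blocks on a LUN assigned to one host's volume."""
    host: str
    start: int    # first block of the slice
    length: int   # number of blocks in the slice


class SharedLun:
    """Toy model of a LUN sliced into volumes for several hosts."""

    def __init__(self, size_blocks, slices):
        self.blocks = [None] * size_blocks
        self.slices = list(slices)

    def raw_write(self, host, block, data):
        """Device-level write with no ownership check.

        Models access through the operating system SCSI layer: any host
        that can see the LUN can write any block on it.
        """
        self.blocks[block] = (host, data)

    def checked_write(self, host, block, data):
        """Hypothetical slice-level check: write only inside an owned slice."""
        owns = any(
            s.host == host and s.start <= block < s.start + s.length
            for s in self.slices
        )
        if not owns:
            raise PermissionError(f"{host} does not own block {block}")
        self.blocks[block] = (host, data)
```

For a LUN of 100 blocks split evenly between hostA (blocks 0 to 49) and hostB (blocks 50 to 99), raw_write("hostA", 75, ...) succeeds and overwrites hostB's data, whereas checked_write("hostA", 75, ...) raises PermissionError.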
Multiple connections to a LUN are often implemented as an active/passive connection in high availability configurations of a computer network. In an active/passive connection, there are two connections to a LUN, but only one is active. The passive channel is used only if a failure occurs on the active channel. A problem encountered with this approach is that certain operating system operations on multipath devices can cause failover of active/passive disk array LUNs, resulting in small but noticeable service interruptions on the host issuing the command. This interruption will be noticed on all LUNs that are visible to the host where the command was executed. The interruption generates even bigger problems in environments where access to the same LUNs is shared between multiple hosts. In these situations, all hosts sharing the affected LUNs will notice an interruption in service.
Disk and LUN-level security can be implemented using SCSI-3 persistent group reservations (PGR), but such a solution is necessarily incomplete and, moreover, does not solve the problem of I/O requests made directly to storage units and LUNs by operating system commands and utilities. SCSI-3 reservations apply to entire storage units and LUNs, so all hosts that share volumes on a LUN must register their PGR keys with that LUN. Registration prevents non-registered hosts from writing to a LUN or storage unit, but any registered host has access to the entire device. Moreover, the SCSI-3 standards specify a maximum of 32 keys per LUN. This would limit storage unit or LUN sharing to a maximum of 32 nodes. In environments where the expectation is a common pool of storage for an entire data center, this limit will almost certainly become a severe constraint.
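The two limitations of key registration noted above, whole-device granularity and the per-LUN key limit, can be sketched as follows. This is a simplified model, not an implementation of the SCSI-3 command set; the names ReservedLun, register, and may_write are hypothetical, and MAX_KEYS simply reflects the 32-key figure stated above.

```python
MAX_KEYS = 32  # per-LUN key limit, taken from the figure stated in the text


class ReservedLun:
    """Toy model of a LUN protected only by registration keys."""

    def __init__(self, max_keys=MAX_KEYS):
        self.max_keys = max_keys
        self.keys = {}  # host name -> registered key

    def register(self, host, key):
        """Register a host's key; fails once the key table is full."""
        if host not in self.keys and len(self.keys) >= self.max_keys:
            raise RuntimeError("key table full: cannot register " + host)
        self.keys[host] = key

    def may_write(self, host, block):
        """Return True if the host may write the given block.

        The block argument is deliberately ignored: registration is
        all-or-nothing, so a registered host may write anywhere on the
        device, including regions used by other hosts' volumes.
        """
        return host in self.keys
```

In this model, a thirty-third host cannot register at all, and any of the thirty-two registered hosts passes may_write for every block, which is why registration alone cannot protect one host's volume slice from another registered host.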
Accordingly, a need exists for a method and apparatus for controlling access to a storage area network in such a manner that a host cannot overwrite or corrupt data on a volume or LUN controlled by another host.