This invention pertains generally to computer systems and networks having two or more host computers and at least one shared data storage device. More particularly, the invention pertains to structure and method for controlling access to shared storage in computer systems and networks having a plurality of host computers that may create data integrity issues for the shared data storage, particularly in a Storage Area Network (SAN).
Conventional operating systems typically assume that any storage volume or device is “private” and not shared among different host computers. In a distributed computing system, such as a network server system, a disk drive, a storage volume, a logical volume, or other storage device may be shared and represent common storage. When a controller responsible for controlling read, write, or other access to the storage device, such as a hard disk array controller (for example a RAID controller), is attached to the plurality of host computers, such as through a SCSI Bus, Fibre Channel Loop, or other storage device interface, problems may arise because one or more of the plurality of host computers may overwrite or otherwise corrupt information needed for the correct operation of another, different host computer system.
This problem is particularly prevalent when the plurality of host computers is formed from a heterogeneous mixture or collection of different host computers having different operating systems, but this problem also exists for homogeneous mixtures or collections of host computer systems.
In one exemplary situation, one type of operating system (such as for example, the Unix operating system of a computer made by Sun Microsystems) requires special information at specific addresses on the storage device, while a different type of operating system (such as for example, a computer utilizing the Windows NT operating system made by Microsoft) may require that any attached storage have special identifying information written to the same or an overlapping address on the same storage device. The second type (Windows NT) will overwrite the information needed by the first type (Unix) of computer, and from the perspective of the Unix computer, the storage will be corrupt and unusable.
A problem situation can also frequently arise when the host systems 101 have similar or the same hardware and the same operating system, that is, for homogeneous combinations of host systems. For example, the Microsoft Windows NT 4.0 operating system could present such a problem on either homogeneous or heterogeneous hardware. Each host computer (computer A and computer B) will write a special identifying “tag” to each disk of the shared storage array. Whichever computer is the last to write to the disk or shared storage array will be the “winner,” as its “tag” or “signature” will remain intact after the last write operation, and the other hosts will act, and be treated, as if they have never seen the storage array or members of the array before. Also, if computer A formats a disk and another computer B subsequently formats the same disk, computer A's format and data are now corrupt. This latter scenario is independent of similarities or disparities in the hosts' operating systems.
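The last-writer-wins behavior described above can be illustrated with a minimal sketch. The `SharedDisk` class, its single `signature` field, and the tag strings "A" and "B" are hypothetical stand-ins chosen for illustration; they do not represent the actual signature format or location used by any operating system.

```python
class SharedDisk:
    """Toy model of a shared disk with one reserved signature area (hypothetical)."""

    def __init__(self):
        self.signature = None  # no host has tagged the disk yet

    def write_signature(self, host_tag):
        # No ownership check: any attached host may overwrite the existing tag.
        self.signature = host_tag

    def recognized_by(self, host_tag):
        # A host treats the disk as its own only if its tag is still intact.
        return self.signature == host_tag


disk = SharedDisk()
disk.write_signature("A")  # computer A tags the disk first
disk.write_signature("B")  # computer B tags the same disk afterwards

print(disk.recognized_by("A"))  # False: A's tag was overwritten
print(disk.recognized_by("B"))  # True: B wrote last and is the "winner"
```

In this sketch nothing prevents the second write, which is the condition under which computer A subsequently fails to recognize the disk.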
Having mentioned the Microsoft Windows NT operating system, we note that while exemplary embodiments of the invention make reference to Windows NT, Unix, and Novell by way of example, the invention is not limited to Windows NT, Unix, Novell, or any other particular operating system environment, but rather is applicable to a broad range of computer systems, server systems, information storage and retrieval systems, and the like, and to various operating systems, and more generally is applicable to any computer and/or information storage/retrieval system.
With respect to FIG. 1, we now describe an exemplary distributed computing system 100 having first, second, and third host computers 101 (101-1, 101-2, 101-3) coupled to an array controller 104, which in turn is coupled to a storage subsystem 108 formed from one or more logical volumes, here shown as an array of logical disk drive storage volumes (108-1, 108-2, 108-3, . . . , 108-N). In general, these logical volumes 108 may correspond to physical hard disk drive devices, or to groups of such physical hard disk drive devices. In this embodiment, the three host computers 101-1, 101-2, and 101-3 are coupled to array controller 104 via a Fibre Channel Loop 120 communications channel, and the logical volumes 108 of the storage subsystem are coupled to the array controller 104 via an appropriate channel 122, such as for example either a Fibre Channel Loop communications channel or a parallel SCSI communications channel. For the Fibre Channel Loop, SCSI protocols are frequently used in addition to the Fibre Channel physical layer and related protocols and standards. Fibre Channel Loop 120 is advantageous for interconnections of the host computers because of the flexibility and extensibility of this type of interface to a large number of host computers and also, with respect to the inventive structure and method, for the existing support of World Wide Name (WWN) identification.
In computing system 100, array controller 104 divides the storage into a number of logical volumes 108. These volumes are accessed through a Logical Unit Number (LUN) addressing scheme as is common in SCSI protocol based storage systems, including SCSI protocol based Fibre Channel Loop physical layer configurations. The term LUN refers to a logical unit or logical volume, or in the context of a SCSI protocol based device or system, to a SCSI logical unit or SCSI logical volume. Those workers having ordinary skill in the art will appreciate that the number of physical disk drives may be the same as, or different from, the number of logical drives or logical volumes; however, for the sake of simplicity and clarity of description here we use these terms interchangeably, focusing primarily on logical volumes as compared to physical disk drives. The manner in which physical devices are generically assigned, grouped, or mapped to logical volumes is known in the art and not described further here.
Each of the host computers 101 of the system 100 has an operating system as is known in the art. The operating system, such as Windows NT, on any single host will attempt to mount all of the logical storage volumes 108 that it detects are physically connected when host 101 boots, such as during host system power-up or reset. As a result, any data on any one of the logical volumes 108 can be accessed by the operating system. In situations where new disk storage (additional logical volume) is added to system 100 so that it is available to a host or when a user attempts to configure the storage already available to the host, unless constrained, the operating system (including the Windows NT 4.0 operating system) will automatically write an identifying signature to these new storage device(s).
This identifying signature typically includes information that allows the particular operating system (such as Windows NT) to uniquely identify the storage device(s). The format and content of such signatures are not important to the invention except that they exist, are usually established by the vendor of the particular operating system (e.g. Microsoft Corporation for Windows NT), and are known in the art. Hence the specific content and location of these signatures are not described further here.
Usually, a particular area on a storage device is reserved for the signature, but the implementation is specific to particular operating systems and installations. Hence, even for hosts having common operating systems, different host installations may cause problems. For example, the size of the storage, the operating system, the version and/or revision of the operating system, and the like, may differ from host to host. Significantly, one operating system may place important data in an area normally reserved for other reasons in a different operating system or in a different installation of that same operating system. Therefore, although an area may be reserved for Windows NT, it is unfortunately problematic that another Windows NT system will write a separate signature in the same area. For a given host hardware and operating system installation, the location of the signature is usually fixed. In general, the operating system vendor exercises considerable control as to the location at which the signature is written, but since all of these operating systems assume they solely own the storage, there's no general way to assure that data won't get overwritten. This compounds the problem with traditional approaches and suggests that a more general solution that does not rely on luck to preserve data integrity is called for.
As another host system, such as a system incorporating a Unix operating system, may store data in the storage location to which the Windows NT “signature” was written earlier, the subsequently written Unix signature will corrupt the earlier signature and other data. Furthermore, the signature itself may subsequently be overwritten by data from another host computer during a normal write operation. Overwriting can happen at any time, but is most likely during a format or initialization process. In either event, it is clear that the information stored on the physical device and/or logical volume will be corrupted.
It is therefore problematic that in traditional systems 100 each host computer 101 has complete access to all of the Logical Volumes 108, and no structure or procedure is available for restricting access to a particular logical volume by a particular host or group of hosts.
Therefore there exists a need for structure and method that resolves this shared access problem by efficiently testing and validating authorization to access a storage volume, logical volume, or storage device on the array controller to a specific set of host computers and limiting access only to authorized hosts, so that neither critical information nor data generally will be overwritten or otherwise corrupted.
The invention provides structure and method for controlling access to a shared storage device, such as a disk drive storage array, in computer systems and networks having a plurality of host computers. In one aspect, the invention provides such a method for controlling access to a hardware device in a computer system having a plurality of computers and at least one hardware device connected to the plurality of computers. The method comprises associating a locally unique identifier with each of the plurality of computers; defining a data structure in a memory identifying which particular ones of the computers, based on the locally unique identifier, may be granted access to the device; and querying the data structure to determine if a requesting one of the computers should be granted access to the hardware device.
In one embodiment, the procedure for defining the data structure in memory includes defining a host computer ID map data structure in the memory; defining a port mapping table data structure comprising a plurality of port mapping table entries in the memory; defining a host identifier list data structure in the memory; defining a volume permission table data structure in the memory; and defining a volume number table data structure in the memory. In one particular embodiment, the memory is a memory of a memory controller controlling the hardware device, and the hardware device is a logical volume of a storage subsystem.
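The three steps of the method (associate an identifier with each computer, record which identifiers may access a device, and query that record on each request) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: it collapses the several tables named above into a single hypothetical `VolumeAccessTable`, and the identifier strings such as "wwn-host-1" are placeholders standing in for a locally unique identifier such as a WWN.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class VolumeAccessTable:
    """Hypothetical permission record: volume number -> allowed host identifiers."""

    permissions: Dict[int, Set[str]] = field(default_factory=dict)

    def grant(self, volume: int, host_id: str) -> None:
        # Record that the host with this identifier may access the volume.
        self.permissions.setdefault(volume, set()).add(host_id)

    def is_allowed(self, volume: int, host_id: str) -> bool:
        # Query step: access is denied unless the identifier was explicitly granted.
        return host_id in self.permissions.get(volume, set())


table = VolumeAccessTable()
table.grant(0, "wwn-host-1")  # placeholder identifier for host 101-1
table.grant(1, "wwn-host-2")  # placeholder identifier for host 101-2

print(table.is_allowed(0, "wwn-host-1"))  # True: explicitly granted
print(table.is_allowed(0, "wwn-host-2"))  # False: not granted for volume 0
print(table.is_allowed(2, "wwn-host-1"))  # False: volume 2 has no grants
```

The sketch denies access by default, so a host that has not been explicitly granted a volume cannot reach it; whether the described embodiments use a default-deny policy is an assumption made here for illustration.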
The invention also provides an inventive controller structure, and a computer program product implementing the inventive method.