1. Field of the Invention
The present invention relates generally to the field of electronic storage media such as magnetic disk drives and more particularly to a method and apparatus for distributing volumes on multi-disk systems such as Hierarchical Storage Management (HSM) systems.
2. Background
In present-day HSM systems having multiple magnetic disk drives, the disks can be configured in a number of ways on the same system. In EMC Corporation's Symmetrix™ systems, for example, a user can request that logical volumes be configured as Redundant Arrays of Independent Disks (RAID) volumes of several different types, typically RAID level 1, which is usually understood to mean 100% mirroring or duplication of data, or RAID level 4, which uses parity. That is, some of the disks in an HSM system may be configured as RAID level 1 disks to contain mirror images, while others may be configured as RAID level 4, and other possibilities also exist. Such a system, as shown in FIG. 3, is usually described as having logic elements and channels (system adapters SA) that communicate with the host computers (ATT-1, HP-HO, etc.) connected to the HSM system, together with a cache memory and disk adapters (DA1-DA6), each of which is connected over its SCSI channels C, D, E, F to physical disks 00M1, 10M1 and so on. In laying out the disks on the back-end, a software program such as that used by EMC Corporation for its Symmetrix™ systems is used to allocate logical volumes of various types to the actual physical disks. (Physical disks, such as physical disk 00M1, are usually larger than logical volumes. Two or more logical volumes, also known as hyper volumes, can typically fit on one physical disk.)
When such an HSM system is used by one or more open system host computers A, B, C, or D, such as systems using the Unix operating system or its derivatives (ATT Unix, Sequent, HP, Silicon Graphics, etc.), however, it is also important to distribute the logical volumes to the host computers on the front-end in such a way that each host computer not only gets the type of volumes (RAID level 1, or 4, etc.) it desires, but also, if possible, gets them in a way that is likely to produce reasonably good performance. Performance bottlenecks will occur if, for example, all the logical volumes distributed to a host physically reside on the same disk or SCSI channel. To prevent such problems, the logical volumes for each host in an open systems environment need to be allocated to adapters, channels and disks in a way that is likely to reduce this occurrence. Presently, this is done in a very time- and labor-intensive manual process for each HSM system.
The problem of distributing volumes to an HSM system's front-end (the host computers) in a timely manner is unique to open systems or those that follow file conventions similar to those of the principal open system, the Unix operating system and its derivatives and variations. IBM Corporation mainframe host computer models of the 1990s, using the ESCON architecture and running IBM's MVS operating system, do not have the same problem, since both the host computer architecture and the operating system allow IBM mainframe host computers to share disks and manage channels in a way that open system operating systems do not. In IBM's non-open system, the architecture of the system chooses the available channels to provide hosts with balanced access when I/O requests are made. Hence, for those systems, it is not a configuration problem.
Most open system operating systems, such as Unix and its derivatives and variations, do not allow multiple hosts to share access to all disks and all channels in this way. Instead, logical access "routes," as it were, must be configured for each host computer before I/O requests are made. For an HSM system such as that shown in FIG. 3, if several open system host computers ATT-1, SEQU-8, Sequent-6, SequentA, HP-HO, ATT-E, and SGI are sharing the system, then a human operator would manually analyze the logical volumes requested by each and attempt to assign them to the ports (A, B, C, D) on each system adapter SA for each host in such a way as to provide an allocation that is balanced for performance purposes.
For example, if host ATT-1 has SCSI channels C and D of disk adapter 1 (DA-1) assigned to it, along with physical disks 00, 10, 20, 30, each of which is assigned to SCSI channel C, this is probably the worst distribution for performance. If I/O requests come in for all 4 volumes at the same time, the requests will have to be queued one behind the other, since SCSI channel C can only process one at a time. A better distribution for performance purposes would be to spread the physical disks across both SCSI channels. In that case, if I/O requests came in at the same time for all 4 volumes, two would start processing immediately (one per SCSI channel), while two would probably be queued. If a human operator only has to solve a distribution problem like this, it may only take him or her a few minutes. However, most users of HSM systems in an open systems environment have far more complex situations.
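The queuing arithmetic in this example can be checked with a short sketch. It is only an illustration under the stated assumption that each SCSI channel services one request at a time; the function name and the particular split of disks between channels C and D are hypothetical and are not taken from FIG. 3.

```python
from collections import Counter

def start_and_queue(disk_to_channel):
    """Return (started, queued) when one I/O request arrives for every
    disk at the same instant.

    Assumption: each SCSI channel can service only one request at a time,
    so one request per busy channel starts and the rest wait.
    """
    per_channel = Counter(disk_to_channel.values())
    started = len(per_channel)                       # one per channel in use
    queued = sum(n - 1 for n in per_channel.values())
    return started, queued

# All four physical disks on SCSI channel C (the poor case in the text).
all_on_c = {"00": "C", "10": "C", "20": "C", "30": "C"}

# Hypothetical split of the same disks across channels C and D.
split_c_d = {"00": "C", "10": "D", "20": "C", "30": "D"}

print(start_and_queue(all_on_c))   # (1, 3)
print(start_and_queue(split_c_d))  # (2, 2)
```

With everything on channel C, only one request starts and three wait; with the disks split over two channels, two start and two wait, which matches the figures given above.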
Typically, a number of host computers share the HSM system. Each of these, such as host computer ATT-1 on system adapter SA-3, might have 128 logical volumes (or more) allocated to it on the HSM system of FIG. 3. Of these, 16 logical volumes might be configured as RAID level 1 or mirrored volumes (in which case they will at least already have been allocated to different physical drives by the program that configures the back-end), 64 logical volumes might be configured as RAID level 5, 32 logical volumes might be configured as RAID level 4, and another 16 logical volumes might simply be reserved as spares for later configuration. Host B might have a similarly complex set of requirements.
The different RAID types required by a user impose another level of complexity on the problem, namely that, ideally, RAID groups should not be allocated to the same host.
Thus, in distributing logical volumes to a front-end, the constraints imposed by disk format type, channels and host configurations must all be considered as the distribution is done. In distributing volumes to the front-end, the back-end configuration should not be changed. The way in which host systems are attached is also a given, as is the number of volumes of each type requested by each host. Any distribution of logical volumes has to be done with these and similar constraints in view. In addition, attempts to optimize for one host at a time often lead to poor overall results. A distribution that would provide good performance for a first host may not leave as many desirable options for the next host. As the number of hosts sharing the HSM system increases, the last host may be left with a configuration that is worse than what random chance would yield from the original back-end distribution.
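The effect of optimizing for one host at a time can be seen in a toy sketch. This is not the method of the invention; it is a hypothetical illustration in which two SCSI channels hold 2 and 6 free volumes, two hosts each request 4 volumes, and the quality of a host's allocation is measured by the largest number of its volumes that share one channel (lower is better). All names and numbers below are assumptions made for the example.

```python
from collections import Counter

def worst_channel_load(channels):
    """Largest number of one host's volumes that sit on a single channel."""
    return max(Counter(channels).values())

def greedy_pick(free, count):
    """Choose `count` volumes for ONE host, always taking the next volume
    from the channel on which this host has the fewest volumes so far.

    `free` maps channel name -> number of free volumes and is updated in
    place. Assumes at least `count` free volumes remain in total.
    """
    taken = Counter()
    picked = []
    for _ in range(count):
        available = [ch for ch in sorted(free) if free[ch] > 0]
        ch = min(available, key=lambda c: taken[c])
        free[ch] -= 1
        taken[ch] += 1
        picked.append(ch)
    return picked

# Hypothetical back-end: channel C has 2 free volumes, channel D has 6.
free = {"C": 2, "D": 6}

# One host at a time: the first host takes the best spread it can get.
host1 = greedy_pick(free, 4)
host2 = greedy_pick(free, 4)
greedy = [worst_channel_load(host1), worst_channel_load(host2)]

# Hand-built alternative that considers both hosts together:
# each host gets 1 volume on C and 3 on D.
shared = [worst_channel_load(["C", "D", "D", "D"]),
          worst_channel_load(["C", "D", "D", "D"])]

print(greedy)  # [2, 4]
print(shared)  # [3, 3]
```

Under these assumptions the first host ends up well balanced, but the second host is left with all four of its volumes on one channel, whereas an allocation chosen with both hosts in view leaves neither host worse than three volumes on a channel.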
Consequently, a human operator attempting to distribute a complex set of logical volumes for all the hosts shown in FIG. 3 might spend five or more hours, and frequently even days, attempting to create a distribution of the logical volumes across the host computers' SCSI channels that is reasonably performance-balanced for the various open system hosts.
It is an object of this invention to automate the distribution of logical volumes to an HSM system front-end.
It is another object of the present invention to distribute logical volumes to an HSM system front-end in a way that is likely to provide reasonable performance for each of the computers sharing an HSM system.