1. Field of the Invention
This invention generally relates to the management of resources in a data processing system and more particularly to a tool for use in the management of a disk array storage device.
2. Description of Related Art
Many data processing systems now incorporate disk array storage devices. Each of these devices comprises a plurality of physical disks arranged into logical volumes. Data on these devices is accessible through various control input/output programs in response to commands, particularly reading and writing commands from one or more host processors. A Symmetrix 5500 series integrated cached disk array that is commercially available from the assignee of this invention is one example of such a disk array storage device. This particular array comprises multiple physical disk storage devices or physical disk drives with the capability of storing terabytes of data. The management of such resources becomes very important because the ineffective utilization of the capabilities of such an array can affect overall data processing system performance significantly.
Generally a system administrator will, upon initialization of such a direct access storage device, determine certain characteristics of the data sets to be stored. These characteristics include the data set size and volume names and, in some systems, the correspondence between a logical volume and a particular host processor in a multiple host processor system. The system administrator uses this information to configure the disk array storage device by distributing various data sets across different physical disk devices, with the expectation of avoiding concurrent use of a physical device by multiple applications. Oftentimes, allocations based upon this limited information are or become inappropriate. When this occurs, the original configuration can degrade overall data processing system performance dramatically.
One approach to overcoming this problem involves an analysis of the operation of the disk array storage device prior to loading a particular data set and then determining an appropriate location for that data set. For example, U.S. Pat. No. 4,633,387 to Hartung et al. discloses load balancing in a multi-unit data processing system in which a host operates with multiple disk storage units through plural storage directors. In accordance with this approach a least busy storage director requests work to be done from a busier storage director. The busier storage director, as a work sending unit, supplies work to the work requesting, or least busy, storage director.
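By way of illustration only, the work-requesting scheme summarized above can be sketched as follows. The director names, the queue representation, and the rule that a transfer occurs only when it narrows the gap between the two directors are assumptions made for this example; they are not taken from the Hartung reference.

```python
from collections import deque


def rebalance(queues):
    """Move one pending work item from the busiest director to the least busy.

    `queues` maps a (hypothetical) director name to a deque of pending work.
    Returns (donor, receiver, item), or None when no transfer is worthwhile.
    """
    receiver = min(queues, key=lambda d: len(queues[d]))
    donor = max(queues, key=lambda d: len(queues[d]))
    # Assumed rule: transfer only if it narrows the gap between the two.
    if len(queues[donor]) - len(queues[receiver]) < 2:
        return None
    item = queues[donor].pop()        # donor acts as the work sending unit
    queues[receiver].append(item)     # receiver is the work requesting unit
    return donor, receiver, item


queues = {
    "director_a": deque(["w1", "w2", "w3", "w4"]),
    "director_b": deque(["w5"]),
}
first = rebalance(queues)   # director_b takes one item from director_a
second = rebalance(queues)  # queue lengths now differ by one, so no transfer
```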
U.S. Pat. No. 5,239,649 to McBride et al. discloses a system for balancing the load on channel paths during long running applications. In accordance with the load balancing scheme, a selection of volumes is first made from those having an affinity to the calling host. The load across the respective connected channel paths is also calculated. The calculation is weighted to account for different magnitudes of load resulting from different applications and to prefer the selection of volumes connected to the fewest unused channel paths. An optimal volume is selected as the next volume to be processed. The monitored load on each channel path is then updated to include the load associated with the newly selected volume, assuming that the load associated with processing the volume is distributed evenly across the respective connected channel paths. The selection of the following volume is then based on the updated load information. The method continues quickly during subsequent selection of the remaining volumes for processing.
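The select-then-update cycle described above can be sketched as a simple greedy loop. This is a minimal sketch under stated assumptions: the cost function (average load already on a volume's paths, with ties broken in favor of touching fewer unused paths), the volume and path names, and the load values are all hypothetical, and the actual weighting used in the McBride reference is not reproduced here.

```python
def order_volumes(volumes, path_load):
    """Greedily order volumes for processing.

    volumes:   dict name -> (weighted_load, list of connected channel paths)
    path_load: dict channel path -> monitored load (updated in place)
    """
    remaining = dict(volumes)
    order = []

    def cost(name):
        _, paths = remaining[name]
        # Average load already present on the paths this volume would use.
        current = sum(path_load[p] for p in paths) / len(paths)
        # Assumed tie-breaker: touch as few currently unused paths as possible.
        unused = sum(1 for p in paths if path_load[p] == 0)
        return (current, unused, name)

    while remaining:
        best = min(remaining, key=cost)
        load, paths = remaining.pop(best)
        # Assume the volume's load spreads evenly over its connected paths.
        for p in paths:
            path_load[p] += load / len(paths)
        order.append(best)
    return order


volumes = {
    "vol1": (4.0, ["p1", "p2"]),
    "vol2": (2.0, ["p2"]),
    "vol3": (2.0, ["p3"]),
}
path_load = {"p1": 0.0, "p2": 0.0, "p3": 0.0}
order = order_volumes(volumes, path_load)
```

Because the path loads are updated after every selection, each choice reflects the load contributed by the volumes already selected.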
In still another approach, U.S. Pat. No. 3,702,006 to Page discloses load balancing in a data processing system capable of multi-tasking. A count is made of the number of times each I/O device is accessed by each task over a time interval between successive allocation routines. During each allocation, an analysis uses the count and time interval to estimate the utilization of each device due to the current tasks. An estimate is also made of the anticipated utilization due to the task undergoing allocation. The estimated current and anticipated utilization serve as a basis for the allocation of data sets to the least utilized I/O devices.
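The count-based estimate described above can be illustrated with the following sketch. It assumes that utilization is approximated as the access count multiplied by a fixed per-access service time and divided by the interval length, and that the anticipated access counts for the new task's data sets are known. The service time, the device and task names, and the most-active-first placement order are assumptions of this example rather than details of the Page reference.

```python
def estimate_utilization(access_counts, interval_s, service_time_s):
    """access_counts: dict device -> dict task -> accesses in the interval."""
    return {
        dev: sum(per_task.values()) * service_time_s / interval_s
        for dev, per_task in access_counts.items()
    }


def allocate(data_sets, utilization, interval_s, service_time_s):
    """Place each data set of the new task on the least utilized device.

    data_sets: dict name -> anticipated accesses per interval (assumed input).
    Returns the placement and the utilization including anticipated load.
    """
    util = dict(utilization)
    placement = {}
    # Assumed ordering: place the most active data sets first.
    for name, accesses in sorted(data_sets.items(), key=lambda kv: -kv[1]):
        device = min(util, key=util.get)
        placement[name] = device
        util[device] += accesses * service_time_s / interval_s
    return placement, util


counts = {
    "dev0": {"taskA": 300, "taskB": 100},
    "dev1": {"taskA": 50},
    "dev2": {"taskB": 200},
}
current = estimate_utilization(counts, interval_s=10.0, service_time_s=0.01)
placement, projected = allocate({"ds1": 200, "ds2": 100}, current, 10.0, 0.01)
```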
Yet another load balancing approach involves a division of reading operations among different physical disk drives that are redundant. Redundancy has become a major factor in the implementation of various storage systems and must also be considered in configuring a storage system. U.S. Pat. No. 5,819,310 to Vishlitzky et al. discloses such a redundant storage system with a disclosed disk array storage device that includes two device controllers and related physical disk drives for storing mirrored data. Each of the physical disk drives is divided into logical volumes. Each device controller can effect different reading processes and includes a correspondence table that establishes the reading process to be used in retrieving data from the corresponding physical disk drive. Each device controller responds to a read command that identifies a logical volume by using the correspondence table to select the appropriate reading process and by transferring data from the appropriate physical disk drive containing the designated logical volume.
Consequently, when this mirroring system is implemented, reading operations involving a single logical volume do not necessarily occur from a single physical disk drive. Rather, read commands to different portions of a particular logical volume may be directed to any one of the mirrors for reading from preselected tracks in the logical volume. Allowing such operations can provide limited load balancing and can reduce seek times.
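A table-driven selection of the mirror that services a read can be sketched as follows. The policy names ("mirror0", "mirror1", "interleave"), the band size, and the single shared table are hypothetical simplifications introduced for this example; the Vishlitzky reference describes a correspondence table in each device controller, and its actual reading processes are not reproduced here.

```python
BAND = 8  # assumed number of consecutive tracks served by one mirror


def select_mirror(table, volume, track):
    """Return the index (0 or 1) of the mirror that should serve a read.

    table: dict logical volume -> reading policy for that volume.
    """
    policy = table[volume]
    if policy == "mirror0":
        return 0
    if policy == "mirror1":
        return 1
    if policy == "interleave":
        # Alternate between the two mirrors in bands of BAND tracks, so that
        # reads to different portions of one volume go to different drives.
        return (track // BAND) % 2
    raise ValueError(f"unknown reading policy: {policy!r}")


table = {"vol_a": "mirror0", "vol_b": "interleave"}
picks = [select_mirror(table, "vol_b", t) for t in (3, 10, 16)]
```

With the interleaved policy, each mirror handles only its own bands of tracks, which is one way such a scheme can shorten seeks on each drive.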
Other redundancy and striping techniques may spread the load over multiple physical drives by dividing a logical volume into sub-volumes that are stored on individual physical disk drives in blocks of contiguous storage locations. However, if the physical disk drives have multiple logical volumes, sub-volumes or other forms of blocks of contiguous storage locations, the net effect may not balance the load with respect to the totality of the physical disk drives. Thus, none of the foregoing references discloses or suggests a method for providing a dynamic reallocation of physical address space based upon actual usage.
Recently more rigorous analyses have been implemented to provide dynamic reallocation based upon actual usage. U.S. Pat. No. 6,189,071 granted Feb. 13, 2001 (application Ser. No. 09/143,683 filed Aug. 28, 1998) discloses one such analysis that includes the step of providing an approximation of disk seek times. Generally these approaches determine seek distances and convert the seek distances into time. In more specific terms, this approach uses a statistical analysis by which actual disk accesses are weighted and combined to produce an estimated seek activity. Then this estimate is converted to a seek time by combination with a value, t_{i,j}, that is an approximation of the seek time between two logical volumes i and j. However, in some applications it may be desirable to obtain more accurate seek times to use in selecting exchangeable logical volumes that, in turn, can optimize the performance of a disk array storage device.
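The general shape of such a statistical estimate can be sketched as follows. This is an illustration under stated assumptions and not the formula of the referenced patent: accesses of each type are combined with hypothetical weights into a per-volume activity, the seek activity between two logical volumes is assumed to be proportional to the product of their activities, and each pair is multiplied by the approximate seek time between those two volumes.

```python
from itertools import combinations


def estimated_seek_time(accesses, weights, t):
    """Estimate total seek time on one physical drive over a sample interval.

    accesses: dict volume -> dict access type -> count in the interval
    weights:  dict access type -> weight (illustrative values, an assumption)
    t:        dict (i, j) -> approximate seek time between volumes i and j,
              keyed with the volume names in sorted order
    """
    # Weighted activity per logical volume.
    activity = {
        vol: sum(weights[kind] * n for kind, n in counts.items())
        for vol, counts in accesses.items()
    }
    total = sum(activity.values())
    if total == 0:
        return 0.0
    # Assumed model: seeks between i and j scale with activity_i * activity_j.
    pairs = combinations(sorted(activity), 2)
    return sum(activity[i] * activity[j] * t[(i, j)] for i, j in pairs) / total


accesses = {
    "lv1": {"read": 60, "write": 40},
    "lv2": {"read": 20, "write": 0},
}
weights = {"read": 1.0, "write": 0.5}   # hypothetical weights
t = {("lv1", "lv2"): 0.004}             # hypothetical seek time in seconds
seek_time = estimated_seek_time(accesses, weights, t)
```

Because a single approximate value stands in for every seek between a given pair of volumes, the result is only as accurate as that approximation.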