1. Field of the Invention
This invention generally relates to the management of resources in a data processing system and more particularly to the management of a disk array storage device.
2. Description of Related Art
Many data processing systems now incorporate disk array storage devices. Each of these devices comprises a plurality of physical disks arranged into logical volumes. Data on these devices is accessible through various control input/output programs in response to commands, particularly reading and writing commands from one or more host processors. A Symmetrix 5500 series integrated cached disk array that is commercially available from the assignee of this invention is one example of such a disk array storage device. This particular array comprises multiple physical disk storage devices or drives with the capability of storing large amounts of data up to several terabytes or more. The management of such resources becomes very important because the ineffective utilization of the capabilities of such an array can affect overall data processing system performance significantly.
Generally a system administrator will, upon initialization of a direct access storage device, determine certain characteristics of the data sets to be stored. These characteristics include the data set size, volume names and, in some systems, the correspondence between a logical volume and a particular host processor in a multiple host processor system. The system administrator then uses this information to configure the disk array storage device by distributing the data sets across different physical devices, with the expectation of avoiding concurrent use of a physical device by multiple applications. Allocations based upon this limited information often are, or later become, inappropriate. When this occurs, the original configuration can degrade overall data processing system performance dramatically.
One approach to overcoming this problem has been to propose an analysis of the operation of the disk array storage device prior to loading a particular data set and then determining an appropriate location for that data set. For example, U.S. Pat. No. 4,633,387 to Hartung et al. discloses load balancing in a multi-unit data processing system in which a host operates with multiple disk storage units through plural storage directors. In accordance with this approach a least busy storage director requests work to be done from a busier storage director. The busier storage director, as a work sending unit, supplies work to the work requesting, or least busy, storage director.
U.S. Letters Pat. No. 5,239,649 to McBride et al. discloses a system for balancing the load on channel paths during long running applications. In accordance with the load balancing scheme, a selection of volumes is first made from those having affinity to the calling host. The load across the respective connected channel paths is also calculated. The calculation is weighted to account for different magnitudes of load resulting from different applications and to prefer the selection of volumes connected to the fewest unused channel paths. An optimal volume is selected as the next volume to be processed. The monitored load on each channel path is then updated to include the load associated with the newly selected volume, assuming that the load associated with processing the volume is distributed evenly across the respective connected channel paths. The selection of the following volume is then based on the updated load information. The method continues quickly during subsequent selection of the remaining volumes for processing.
In another approach, U.S. Letters Pat. No. 3,702,006 to Page discloses load balancing in a data processing system capable of multi-tasking. A count is made of the number of times each I/O device is accessed by each task over a time interval between successive allocation routines. During each allocation, an analysis is made using the count and time interval to estimate the utilization of each device due to the current tasks. An estimate is also made of the anticipated utilization due to the task undergoing allocation. The estimated current and anticipated utilizations are then used as a basis for attempting to allocate the data sets to the least utilized I/O devices so as to achieve balanced I/O activity.
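By way of illustration only, the general idea of count-based allocation described above can be sketched as follows. This is not code from the Page reference. The fixed per-access service time, the greedy placement order, and the names estimate_utilization and allocate are all assumptions introduced here for the example.

```python
def estimate_utilization(access_counts, interval_s, service_time_s):
    """Estimate per-device utilization from access counts over an interval.

    Assumption: every access occupies the device for the same service time.
    """
    return {dev: n * service_time_s / interval_s
            for dev, n in access_counts.items()}


def allocate(current_util, anticipated):
    """Greedily place each data set on the currently least utilized device.

    current_util: device -> estimated utilization due to current tasks.
    anticipated:  data set -> anticipated utilization of the new task.
    Returns (assignment, resulting utilization per device).
    """
    load = dict(current_util)
    assignment = {}
    # Hypothetical ordering: place the heaviest data sets first.
    for name, extra in sorted(anticipated.items(), key=lambda kv: -kv[1]):
        device = min(sorted(load), key=load.get)
        assignment[name] = device
        load[device] += extra
    return assignment, load


current = estimate_utilization({"dev0": 600, "dev1": 200, "dev2": 400},
                               interval_s=1000, service_time_s=0.5)
assignment, load = allocate(current, {"A": 0.25, "B": 0.05})
print(assignment)
```

In this sketch a data set, once placed, stays where it was put, which is the limitation of such schemes that the following paragraph addresses.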
Each of the foregoing references discloses a system in which load balancing is achieved by selecting a specific location for an individual data set based upon express or inferred knowledge about the data set. An individual data set remains on a given physical disk unless manually reconfigured. None of these systems suggests the implementation of load balancing by the dynamic reallocation or configuration of existing data sets within the disk array storage system.
Another load balancing approach involves a division of reading operations among different physical disk drives that are redundant. Redundancy has become a major factor in the implementation of various storage systems and must also be considered in configuring a storage system. U.S. Letters Pat. No. 5,819,310 granted Oct. 6, 1998 discloses such a redundant storage system, in which a disk array storage device includes two device controllers and related disk drives for storing mirrored data. Each of the disk drives is divided into logical volumes. Each device controller can effect different reading processes and includes a correspondence table that establishes the reading process to be used in retrieving data from the corresponding disk drive. Each disk controller responds to a read command that identifies the logical volume by using the correspondence table to select the appropriate reading process and by transferring data from the appropriate physical storage volume containing the designated logical volume.
Consequently, when this mirroring system is implemented, reading operations involving a single logical volume do not necessarily occur from a single physical device. Rather, read commands to different portions of a particular logical volume may be directed to any one of the mirrors for reading from preselected tracks in the logical volume. Allowing such operations can provide limited load balancing and can reduce seek times.
Other redundancy techniques and striping techniques can tend to spread the load over multiple physical drives by dividing a logical volume into sub-volumes that are stored on individual physical drives in blocks of contiguous storage locations. However, if the physical drives have multiple logical volumes, sub-volumes or other forms of blocks of contiguous storage locations, the net effect may not balance the load with respect to the totality of the physical disk drives. Thus, none of the foregoing references discloses or suggests a method for providing a dynamic reallocation of physical address space based upon actual usage.
Therefore it is an object of this invention to enable a dynamic reallocation of data in a plurality of physical disk storage devices to reduce any imbalance of load requirements on each physical disk storage device.
Another object of this invention is to determine the relative utilization of physical disk storage devices to reduce imbalances in the utilization.
In accordance with this invention it is possible to balance loads on physical disk storage devices in a disk array storage device wherein at least two physical disk storage devices store data in a plurality of logical volumes and wherein each physical disk storage device responds to a data transfer request to read or write data. Balancing is achieved by generating operational data including the number of accesses to each logical volume on predetermined ones of the physical disk storage devices in response to data transfer requests. The method converts the operational data into disk utilization values for each predetermined physical disk storage device and for each logical volume in the predetermined physical disk storage devices. Analyzing the disk utilization values leads to the selection of a pair of logical volumes that, if exchanged, would improve load balance for the predetermined physical storage devices. Once the method makes the identification, it exchanges the selected logical volumes.
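The sequence of steps just described (gather access counts, convert them to utilization values, select a pair of logical volumes, exchange them) can be illustrated by the following minimal sketch. It is offered only as an aid to understanding and is not the disclosed implementation. The conversion from access counts to utilization (a fixed service time per access), the balance metric (the spread between the busiest and least busy disk), the exhaustive pair search, and all function names are assumptions made for this example. The sketch also ignores practical constraints such as volume size.

```python
from itertools import combinations


def to_utilization(access_counts, service_time_s, interval_s):
    """Convert per-volume access counts into utilization fractions.

    Assumption: every access costs the same fixed service time.
    """
    return {vol: n * service_time_s / interval_s
            for vol, n in access_counts.items()}


def disk_totals(volume_util, placement):
    """Sum the utilization of each logical volume onto the disk holding it."""
    totals = {disk: 0.0 for disk in placement.values()}
    for vol, disk in placement.items():
        totals[disk] += volume_util[vol]
    return totals


def imbalance(totals):
    """Assumed balance metric: busiest disk minus least busy disk."""
    return max(totals.values()) - min(totals.values())


def swap(placement, vol_a, vol_b):
    """Return a new placement with the disks of the two volumes exchanged."""
    new = dict(placement)
    new[vol_a], new[vol_b] = placement[vol_b], placement[vol_a]
    return new


def select_exchange(volume_util, placement):
    """Return the pair whose exchange most reduces imbalance, or None."""
    best_pair = None
    best_score = imbalance(disk_totals(volume_util, placement))
    for a, b in combinations(sorted(placement), 2):
        if placement[a] == placement[b]:
            continue  # same disk: an exchange would change nothing
        score = imbalance(disk_totals(volume_util, swap(placement, a, b)))
        if score < best_score:
            best_pair, best_score = (a, b), score
    return best_pair


counts = {"LV1": 600, "LV2": 300, "LV3": 100, "LV4": 200}
placement = {"LV1": "D1", "LV2": "D1", "LV3": "D1", "LV4": "D2"}
util = to_utilization(counts, service_time_s=0.01, interval_s=60.0)
pair = select_exchange(util, placement)
if pair is not None:
    placement = swap(placement, *pair)
print(pair, placement)
```

In this example three volumes share disk D1 while D2 holds one lightly used volume, so the sketch selects the busiest volume on D1 for exchange with the volume on D2.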
In accordance with another aspect of this invention, it is possible to balance loads on physical disk storage devices in a disk array storage device wherein at least two physical disk storage devices store data in a plurality of logical volumes and each physical disk storage device responds to a data transfer request to read or write data. The method includes a step defining the length of an analysis interval and included analysis subintervals and generating, for each subinterval, operational data including the number of accesses to each logical volume on predetermined ones of the physical disk storage devices in response to data transfer requests. The method converts the operational data obtained during each subinterval into disk utilization values for each predetermined physical disk storage device and each logical volume in the predetermined physical disk storage devices. An analysis of these disk utilization values enables the selection of a pair of logical volumes that, if exchanged, would improve load balance for the predetermined physical storage devices. The method ends with the exchange of the selected logical volumes.
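The division of an analysis interval into subintervals can be sketched as follows. This is a simplified illustration under stated assumptions, not the disclosed implementation: access events are modeled as (timestamp, volume) pairs, the interval is assumed to be a whole number of equal subintervals, each access is assumed to cost a fixed service time, and the names count_by_subinterval and utilization_by_subinterval are hypothetical.

```python
from collections import Counter


def count_by_subinterval(events, interval_s, subinterval_s):
    """Bucket access events into per-subinterval, per-volume counts.

    events: iterable of (timestamp_s, volume), timestamps measured from
    the start of the analysis interval. Lengths are whole seconds.
    """
    if interval_s % subinterval_s != 0:
        raise ValueError("interval must be a whole number of subintervals")
    buckets = [Counter() for _ in range(interval_s // subinterval_s)]
    for timestamp_s, volume in events:
        if 0 <= timestamp_s < interval_s:
            buckets[int(timestamp_s // subinterval_s)][volume] += 1
    return buckets


def utilization_by_subinterval(buckets, placement, service_time_s,
                               subinterval_s):
    """Convert each subinterval's counts into volume and disk utilization.

    Assumption: every access occupies the disk for service_time_s.
    Returns one (volume_util, disk_util) pair per subinterval.
    """
    rows = []
    for counts in buckets:
        volume_util = {vol: counts[vol] * service_time_s / subinterval_s
                       for vol in placement}
        disk_util = {disk: 0.0 for disk in placement.values()}
        for vol, disk in placement.items():
            disk_util[disk] += volume_util[vol]
        rows.append((volume_util, disk_util))
    return rows


events = [(0, "LV1"), (5, "LV1"), (12, "LV2"), (19, "LV1"), (25, "LV2")]
placement = {"LV1": "D1", "LV2": "D2"}
buckets = count_by_subinterval(events, interval_s=30, subinterval_s=10)
for volume_util, disk_util in utilization_by_subinterval(
        buckets, placement, service_time_s=1.0, subinterval_s=10):
    print(disk_util)
```

Keeping the values per subinterval, rather than one total for the whole interval, preserves information about when each disk was busy. How those per-subinterval values are combined in the selection step is not specified in this passage and is left open in the sketch.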
In accordance with still another aspect of this invention it is possible to balance loads on physical disk storage devices in a disk array storage device wherein at least two physical disk storage devices store data in a plurality of logical volumes and each physical disk storage device responds to a data transfer request to read or write data. The method begins by defining the length of an analysis interval and generating, for each subinterval, operational data including the number of accesses to each logical volume on predetermined ones of the physical disk storage devices in response to data transfer requests. The method converts the operational data obtained during each subinterval into disk utilization values for each predetermined physical disk storage device and each logical volume in the predetermined physical disk storage devices. An analysis of the disk utilization values provides a selection of a pair of logical volumes that, if exchanged, would improve load balance for the predetermined physical storage devices. The method terminates after exchanging the selected logical volumes automatically.
In accordance with yet another aspect of this invention, it is possible to balance loads on physical disk storage devices in a disk array storage device wherein at least two physical disk storage devices store data in a plurality of logical volumes and each physical disk storage device responds to a data transfer request to read or write data. The method begins by defining the length of a first analysis interval and a second analysis interval that includes a plurality of first analysis intervals. Operational data is generated for each of the first and second analysis intervals, and this operational data includes the number of accesses to each logical volume on predetermined ones of the physical disk storage devices in response to data transfer requests. An automatic process runs to exchange the data in a pair of logical volumes upon the completion of each first analysis interval in response to the processing of predetermined operational data during each first analysis interval. An exchange of data in a pair of logical volumes also occurs upon the completion of each second analysis interval.
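The relationship between the first and second analysis intervals can be illustrated by the following sketch, in which access counts are accumulated for both intervals at once and a separate action is triggered as each interval completes. This is an assumption-laden illustration rather than the disclosed implementation: the class name TwoTierCollector, the parameter short_per_long (the number of first intervals in one second interval), and the callbacks standing in for the analysis and exchange steps are all hypothetical.

```python
from collections import Counter


class TwoTierCollector:
    """Accumulate access counts over nested first and second intervals."""

    def __init__(self, short_per_long, on_first_done, on_second_done):
        if short_per_long < 2:
            raise ValueError("a second interval spans several first intervals")
        self.short_per_long = short_per_long
        self.on_first_done = on_first_done    # stand-in for analyze + exchange
        self.on_second_done = on_second_done  # stand-in for analyze + exchange
        self._first = Counter()
        self._second = Counter()
        self._completed = 0

    def record_access(self, volume, count=1):
        """Count an access toward both the first and the second interval."""
        self._first[volume] += count
        self._second[volume] += count

    def end_first_interval(self):
        """Close a first interval; close the second one when it is due."""
        self.on_first_done(dict(self._first))
        self._first.clear()
        self._completed += 1
        if self._completed % self.short_per_long == 0:
            self.on_second_done(dict(self._second))
            self._second.clear()


log = []
collector = TwoTierCollector(
    short_per_long=3,
    on_first_done=lambda counts: log.append(("first", counts)),
    on_second_done=lambda counts: log.append(("second", counts)),
)
for batch in (["LV1", "LV1"], ["LV2"], ["LV1", "LV3"]):
    for volume in batch:
        collector.record_access(volume)
    collector.end_first_interval()
print(log)
```

In this sketch each callback receives only the counts gathered during its own interval, so the action at the end of a first interval sees recent activity while the action at the end of a second interval sees the accumulated activity of all the first intervals it contains.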