Individual storage devices are used to store data and typically include hard drives, compact disc (CD) drives, tape drives and others. Some of these types of storage devices, particularly hard drives, are commonly grouped together in a storage array. A storage array is a group of storage devices, typically two to eight, that function cooperatively, such as in a RAID (Redundant Array of Independent Disks) configuration. Typically, the storage devices in the storage array are installed together in a single unit, such as a storage server or “storage box.” The storage array has greater storage capacity and data transfer speed than an individual storage device, so the storage array can service software applications whose storage requirements exceed what an individual storage device can provide.
Individual storage arrays, however, do not have the high bandwidth and transaction rates required by some current high-capacity software applications. “Bandwidth” typically refers to the total amount of data that can be transferred into or out of the storage array per unit of time. “Transaction rate,” by contrast, typically refers to the total number of separate data accesses or I/O (Input/Output) requests that can be serviced by the storage array per unit of time. A single storage device has a bandwidth and transaction rate capacity that is insufficient for many modern software applications. By combining more than one storage device into the storage array, the storage devices can be accessed in parallel for a much greater overall bandwidth and transaction rate. Thus, a volume of data (e.g., a file or database) within the storage array is divided into multiple sections, each stored on a different storage device within the storage array, so that the sections can be accessed simultaneously (i.e., in parallel). However, some current high-capacity software applications have such high bandwidth and/or transaction rate requirements that even the storage array cannot satisfy them.
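By way of illustration only, the division of a data volume into sections stored on different storage devices can be sketched as a simple round-robin striping map. The function names, the fixed stripe size and the assumption that aggregate rates scale linearly with the number of devices are hypothetical simplifications introduced for this sketch; they are not taken from any particular RAID implementation.

```python
def locate_block(logical_block: int, stripe_blocks: int, num_devices: int) -> tuple[int, int]:
    """Map a logical block of the data volume to (device index, block offset on that device).

    Assumes simple round-robin striping with a fixed stripe size and no parity.
    """
    stripe_index = logical_block // stripe_blocks      # which stripe holds the block
    device = stripe_index % num_devices                # stripes rotate across the devices
    stripe_on_device = stripe_index // num_devices     # earlier stripes already on that device
    offset = stripe_on_device * stripe_blocks + logical_block % stripe_blocks
    return device, offset


def ideal_aggregate(per_device_mb_s: float, per_device_iops: float, num_devices: int) -> tuple[float, float]:
    """Upper bound on bandwidth and transaction rate if every device is kept equally busy.

    Real arrays fall short of this bound because of controller and interconnect limits.
    """
    return per_device_mb_s * num_devices, per_device_iops * num_devices


if __name__ == "__main__":
    # With 4-block stripes over 3 devices, consecutive stripes land on devices 0, 1, 2, 0, ...
    for block in (0, 4, 8, 13):
        print(block, locate_block(block, stripe_blocks=4, num_devices=3))
    print(ideal_aggregate(50.0, 200.0, 8))
```

Because consecutive stripes fall on different devices, a large sequential transfer touches every device (raising bandwidth), while independent small requests tend to land on different devices (raising the transaction rate).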
For the current high-capacity software applications, multiple storage arrays are combined into a consolidated storage array (CSA), so that the storage arrays within the CSA can be accessed in parallel with each other for a much greater overall bandwidth and transaction rate than is possible with a single storage array. Thus, the data volume is divided up and allocated to more than one of the storage arrays of the CSA to achieve the desired bandwidth and transaction rates for access to the data volume. Typically, the data volume is established with data striping and redundancy techniques to ensure against loss of the data. Additionally, the CSA is connected through a communication network, such as a switched fabric, to one or more host devices that execute the high-capacity software applications. The communication network has a communication rate that is high enough to satisfy multiple applications executing on the host devices by accessing multiple data volumes on the CSA simultaneously without loss of performance.
When using such high-capacity software applications, the user (e.g. the person using the high-capacity software application) must create the data volume within the CSA and supply a definition of the data volume to the host device. Striping software executing on the host device must be configured with the identification of the storage arrays across which the data volume is striped and the definition of the data volume. The procedure for creating such a data volume striped across multiple devices is very time-consuming and prone to human error, due to the amount of human interaction required.
To create the data volume within the CSA, the user must determine the parameters necessary for the desired data volume according to the needs of the high-capacity software application. Such parameters typically include size, bandwidth, transaction rate, redundancy and other attributes. The user must then analyze the storage arrays of the CSA to determine which storage arrays are available, the amount of unused storage space on the available storage arrays, the current bandwidth usage of other applications that access existing data volumes on the storage arrays, and the remaining bandwidth capacity. Since the storage arrays of the CSA are usually not all utilized in exactly the same manner, the user must manually add up the storage space, bandwidth and transaction rate capacities of the available storage arrays to determine which of the storage arrays can be grouped together to form the data volume with the required parameters. The user must also typically take into consideration a balancing of the data access loads on each of the storage arrays. Once the host device and each of the storage arrays have been properly configured with the definition of the data volume, the host device may begin accessing the data volume. Additionally, if more than one host device will be executing an application that requires access to the same data volume, then the user must configure the striping software of each host device with the definition of the data volume.
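The bookkeeping that the user performs by hand in this analysis can be illustrated with the following sketch. It assumes, purely for illustration, that the data volume is striped evenly so that each of k chosen storage arrays must supply 1/k of the required size, bandwidth and transaction rate; the `ArrayState` record, its field names and the `choose_arrays` function are hypothetical and do not describe any existing tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArrayState:
    """Hypothetical snapshot of the unused capacity of one storage array."""
    name: str
    free_gb: float      # unused storage space
    spare_mb_s: float   # bandwidth not consumed by existing data volumes
    spare_iops: float   # transaction rate not consumed by existing data volumes


def choose_arrays(arrays: list[ArrayState], size_gb: float,
                  mb_s: float, iops: float) -> Optional[list[str]]:
    """Return the smallest group of arrays that can hold an evenly striped volume.

    Each of k arrays must supply 1/k of every requirement (an assumption of this sketch).
    Returns None when no grouping satisfies the requested parameters.
    """
    for k in range(1, len(arrays) + 1):
        fits = [a for a in arrays
                if a.free_gb >= size_gb / k
                and a.spare_mb_s >= mb_s / k
                and a.spare_iops >= iops / k]
        if len(fits) >= k:
            # Prefer the arrays with the most spare bandwidth to balance the load.
            fits.sort(key=lambda a: a.spare_mb_s, reverse=True)
            return [a.name for a in fits[:k]]
    return None


if __name__ == "__main__":
    csa = [ArrayState("A", 500, 100, 1000),
           ArrayState("B", 200, 80, 800),
           ArrayState("C", 50, 300, 5000)]
    print(choose_arrays(csa, size_gb=300, mb_s=150, iops=1500))
```

Even this simplified check must be repeated for every candidate grouping and repeated again whenever usage of the storage arrays changes, which suggests why the manual procedure is time-consuming and error-prone.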
The user may intentionally overestimate the necessary parameters for the data volume in order to account for errors in the analysis of the available storage arrays and to reduce the need to make future changes to the data volume as usage of the storage arrays by any other applications executing on the host devices also changes. Such errors may occur since the other applications utilizing the same storage arrays may not always access the data volumes stored thereon to the fullest extent anticipated when the data volumes were created. Thus, the user may get a false view of the available bandwidth or transaction rate capacity of some of the storage arrays. Subsequently, when performance of the applications and usage of the data volumes are at a peak, unacceptable degradation of the performance of the applications may occur.
Additionally, some of the other applications utilizing the same storage arrays may change the usage of their existing data volumes in the CSA, and new applications may be started on the host devices with new data volumes created in the CSA. As a result, overall usage of the storage arrays can change suddenly. The user must, therefore, be aware of whether any of the storage arrays are nearing or have surpassed their maximum capacity. In this case, the user may have to change the combination of storage arrays on which the data volume resides. Therefore, after creating the data volume, the user must continuously monitor the performance of the CSA to ensure that the storage arrays are servicing the application according to the required parameters for data access. Before changing the combination of storage arrays for the data volume, however, the user must repeat the time-consuming analysis of the CSA to determine which storage arrays are available and which combination of storage arrays will safely meet the necessary parameters. Then, when a new combination of storage arrays has been chosen, the user must carefully orchestrate the transfer of affected portions of the data volume (preferably during off-peak times) to avoid undesirable effects on the performance of the application. Additionally, if more than one host device is executing an application that needs to access the changed data volume, then the user must reconfigure the striping software of each of these host devices with the new definition of the data volume so that each host device can access the correct storage arrays for the data volume.
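The monitoring burden described above amounts to repeatedly comparing the measured load on each storage array against its maximum capacity. A minimal sketch of such a check follows; the 90% warning threshold, the function name and the shape of the `usage` mapping are assumptions made for illustration only.

```python
def arrays_needing_attention(usage: dict[str, tuple[float, float]],
                             threshold: float = 0.9) -> dict[str, str]:
    """Flag storage arrays that are nearing or have surpassed their maximum capacity.

    usage maps an array name to (current load, maximum capacity) in the same unit,
    e.g. MB/s for bandwidth or requests/s for transaction rate.
    """
    flagged = {}
    for name, (load, capacity) in usage.items():
        ratio = load / capacity
        if ratio >= 1.0:
            flagged[name] = "surpassed"
        elif ratio >= threshold:
            flagged[name] = "nearing"
    return flagged


if __name__ == "__main__":
    sample = {"A": (50.0, 100.0), "B": (95.0, 100.0), "C": (120.0, 100.0)}
    print(arrays_needing_attention(sample))
```

A flagged storage array is only the starting point: the user must still repeat the full analysis of the CSA and migrate the affected portions of the data volume by hand.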
It is with respect to these and other background considerations that the present invention has evolved.