1. Technical Field
This application generally relates to controlling multi-step storage management operations.
2. Description of Related Art
Computer systems may include different resources used by one or more host processors. Resources and host processors in a computer system may be interconnected by one or more communication connections. These resources may include, for example, data storage devices such as those included in the data storage systems manufactured by EMC Corporation. These data storage systems may be coupled to one or more servers or host processors and provide storage services to each host processor. Multiple data storage systems from one or more different vendors may be connected and may provide common data storage for one or more host processors in a computer system.
A host processor may perform a variety of data processing tasks and operations using the data storage system. For example, a host processor may perform basic system I/O operations in connection with data requests, such as data read and write operations.
Host processor systems may store and retrieve data using a storage device containing a plurality of host interface units, disk drives, and disk interface units. The host systems access the storage device through a plurality of channels provided therewith. Host systems provide data and access control information through the channels to the storage device and the storage device provides data to the host systems also through the channels. The host systems do not address the disk drives of the storage device directly, but rather, access what appears to the host systems as a plurality of logical disk units. The logical disk units may or may not correspond to the actual disk drives. Allowing multiple host systems to access the single storage device unit allows the host systems to share data in the device. In order to facilitate sharing of the data on the device, additional software on the data storage systems may also be used.
RAID (Redundant Array of Independent or Inexpensive Disks) parity schemes may be utilized to provide error detection during the transfer and retrieval of data across a storage system (also known as a storage array or array).
Several levels of RAID systems have been defined in the industry. The first level, RAID-0, combines two or more drives to create a larger virtual disk. In a dual drive RAID-0 system one disk contains the low numbered sectors or blocks and the other disk contains the high numbered sectors or blocks, forming one complete storage space. RAID-0 systems generally interleave the sectors of the virtual disk across the component drives, thereby improving the bandwidth of the combined virtual disk. Interleaving the data in that fashion is referred to as striping. RAID-0 systems provide no redundancy of data, so if a drive fails or data becomes corrupted, no recovery is possible short of backups made prior to the failure.
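As a minimal sketch of striping, assuming a simple round-robin layout with a fixed stripe unit measured in blocks (the function name, its parameters, and the layout itself are illustrative assumptions, not taken from any particular array), a logical block of the virtual disk could be mapped to a component drive as follows:

```python
def raid0_locate(logical_block, num_disks, stripe_unit_blocks=1):
    """Map a logical block of the virtual disk to (disk index, block on that disk).

    Hypothetical helper: assumes round-robin striping with a fixed stripe unit.
    """
    if logical_block < 0 or num_disks < 1 or stripe_unit_blocks < 1:
        raise ValueError("invalid striping parameters")
    stripe_unit = logical_block // stripe_unit_blocks    # index of the stripe unit overall
    offset_in_unit = logical_block % stripe_unit_blocks  # position inside that unit
    disk = stripe_unit % num_disks                       # units rotate across the disks
    row = stripe_unit // num_disks                       # stripe row on the chosen disk
    return disk, row * stripe_unit_blocks + offset_in_unit
```

With two disks and a stripe unit of one block, consecutive logical blocks alternate between the drives, which is why sequential transfers can draw on the bandwidth of both.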
RAID-1 systems include one or more disks that provide redundancy of the virtual disk. One disk is required to contain the data of the virtual disk, as if it were the only disk of the array. One or more additional disks contain the same data as the first disk, providing a “mirror” of the data of the virtual disk. A RAID-1 system will contain at least two disks, the virtual disk being the size of the smallest of the component disks. A disadvantage of RAID-1 systems is that a write operation must be performed for each mirror disk, reducing the bandwidth of the overall array. In a dual drive RAID-1 system, the first disk and the second disk contain the same sectors or blocks, each disk holding exactly the same data.
RAID-2 systems provide for error correction through Hamming codes. The component drives each contain a particular bit of a word, or an error correction bit of that word. RAID-2 systems automatically and transparently detect and correct single-bit defects, or single drive failures, while the array is running. Although RAID-2 systems improve the reliability of the array over other RAID types, they are less popular than some other systems due to the expense of the additional drives and the redundant onboard hardware error correction.
RAID-4 systems are similar to RAID-0 systems, in that data is striped over multiple drives. For example, the storage spaces of two disks are added together in interleaved fashion, while a third disk contains the parity of the first two disks. RAID-4 systems are unique in that they include an additional disk containing parity. For each byte of data at the same position on the striped drives, parity is computed over the bytes of all the drives and stored to the parity disk. The XOR operation is used to compute parity, providing a fast and symmetric operation that can regenerate the data of a single drive, given that the data of the remaining drives remains intact. RAID-3 systems are essentially RAID-4 systems with the data striped at byte boundaries, and for that reason RAID-3 systems are generally slower than RAID-4 systems in most applications. RAID-4 and RAID-3 systems therefore are useful to provide virtual disks with redundancy, and additionally to provide large virtual drives, both with only one additional disk drive for the parity information. They have the disadvantage that the data throughput is limited by the throughput of the drive containing the parity information, which must be accessed for every read and write operation to the array.
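The XOR parity computation described above can be sketched in a few lines. This is an illustrative example only: the function names are hypothetical, and each drive is modeled as a short byte string of equal length.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields, for each byte position, the bytes of all drives.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Regenerate the single missing data block from the survivors and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])
```

Because XOR is its own inverse, the same routine serves both to compute parity and to regenerate one lost drive, which is the symmetry the paragraph above refers to.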
RAID-5 systems are similar to RAID-4 systems, with the difference that the parity information is striped over all the disks with the data. For example, first, second, and third disks may each contain data and parity in interleaved fashion. Distributing the parity data generally increases the throughput of the array as compared to a RAID-4 system. RAID-5 systems may continue to operate though one of the disks has failed. RAID-6 systems are like RAID-5 systems, except that dual parity is kept to provide for normal operation despite the failure of up to two drives.
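One way to picture distributed parity is a rotation in which the parity block moves to a different disk on each stripe row. The sketch below assumes one such rotation, starting at the last disk and moving backwards; real arrays use a variety of layouts, and the function names here are hypothetical.

```python
def raid5_parity_disk(stripe_row, num_disks):
    """Disk holding parity for a stripe row under the assumed rotation."""
    if num_disks < 3:
        raise ValueError("this sketch assumes at least three disks")
    return (num_disks - 1 - stripe_row) % num_disks


def raid5_data_disks(stripe_row, num_disks):
    """Disks holding data for a stripe row, i.e. every disk except the parity disk."""
    parity = raid5_parity_disk(stripe_row, num_disks)
    return [disk for disk in range(num_disks) if disk != parity]
```

Since no single disk holds all of the parity, parity updates are spread across the drives rather than concentrated on one.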
Combinations of RAID systems are also possible. For example, a four disk RAID 1+0 system provides a concatenated file system that is also redundant. The first and second disks are mirrored, as are the third and fourth disks. The combination of the mirrored sets forms a storage space that is twice the size of one individual drive, assuming that all four are of equal size. Many other combinations of RAID systems are possible.
In at least some cases, when a logical volume is configured so that its data is written across multiple disk drives in the striping technique, the logical volume is operating in RAID-0 mode. Alternatively, if the logical volume's parity information is stored on one disk drive and its data is striped across multiple other disk drives, the logical volume is operating in RAID-3 mode. If both data and parity information are striped across multiple disk drives, the logical volume is operating in RAID-5 mode.
In a common implementation, a Storage Area Network (SAN) is used to connect computing devices with a large number of storage devices. Management and modeling programs may be used to manage these complex computing environments.
The Storage Management Initiative Specification (SMI-S) and Common Information Model (CIM) technologies are widely used for managing storage devices and storage environments. CIM is described further below. The SMI-S is a standard management interface that allows different classes of hardware and software products to interoperate for monitoring and controlling resources. For example, the SMI-S permits storage management systems to identify, classify, monitor, and control physical and logical resources in a SAN. The SMI-S is based on CIM and the Web-Based Enterprise Management (WBEM) architecture. CIM is a model for describing management information, and WBEM is an architecture for using Internet technologies to manage systems and networks. The SMI-S uses CIM to define objects that represent storage entities such as Logical Unit Numbers (LUNs), disks, storage subsystems, switches, and hosts. (In many, but not all cases, the term “volume” or “logical volume” is interchangeable with the term “LUN”.) CIM also defines the associations that may or may not exist between these objects, such as a disk being associated to a storage subsystem because it physically resides in the storage subsystem.
The CIM objects mentioned above may be managed by a CIM object manager (CIMOM). A storage management software application can use a CIM client to connect to a CIMOM, to retrieve information about the storage entities that the CIMOM manages, and also to perform active configuration of the storage entities. Storage management software that uses a CIM client may be called a CIM client application.
For example, SMI-S describes how a current storage LUN is mapped. A CIM server is a CIMOM and a set of CIM providers. The SMI-S describes several methods for assigning a LUN from a disk storage system to a host, or for adding a LUN to a disk storage system.
For example, the SMI-S describes how to add a LUN to a disk storage system, wherein the method CreateOrModifyElementFromStoragePool( ) in the StorageConfigurationService object is used to create a LUN (or storage volume) given the LUN type, the size of the LUN, a storage pool CIM object path and the StorageConfigurationService. The resulting LUN can then be assigned to a host or several hosts available to the disk storage system. Details of the CreateOrModifyElementFromStoragePool( ) method are as follows:
CreateOrModifyElementFromStoragePool
    uint32 CreateOrModifyElementFromStoragePool(
        [in, Values {“Unknown”, “Reserved”, “StorageVolume”, “StorageExtent”, “DMTF Reserved”, “Vendor Specific”},
             ValueMap {“0”, “1”, “2”, “3”, “ . . . ”, “0x8000 . . . ”}] uint16 ElementType,
        [out] CIM_ConcreteJob ref Job,
        [in] CIM_StorageSetting ref Goal,
        [in, out] uint64 Size,
        [in] CIM_StoragePool ref InPool,
        [out, in] CIM_LogicalElement ref Element);
This method allows an element of a type specified by the enumeration ElementType to be created from the input storage pool. The parameters are as follows:
ElementType: This enumeration specifies what type of object to create. StorageVolume and StorageExtent are defined as values.
Job: Reference to the completed job.
Goal: This is the service level that the storage volume is expected to provide. The setting must be a subset of the capabilities available from the parent storage pool. Goal may be a null value in which case the default setting for the pool will be used.
Size: As an input, this is the desired size of the storage volume. If it is not possible to create a volume of the desired size, a return code of “Size not supported” will be returned with Size set to the nearest supported size.
InPool: This is a reference to a source storage pool.
Element: If a reference is passed in, then that element is modified, otherwise this is a reference to the created element.
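The behavior of the Size parameter described above can be illustrated with a small in-memory model. This is a toy sketch, not a CIM client: the pool class, the function name, the extent-based rounding rule, and the numeric return codes are all assumptions made for illustration, and neither the Job output nor the modify-existing-element path is modeled.

```python
COMPLETED_OK = 0            # assumed numeric return code, for illustration only
SIZE_NOT_SUPPORTED = 4097   # assumed numeric return code, for illustration only
STORAGE_VOLUME = 2          # "StorageVolume" in the ValueMap of the signature


class ToyStoragePool:
    """Hypothetical in-memory pool that can only allocate whole extents."""

    def __init__(self, free_bytes, extent_bytes):
        self.free_bytes = free_bytes
        self.extent_bytes = extent_bytes


def create_element_from_pool(element_type, size, in_pool, goal=None):
    """Toy model of the call; returns (return_code, out_params).

    goal is accepted but ignored; None stands for the pool's default setting.
    """
    if element_type != STORAGE_VOLUME:
        raise NotImplementedError("this sketch only models StorageVolume")
    if size <= 0:
        raise ValueError("size must be positive")
    extent = in_pool.extent_bytes
    largest = (in_pool.free_bytes // extent) * extent
    # Round to the nearest whole number of extents, at least one extent.
    nearest = max(extent, ((size + extent // 2) // extent) * extent)
    nearest = min(nearest, largest)
    if nearest != size:
        # Size acts as an output here: it reports the nearest supported size.
        return SIZE_NOT_SUPPORTED, {"Size": nearest, "Element": None}
    in_pool.free_bytes -= size
    return COMPLETED_OK, {"Size": size, "Element": f"volume-{size}"}
```

A caller would typically retry with the returned Size when the first attempt reports that the requested size is not supported.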
Generally, there is substantial complexity when using the CIM object model to create a LUN. For example, the StoragePool object does not have a direct association to the ComputerSystem's StorageConfigurationService. To associate a StoragePool to a StorageConfigurationService, the StoragePool object must first be associated to the ComputerSystem, and the ComputerSystem must then be associated to the StorageConfigurationService. Also, because the ComputerSystem can represent more than just a disk storage system, the correct ComputerSystem must be located before making the association to the StorageConfigurationService.
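The two-hop lookup described above can be sketched over a toy in-memory model, in which each object is a (class name, instance name) pair and each association is a pair of objects. The representation and function names are hypothetical; a real client would issue association queries to a CIMOM instead.

```python
def associated(source, associations, result_class):
    """Return objects of result_class linked to source by any association."""
    found = []
    for a, b in associations:
        # Associations are treated as undirected in this sketch.
        for here, there in ((a, b), (b, a)):
            if here == source and there[0] == result_class:
                found.append(there)
    return found


def find_configuration_service(pool, associations):
    """Walk StoragePool -> ComputerSystem -> StorageConfigurationService."""
    for system in associated(pool, associations, "ComputerSystem"):
        services = associated(system, associations, "StorageConfigurationService")
        if services:
            # Only a system that hosts a configuration service is the right one.
            return system, services[0]
    return None
```

The loop over candidate systems reflects the point made above: several ComputerSystem objects may be reachable from the pool, and only one of them leads to the configuration service.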
Developing and unifying management standards for desktop, enterprise and Internet environments is a main goal of the Distributed Management Task Force Inc. (DMTF). DMTF standards are platform-independent and technology neutral, and facilitate cost effective system management. The DMTF's CIM standard is an object-oriented management information model that unifies and extends existing management standards, such as for example, Simple Network Management Protocol (SNMP), Desktop Management Interface (DMI), and Common Management Information Protocol (CMIP). The CIM specification defines the syntax and rules of the model and how CIM can be integrated with other management models, while the CIM schema comprises the descriptions of the models.
The CIM standard schema may define thousands of classes with properties and associations for logical and physical modeling. The schema may represent one or many components of an information handling system including, but not limited to, fans, power supplies, processors, and firmware. The CIM schema class definitions also include methods. Organization of the classes is accomplished by use of namespaces, which function as logical databases. DMTF Profiles are specifications that define the CIM model and associated behavior for a management domain. The profiles define requirements regarding the classes and associations used to represent the management information in a given management domain. Generally, within a CIMOM, profiles are implemented by different providers in one or more namespaces. The CIMOM provides an interface, which allows a provider to expose the instances of CIM classes and a client application to read and/or write properties and invoke methods.
Many of the CIM methods include management tasks, such as, for example but not limited to, updates and diagnostics. Many of the methods and tasks/jobs may require a long period of time in order to be completed. As used herein, the words “task” and “job” may be used interchangeably. In a CIM environment, a provider may return a job handle to a client using the “Job” output parameter on the invoked CIM method, thereby effectively making the invocation asynchronous. The job handle may be represented by a CIM reference to an instance of a CIM class arbitrarily named CIM_ConcreteJob. The reference may be used at any time by a client to request an actual instance of CIM_ConcreteJob, and to check the status of a job.
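The asynchronous pattern described above, in which a client receives a job handle and later checks its status, can be sketched as a polling loop. The ToyJob class is a hypothetical stand-in for an instance of CIM_ConcreteJob, and the state strings are assumptions chosen for illustration.

```python
import time


class ToyJob:
    """Hypothetical stand-in for a CIM_ConcreteJob instance."""

    def __init__(self, polls_until_done):
        self._remaining = polls_until_done

    def get_state(self):
        """Report "Running" for a fixed number of polls, then "Completed"."""
        if self._remaining > 0:
            self._remaining -= 1
            return "Running"
        return "Completed"


def wait_for_job(job, poll_interval=0.0, max_polls=100):
    """Poll the job until it leaves the running state; return (state, polls used)."""
    for attempt in range(1, max_polls + 1):
        state = job.get_state()
        if state != "Running":
            return state, attempt
        time.sleep(poll_interval)
    raise TimeoutError("job did not finish within max_polls")
```

Bounding the number of polls keeps a client from waiting indefinitely on a job that never completes.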
DMTF also specifies CIM operations over HTTP, which include CIM multiple operations. A multiple operation is defined as one that requires the invocation of more than one CIM method. A multiple operation request is represented by a <MULTIREQ> element, and a multiple operation response by a <MULTIRSP> element. A <MULTIREQ> (respectively, <MULTIRSP>) element is a sequence of two or more <SIMPLEREQ> (respectively, <SIMPLERSP>) elements. A <MULTIRSP> element contains a <SIMPLERSP> element for every <SIMPLEREQ> element in the corresponding Multiple Operation Request, and these <SIMPLERSP> elements are in the same order as their <SIMPLEREQ> counterparts (so the first <SIMPLERSP> in the response corresponds to the first <SIMPLEREQ> in the request, and so forth).
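The positional correspondence between requests and responses can be sketched with the Python standard library XML module. The element contents are deliberately simplified: each <SIMPLEREQ> here carries only a bare <IMETHODCALL> with a NAME attribute, so the result is not a complete CIM-XML message, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_multireq(method_names):
    """Build a simplified <MULTIREQ> holding one <SIMPLEREQ> per method name."""
    if len(method_names) < 2:
        raise ValueError("a multiple operation needs two or more simple requests")
    multireq = ET.Element("MULTIREQ")
    for name in method_names:
        simplereq = ET.SubElement(multireq, "SIMPLEREQ")
        ET.SubElement(simplereq, "IMETHODCALL", {"NAME": name})
    return multireq


def pair_by_position(multireq, multirsp):
    """Pair each <SIMPLEREQ> with the <SIMPLERSP> at the same position."""
    requests = multireq.findall("SIMPLEREQ")
    responses = multirsp.findall("SIMPLERSP")
    if len(requests) != len(responses):
        raise ValueError("response count does not match request count")
    return list(zip(requests, responses))
```

Because the pairing relies purely on order, a client needs no per-request identifier to match a response to the request that produced it.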
Multiple operations provide a convenient mechanism whereby multiple method invocations may be batched into a single HTTP Message, thereby reducing the number of roundtrips between a CIM client and a CIM server and allowing the CIM server to make internal optimizations. Multiple operations do not confer any transactional capabilities in the processing of the request (for example, there is no requirement that the CIM server guarantee that the constituent method calls either all failed or all succeeded, only that the entity make a “best effort” to process the operation). However, servers process each operation in a batched operation to completion before executing the next operation in the batch. Thus the order of operations specified within a batched operation is significant.
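The processing semantics described above (each operation run to completion in order, with no transactional guarantee) can be sketched as follows. The operations are modeled as plain Python callables and the function name is hypothetical; this illustrates the ordering and best-effort behavior only, not a CIM server.

```python
def run_batch(operations):
    """Run callables in order; record each outcome without rolling anything back."""
    results = []
    for operation in operations:
        try:
            # Each operation finishes before the next one starts.
            results.append(("ok", operation()))
        except Exception as error:
            # A failure is recorded, but earlier effects are kept and
            # later operations still run ("best effort", not a transaction).
            results.append(("error", str(error)))
    return results
```

Because later operations observe the effects of earlier ones, reordering the batch can change its results, which is why the order of operations is significant.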
In general, tasks such as assigning a LUN from a disk storage system to a host, and adding a LUN to a disk storage system, can be complex to execute. Other example tasks may include otherwise allocating storage, specifying the logical and/or physical devices used for the storage allocation, specifying whether the data should be replicated, the particular RAID level, and the like.