Businesses employ large-scale data centers for storing and processing their business-critical data. These data centers often consist of a layer of hosts (e.g., computer servers) coupled to a layer of data storage subsystems via a storage area network (SAN). This background and the detailed description below are described with reference to a data center consisting of a single host coupled to a layer of data storage subsystems via a SAN, it being understood that the present invention should not be limited thereto.
FIG. 1 shows in block diagram form relevant components of an exemplary data center 10. The operational aspects of data center 10 described below should not be considered prior art to the claims set forth herein. Data center 10 includes host 12 coupled to data storage subsystems 16-20 via SAN 22. SAN 22 may consist of several devices (e.g., routers, switches, etc.) for transmitting input/output (IO) transactions or other data between host 12 and data storage subsystems 16-20. For purposes of explanation, FIG. 1 shows one device (e.g., a switch) 14 within SAN 22, it being understood that the term SAN should not be limited thereto.
Each of the data storage subsystems 16-20 includes several physical storage devices. For purposes of explanation, the physical storage devices take form in hard disks, it being understood that the term physical storage device should not be limited to hard disks. Data storage subsystems 16-20 may take different forms. For example, data storage subsystem 16 may consist of “just a bunch of disks” (JBOD) connected to and accessible by an array controller card. Data storage subsystem 18 may consist of an intelligent disk array. Data storage subsystem 20 may consist of a block server appliance. For purposes of explanation, each of the data storage subsystems 16-20 will take form in an intelligent disk array (hereinafter disk array), it being understood that the term data storage subsystem should not be limited thereto.
As noted, each of the disk arrays 16-20 includes several hard disks. The hard disk is the most popular storage device currently used. A hard disk's total storage capacity is divided into many small chunks called physical memory blocks. For example, a 10 GB hard disk contains millions of physical memory blocks, with each block able to hold 512 bytes of data. Any random physical memory block can be written to or read from in about the same amount of time, without having to first read from or write to other physical memory blocks. Once written, a physical memory block continues to hold data even after the hard disk is powered down.
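As a rough check on the block count mentioned above, the number of physical memory blocks follows from dividing the disk capacity by the block size. The sketch below assumes a decimal gigabyte (10^9 bytes) and also shows the binary alternative (2^30 bytes); the function name block_count is illustrative only.

```python
BLOCK_SIZE_BYTES = 512  # block size given in the text


def block_count(capacity_bytes: int, block_size: int = BLOCK_SIZE_BYTES) -> int:
    """Return the number of whole physical memory blocks in the given capacity."""
    return capacity_bytes // block_size


decimal_10gb = 10 * 10**9  # assumption: 1 GB = 10^9 bytes
binary_10gb = 10 * 2**30   # alternative: 1 GB = 2^30 bytes

print(block_count(decimal_10gb))  # 19531250
print(block_count(binary_10gb))   # 20971520
```

Under either definition, a 10 GB hard disk holds roughly twenty million 512-byte blocks.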
Host 12 includes an application 26 which is configured to generate IO transactions for accessing data in one or more logical data volumes (more fully described below). Host 12 also includes a storage manager 30 coupled to volume descriptions memory 40 and volume specifications memory 50. FIG. 1 also shows that each of the disk arrays 16-20 has its own storage manager (storage managers 32-36, respectively). The storage managers in disk arrays 16-20 are coupled to respective virtual disk description memories 42-46. Each of the storage managers shown in FIG. 1 may take form in software instructions executing on one or more processors. Volume Manager™ provided by VERITAS Software Corporation of Mountain View, Calif., is the exemplary storage manager, it being understood that the term storage manager should not be limited thereto.
Storage managers can create storage objects. For example, storage managers can create storage objects called virtual disks from hard disks. To illustrate, storage managers in disk arrays 16-20 logically aggregate hard disks to create virtual disks. Virtual disks typically have better characteristics (e.g., higher storage capacity, greater effective data transfer rates, etc.) than individual hard disks. Storage managers can also logically aggregate virtual disks to create other storage objects. For example, storage managers can aggregate virtual disks to create storage objects called data volumes. To illustrate, storage manager 30 can logically aggregate virtual disks provided by disk arrays 16-20 to create a volume VE, more fully described below.
Virtual disks and other storage objects are abstractions and each can be viewed as an array of logical memory blocks that store or are configured to store data. While it is said that a logical memory block stores or is configured to store data, in reality the data is stored in at least one physical memory block of a hard disk mapped directly or indirectly to the logical memory block. Configuration maps or algorithms may be used to map logical memory blocks of a virtual disk or other storage object to physical memory blocks.
As noted, storage manager 30 can aggregate virtual disks provided by disk arrays 16-20 to form logical volume VE. In general, logical volumes are presented for direct or indirect access by an application such as application 26 executing on host 12. Application 26 can generate IO transactions to read data from or write data to logical memory blocks of a data volume without knowing that the data volume is a logical aggregation of underlying virtual disks, which in turn may be logical aggregations of hard disks within disk arrays 16-20.
Logical volumes are created by storage manager 30 according to the requirements of specifications (also known as intents) provided thereto. Logical volume specifications define how underlying virtual disks are to be aggregated. The more common forms of aggregation include concatenated storage, striped storage, mirrored storage, or RAID storage. A more complete discussion of how virtual disks can be aggregated can be found within Dilip M. Ranade [2002], “Shared Data Clusters,” Wiley Publishing, Inc., which is incorporated herein by reference in its entirety. Specifications may further include aggregation rules that ensure a desired performance (e.g., greater effective data transfer rates) and/or data availability. For example, a specification for a mirrored volume may have a fault tolerance rule, i.e., a rule which requires that the constituent mirrors do not share hard disks, in order to ensure data availability notwithstanding a failure of a hard disk used to store data of the mirrored volume. A specification for a striped volume may have a disk confinement rule which requires that each column is formed directly or indirectly from hard disks contained in a single disk array. A specification for a volume may have a performance-based rule which requires underlying hard disks of the volume to be accessed through respective disk array controllers. The overall speed for accessing data will be greater if each underlying hard disk is accessed via a respective disk array controller when compared to the access speed for underlying hard disks which share disk array controllers. Other volume aggregation rules are contemplated.
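To make the notion of an aggregation rule concrete, the following minimal sketch checks a disk confinement rule for a striped volume. It is an illustration only and not the method of any particular storage manager: the function confined_columns, the hard disk names da-dd, and the array labels are all hypothetical.

```python
from typing import Dict, List


def confined_columns(columns: Dict[str, List[str]],
                     array_of: Dict[str, str]) -> Dict[str, bool]:
    """For each column, report whether all of its hard disks sit in one disk array."""
    return {
        name: len({array_of[disk] for disk in disks}) == 1
        for name, disks in columns.items()
    }


# Hypothetical placement of hard disks in disk arrays.
array_of = {"da": "array16", "db": "array16", "dc": "array18", "dd": "array18"}

# Each column drawn from a single disk array: the rule is satisfied.
confined = {"col0": ["da", "db"], "col1": ["dc", "dd"]}
# Each column spans two disk arrays: the rule is violated.
spread = {"col0": ["da", "dc"], "col1": ["db", "dd"]}

print(confined_columns(confined, array_of))  # {'col0': True, 'col1': True}
print(confined_columns(spread, array_of))    # {'col0': False, 'col1': False}
```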
A logical volume description is created for each logical volume. Logical volume descriptions may be stored in volume descriptions memory 40. In general, a logical volume description defines the relationship of a logical volume to its underlying virtual disks or other storage objects. The description may include a configuration map or algorithm that can be used to map each logical memory block of the logical volume to one or more logical blocks of one or more underlying virtual disks or other storage objects. Storage manager 30 uses configuration maps or algorithms to translate IO transactions that access a logical volume into one or more IO transactions that access one or more underlying virtual disks or other storage objects. Consider, for example, a two-way mirrored volume VE created by storage manager 30. First and second mirrors of volume VE are formed from logical storage in virtual disks M1E and M2E, respectively, provided by disk array 16. Volume VE is structured to consist of nmax logical memory blocks. Storage manager 30 creates a configuration map for volume VE and stores the configuration map into memory 40. The configuration map maps each logical block x of volume VE to respective logical blocks x in virtual disks M1E and M2E. When storage manager 30 receives an IO transaction to write data D to logical memory block 3 of volume VE, storage manager 30 accesses the configuration map for volume VE to learn that logical memory block 3 in volume VE is mapped to respective logical blocks 3 in virtual disks M1E and M2E. Storage manager 30 can then generate separate IO transactions to write data D to block 3 in virtual disks M1E and M2E.
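The mirrored-volume example above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of storage manager 30: the functions build_mirror_map and translate_write are hypothetical, and nmax is given a small illustrative value.

```python
from typing import Dict, List, Tuple

# A target is (virtual disk name, logical block number on that disk).
Target = Tuple[str, int]


def build_mirror_map(n_blocks: int, mirrors: List[str]) -> Dict[int, List[Target]]:
    """Map each logical block x of the volume to block x on every mirror."""
    return {x: [(disk, x) for disk in mirrors] for x in range(n_blocks)}


def translate_write(config_map: Dict[int, List[Target]],
                    block: int, data: bytes) -> List[Tuple[str, int, bytes]]:
    """Translate one volume-level write into one write per underlying mirror."""
    return [(disk, target, data) for disk, target in config_map[block]]


NMAX = 8  # hypothetical small value of nmax
ve_map = build_mirror_map(NMAX, ["M1E", "M2E"])

# A write of data D to logical block 3 of volume VE becomes two writes.
writes = translate_write(ve_map, 3, b"D")
print(writes)  # [('M1E', 3, b'D'), ('M2E', 3, b'D')]
```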
A virtual disk description is created for each virtual disk created in the disk arrays 16-20. These virtual disk descriptions may be stored in virtual disk description memories 42-46. The virtual disk description defines the relationship of a virtual disk to its underlying hard disks. The virtual disk description may also include a configuration map or algorithm that can be used to map each logical memory block of the virtual disk to one or more physical memory blocks of one or more underlying hard disks. Storage managers 32-36 use configuration maps or algorithms to translate IO transactions that access a virtual disk into one or more IO transactions that access one or more underlying hard disks. Consider, for example, concatenated virtual disk M1E created by storage manager 32 from underlying hard disks d1 and d2 (not shown) of disk array 16. Virtual disk M1E consists of nmax logical memory blocks. Storage manager 32 creates a configuration map for virtual disk M1E and stores the configuration map into memory 42. The configuration map maps each logical block x of virtual disk M1E to a physical block y in hard disk d1 or d2. When storage manager 32 receives an IO transaction to write data D to, for example, logical memory block 3 of virtual disk M1E, storage manager 32 accesses the configuration map for virtual disk M1E to learn that logical memory block 3 is mapped to, for example, physical memory block 565 in hard disk d2. Storage manager 32 can then generate an IO transaction to write data D to block 565 in hard disk d2.
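One possible form of configuration map for a concatenated virtual disk is an ordered list of extents, each naming a hard disk, a starting physical block, and a length. The sketch below assumes that representation. The Extent type, the resolve function, and the extent sizes are hypothetical; the sizes are chosen only so that logical block 3 lands on physical block 565 of hard disk d2, as in the example above.

```python
from typing import List, NamedTuple, Tuple


class Extent(NamedTuple):
    disk: str    # underlying hard disk
    start: int   # first physical block of the extent
    length: int  # number of blocks in the extent


def resolve(extents: List[Extent], logical_block: int) -> Tuple[str, int]:
    """Map a logical block of a concatenated virtual disk to (hard disk, physical block)."""
    if logical_block < 0:
        raise IndexError("logical block must be non-negative")
    offset = logical_block
    for extent in extents:
        if offset < extent.length:
            return extent.disk, extent.start + offset
        offset -= extent.length  # skip past this extent
    raise IndexError("logical block beyond end of virtual disk")


# Hypothetical layout: logical blocks 0-1 on d1, logical blocks 2-7 on d2.
m1e = [Extent("d1", 100, 2), Extent("d2", 564, 6)]

print(resolve(m1e, 3))  # ('d2', 565)
```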
It is noted that the configuration of virtual disks or other storage objects can change over time. For example, data in hard disk d2 of virtual disk M1E described above may be evacuated to hard disk d3 in disk array 16. When the configuration of a virtual disk changes, the corresponding virtual disk description is updated to reflect the changes. Storage managers of the data center shown in FIG. 1 are independent and do not communicate with each other after a virtual disk reconfiguration, and this can lead to problems. To illustrate, presume that storage manager 30 creates two-way mirrored volume VE consisting of nmax logical memory blocks. The specification for mirrored volume VE requires fault tolerance (i.e., each mirror must be formed from virtual disks that do not share underlying hard disks). The mirrors of volume VE are formed from logical storage space of virtual disks M1E and M2E, respectively, provided by disk array 16. Further, virtual disk M1E is created as a concatenation of storage in hard disks d1 and d2 within disk array 16, while virtual disk M2E is created as a concatenation of storage in hard disks d3 and d4 within disk array 16. Presume that hard disks d3 and d4 are large when compared to hard disks d1 and d2 and contain a substantial portion of unused physical storage space. Since virtual disks M1E and M2E do not share hard disks to store mirrored volume VE data, volume VE is initially consistent with its fault tolerance rule.
Presume that after creation of mirrored volume VE, storage manager 32 evacuates volume VE data from hard disk d1 to hard disk d4. Once the evacuation is completed, storage manager 32 updates the description for virtual disk M1E to indicate that it is an aggregation of hard disks d2 and d4. Host 12, however, is not made aware of the reconfiguration of virtual disk M1E. As a result of the reconfiguration, both mirrors now store data on hard disk d4, which violates mirrored volume VE's specification rule that volume VE's mirrors do not share hard disks.
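The violation described above can be illustrated with a small check that compares the hard disks underlying each mirror before and after the evacuation. This is a sketch only; the function shared_disks is hypothetical and does not correspond to any facility of the storage managers described herein.

```python
from typing import Dict, Set


def shared_disks(mirror_disks: Dict[str, Set[str]]) -> Set[str]:
    """Return the hard disks that underlie more than one mirror."""
    seen: Set[str] = set()
    shared: Set[str] = set()
    for disks in mirror_disks.values():
        shared |= seen & disks  # disks already used by an earlier mirror
        seen |= disks
    return shared


# Hard disks underlying each mirror of volume VE, per the example in the text.
before = {"M1E": {"d1", "d2"}, "M2E": {"d3", "d4"}}
after = {"M1E": {"d2", "d4"}, "M2E": {"d3", "d4"}}  # d1 evacuated to d4

print(shared_disks(before))  # set()
print(shared_disks(after))   # {'d4'}
```

An empty result means the fault tolerance rule holds; a non-empty result names the hard disks whose failure would affect both mirrors.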