1. Field of the Invention
This invention relates in general to storage systems, and more particularly to a method, apparatus and program storage device for providing geographically isolated server failover between mirrored virtual disks using an instant RAID swapping technique.
2. Description of Related Art
A computer network is a connection of points (e.g., a plurality of computers) that have been interconnected by a series of communication paths. Moreover, any number of individual computer networks may be interconnected with other computer networks, which may increase the complexity of the overall system. Generally, computer networks may be used to increase the productivity of those computers that are connected to the network. The interconnection of the various points on the computer network may be accomplished using a variety of known topologies. Generally, a host computer (e.g., server) may function as a centralized point on the network. For example, using any of the known network topologies, a plurality of client computers may be interconnected such that the server controls the movement of data across the network. The host computer may have an operating system that may be used to execute a server application program that is adapted to support multiple clients. Typically, the server may service requests from a plurality of client computers that are connected to the network. Furthermore, the server may be used to administer the network. For example, the server may be used to update user profiles, establish user permissions, and allocate space on the server for a plurality of clients connected to the network.
In many computer networks, a large amount of data may be stored on the server and accessed by the attached client computers. For example, each client computer may be assigned a variable amount of storage space on a server. The administration of a storage system is often a complex task that requires a great deal of software and hardware knowledge on the part of the administrator. Given a pool of storage resources and a workload, an administrator must determine how to automatically choose storage devices, determine the appropriate device configurations, and assign the workload to the configured storage. These tasks are challenging, because the large number of design choices may interact with each other in poorly understood ways.
The explosion of data being used by businesses is making storage a strategic investment priority for companies of all sizes. As storage takes precedence, concern for business continuity and business efficiency has developed. Two new trends in storage are helping to drive new investments. First, companies are searching for more ways to efficiently manage expanding volumes of data and make that data accessible throughout the enterprise. This is propelling the move of storage into the network. Second, the increasing complexity of managing large numbers of storage devices and vast amounts of data is driving greater business value into software and services. A Storage Area Network (SAN) is a high-speed network that allows the establishment of direct connections between storage devices and processors (servers) within the distance supported by Fibre Channel. SANs are the leading storage infrastructure for the world of e-business. SANs offer simplified storage management, scalability, flexibility, availability, and improved data access, movement, and backup.
It is common in many contemporary storage networks to require continuous access to stored information. The conventional method of taking data storage systems offline to update and back up information is not possible in continuous access storage networks. However, system reliability demands the backup of crucial data and fast access to the data copies in order to recover quickly from human errors, power failures, hardware failures and software defects. In order to recover from geospecific disasters, it is common to share data among geographically dispersed data centers.
One method to meet data backup and sharing needs uses data replication in which a second copy or “mirror” of information located at a primary site is maintained at a secondary site. This mirror is often called a “remote mirror” if the secondary site is located away from the primary site. When changes are made to the primary data, updates are also made to the secondary data so that the primary data and the secondary data remain “synchronized”, preventing data loss if the primary site goes down. For even more security, multiple copies of the data may be made at the secondary or even tertiary sites.
A virtual disk drive is a set of disk blocks presented to an operating environment as a range of consecutively numbered logical blocks with disk-like storage and I/O semantics. The virtual disk is the disk array object that most closely resembles a physical disk from the operating environment's viewpoint. In a storage network implementing virtual disks, a source virtual disk may be copied to another (destination) virtual disk at an extremely high rate. While data is being copied to the destination virtual disk, the source drive remains online and accessible, responding to all I/O requests, continually mirroring write requests to the destination virtual disk as well. When the copy operation completes, a mirrored virtual disk set exists, which includes the source and the destination virtual disk. The destination virtual disk continues to mirror the source virtual disk until the connection between the two is broken.
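The copy-then-mirror behavior described above can be illustrated with a minimal sketch. The class names `VirtualDisk` and `MirroredSet`, the block size, and the in-memory block list are all assumptions made for illustration; they do not come from any particular storage product and are not the implementation of any real array.

```python
class VirtualDisk:
    """Hypothetical virtual disk: consecutively numbered logical blocks."""

    def __init__(self, num_blocks, block_size=512):
        self.block_size = block_size
        # Every block starts out zero-filled.
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def read(self, lba):
        return self.blocks[lba]

    def write(self, lba, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.blocks[lba] = bytes(data)


class MirroredSet:
    """Hypothetical source/destination pair (illustrative only)."""

    def __init__(self, source, destination):
        if len(source.blocks) != len(destination.blocks):
            raise ValueError("source and destination must be the same size")
        self.source = source
        self.destination = destination
        self.next_to_copy = 0   # first block not yet copied
        self.connected = True   # False once the mirror is broken

    def copy_step(self, count=1):
        """Copy up to `count` blocks; return True once the copy is complete."""
        end = min(self.next_to_copy + count, len(self.source.blocks))
        for lba in range(self.next_to_copy, end):
            self.destination.write(lba, self.source.read(lba))
        self.next_to_copy = end
        return self.next_to_copy == len(self.source.blocks)

    def write(self, lba, data):
        # The source stays online during the copy; each write is also
        # mirrored to the destination while the connection exists.
        self.source.write(lba, data)
        if self.connected:
            self.destination.write(lba, data)

    def read(self, lba):
        # Reads are served by the source virtual disk.
        return self.source.read(lba)

    def break_mirror(self):
        # After this call the destination no longer tracks the source.
        self.connected = False
```

In this sketch, a caller would interleave `copy_step` calls with ordinary `write` calls; once `copy_step` returns `True` the two disks hold identical blocks, and they stay identical until `break_mirror` is called.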
In contemporary RAID storage arrays that support block-level mirroring of virtual disks, a catastrophic failure of the RAID arrays that are the destinations of mirrors will typically not affect continuous data access to the primary virtual disks, since the destinations are essentially considered backups and are never read from. Failure of the primary RAID array is a different matter and will generally require some form of intervention to allow servers to continue to access their ‘backup’ storage. Typically this is neither seamless nor inexpensive: it is very server specific and involves significant up-front costs for server failover software and redundant servers. It is also risky and often error prone, because custom approaches that attempt to cover the typical types of failures tend to fail when unexpected types of failures occur. These approaches also tend to have extremely long recovery (rebuild) times and extended periods during which systems run at much reduced redundancy levels.
The need exists to mirror virtual disks in such a way that, within a single storage system that is geographically dispersed (i.e., controllers and drive bays separated within a building or between buildings), mission-critical virtual disk access continues even through the loss of any one storage location, including the primary location. The need also exists to improve the performance of mirrored partners during failed-disk rebuilds and to reduce recovery times after the temporary loss of major portions of the physical storage (e.g., communications breaks between buildings). Fortunately, these needs can be addressed in virtualized storage arrays that allow ‘mirrored’ RAID arrays to be swapped instantly within their data structures.
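As a rough illustration of what swapping mirrored RAID arrays within a data structure could look like, the following sketch exchanges two references when the primary location becomes unavailable. It is only a conceptual example under stated assumptions: the names `RaidArray`, `SwappableMirror` and `swap`, the `online` flag, and the in-memory block lists are hypothetical, and the sketch does not address resynchronizing a location that later returns.

```python
class RaidArray:
    """Hypothetical RAID array housed at one physical location."""

    def __init__(self, name, location, num_blocks):
        self.name = name
        self.location = location
        self.online = True
        self.blocks = [b""] * num_blocks


class SwappableMirror:
    """Hypothetical virtual disk backed by a primary and a mirror RAID array."""

    def __init__(self, primary, mirror):
        self.primary = primary
        self.mirror = mirror

    def swap(self):
        # The "swap" is only an exchange of references inside the
        # virtual disk's data structure; no data is moved.
        if not self.mirror.online:
            raise IOError("no online RAID array is available")
        self.primary, self.mirror = self.mirror, self.primary

    def _ensure_primary_online(self):
        if not self.primary.online:
            self.swap()

    def write(self, lba, data):
        self._ensure_primary_online()
        # Write to every member that is currently reachable.
        for raid in (self.primary, self.mirror):
            if raid.online:
                raid.blocks[lba] = data

    def read(self, lba):
        self._ensure_primary_online()
        return self.primary.blocks[lba]
```

Because the server addresses only the virtual disk, a swap of this kind would be invisible to it: after `building_1.online = False`, the next `read` or `write` is served by the array that was previously the mirror.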
It can be seen then that there is a need for a method, apparatus and program storage device for providing geographically isolated failover using instant RAID swapping in mirrored virtual disks.