The present invention relates generally to storage networks, and more particularly to systems and methods for optimizing performance in networks including geographically remote components and/or limited bandwidth connection links.
An increasingly globalized economy results in pressure for organizations to share resources. And, in an era when information is one of the most valuable resources an organization possesses, sharing electronic data storage is becoming an imperative. The sharing may need to occur between multiple sites of a single organization, or between different organizations that share common objectives, or between organizations that share nothing in common except the desire to purchase reliable and inexpensive data storage from a third party.
Opportunities for organizations that can efficiently share storage resources include:
1. Reduced transactional latency: In many applications, a single data transaction can initiate a cascade of tens or even hundreds of other automated data transactions. Since transcontinental and intercontinental transport for a single transaction results in latencies of a tenth of a second or more, cumulative transport latency can easily become unacceptable. Consequently, storing data close to the businesses and customers that need it when they need it makes good sense.
2. Improved storage management: Increasingly, an important bottleneck to scaling storage networks is the lack of skilled storage management professionals. If the storage resources of the multi-site network could be accessed and managed by controller subsystems at any given site, significant savings would result.
3. Improved availability and business continuity: If a storage subsystem from any given site can compensate for failures that occur in sister subsystems at other sites, the extended network can achieve greater fault tolerance at less expense. Also, in the event of a disaster affecting any single site, it is important that the other sites be able to compensate seamlessly without any disruption to their normal operation except that they must handle a greater workload.
4. Reduced congestion and improved performance: Centralized storage can create an unnecessary bottleneck in data distribution. This is particularly true of data centers devoted to applications involving large block sequential content (e.g., video-on-demand applications).
5. Improved use of corporate resources: Centralized storage often fails to exploit the existing network and storage resources of multi-site organizations. Also, it is often valuable for the geographic distribution of storage to mirror the geographic distribution of business units within a company.
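The cumulative transport latency described in item 1 above can be illustrated with simple arithmetic. The following is a minimal sketch: the function name and the specific figures (100 cascaded transactions, 100 ms per long-haul transaction, 1 ms per local transaction) are assumptions chosen for illustration, not measurements.

```python
# Hypothetical illustration; the figures below are assumptions, not measurements.
def cumulative_latency_ms(num_transactions: int, latency_per_transaction_ms: int) -> int:
    """Total transport latency when cascaded transactions run one after another."""
    return num_transactions * latency_per_transaction_ms


# 100 cascaded transactions, each crossing a long-haul link (about 0.1 s each).
remote_total_ms = cumulative_latency_ms(100, 100)

# The same cascade served from storage close to the user (assumed 1 ms each).
local_total_ms = cumulative_latency_ms(100, 1)
```

Under these assumed figures the remote cascade accumulates ten seconds of transport latency, while the local cascade accumulates a tenth of a second.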
Unfortunately, most organizations are not able to realize these opportunities because of limitations inherent to conventional storage network architectures.
FIG. 1 shows an example of a logical layout for a conventional storage area network (SAN). In this example, application servers 10 are connected through a Fibre Channel (FC) fabric to an array of storage devices 20. In this case, FC switches 30 provide any-to-any connectivity between the servers 10 and logical storage devices 20, each of which might, for example, represent an array of disks. A Redundant Array of Independent Disks (RAID) controller 40 manages each logical storage device 20 in FIG. 1. The RAID controller function shown in FIG. 1 is meant to represent a logical controller function that may be implemented in software, hardware, or some combination of both. The RAID controller function is a special case of an Array Management Function (AMF). The array of storage devices managed by a given AMF is known as a “Redundancy Group” (RG). In general, the AMF is responsible for access and management of one or more RGs.
“Array Management Function” (AMF) generally refers to the body that provides common control and management for one or more disk or tape arrays. An AMF presents the arrays of tapes or disks it controls to the operating environment as one or more virtual disks or tapes. An AMF typically executes in a disk controller, an intelligent host bus adapter or in a host computer. When it executes in a disk controller, an AMF is often referred to as firmware. One or more AMFs can execute in each controller, adapter or host as desired for the particular application.
“Redundancy Group” (RG) generally refers to a collection of logical or physical storage entities organized by an AMF for the purpose of providing data protection. Within a given RG, a single type of data protection is used. All the user data storage capacity in a RG is protected by check data stored within the group, and no user data capacity external to a RG is protected by check data within it. RGs typically include logical entities composed of many resources such as stripes, data blocks, cached data, map tables, configuration tables, state tables, etc.
“Redundancy Group Management” generally refers to the responsibilities, processes and actions of an AMF associated with a given redundancy group.
While there are many variants on the typical SAN architecture shown in FIG. 1, one element of note here is that each RG is managed by only one AMF. This AMF is said to be the “logical owner” of the given RG.
An important consequence is that when an AMF fails, users lose access to and control of the RGs for which it had ownership. Some conventional storage network architectures address this problem by having responsibility for RGs transfer to new AMFs in the event of a failure of their logical owner. Other storage networking systems employ a “Master/Slave” architecture in which two or more AMFs may have access to a given storage array; however, changes to the storage array (e.g., writing of data, re-build of a failed disk, expansion of the array, etc.) are managed exclusively through the “Master” AMF.
When a storage network is implemented in a multi-site configuration, additional constraints imposed by the “Master/Slave” architecture for RG management become apparent. Suppose, for instance, that a given RG is composed of storage resources from two sites. Unless the Master AMF is “geographically aware”, read requests may be routed to remote storage resources even when the requisite data is available locally. The result is unnecessary penalties in terms of response time, performance, and wide area bandwidth usage. Also, suppose that users at the remote site wish to perform write operations or control and management functions on the RG that are reserved for the Master AMF. Traffic associated with these functions must be routed through the remote site, again resulting in unnecessary penalties for local users.
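The read-routing penalty described above can be made concrete with a short sketch of what “geographic awareness” might look like. This is an illustrative assumption only: the `Replica` type, the `choose_read_replica` function, and the site and device names are hypothetical and do not appear in any referenced system.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Replica:
    """Hypothetical record of one copy of the requested data."""
    site: str
    device: str


def choose_read_replica(replicas: Sequence[Replica], requester_site: str) -> Replica:
    """Prefer a copy at the requester's own site; otherwise use the first copy."""
    for replica in replicas:
        if replica.site == requester_site:
            return replica
    # No local copy exists, so a remote read is unavoidable.
    return replicas[0]


copies = [Replica("site_a", "disk0"), Replica("site_b", "disk1")]
local_choice = choose_read_replica(copies, "site_b")
```

A controller lacking this kind of site information has no basis for preferring `disk1` for a requester at `site_b`, and may therefore send the read across the wide area link.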
Typically, in multi-site storage networks using the Master/Slave architecture for RG management, the remote mirrors of a logical volume within a redundancy group are necessarily ‘read-only’ unless the primary logical volumes fail. Geographically distributed RGs are, in fact, only providing passive remote mirrors to primary data stored locally. Such systems typically do not allow the user to mount RGs that might include primary storage at multiple sites, striping across multiple sites, or even primary storage that is entirely remote from the site at which the Master AMF resides.
U.S. Pat. No. 6,148,414, which is hereby incorporated by reference in its entirety, describes a novel storage networking architecture in which multiple AMFs maintain peer-to-peer access of shared RGs.
FIG. 2 shows a sample network configuration incorporating multiple AMFs in which the teachings of U.S. Pat. No. 6,148,414 may be implemented. A plurality of network clients (not shown) is communicably coupled with a plurality of servers 110, each of which is, in turn, coupled to a plurality of AMFs (resident in the AMF Blades or “NetStorager” cards 115 as shown in FIG. 2). These AMFs (resident on blades 115) are, in turn, connected through a switch fabric 130 to a plurality of storage resources 120.
In the architecture of FIG. 2, the AMFs provide concurrent access to the redundancy groups for associated host systems. When a host (e.g., network client device or server 110) requests an AMF to perform an operation on a resource, the AMF synchronizes with the other AMFs sharing control of the redundancy group that includes the resource to be operated on, so as to obtain a lock on the resource. While performing the operation, the AMF sends replication data and state information associated with the resource to the other AMFs sharing control of the redundancy group such that if the AMF fails, any of the other AMFs are able to complete the operation and maintain data reliability and coherency.
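The sequence just described (obtain a lock, replicate data and state to the peers, and let a peer complete the operation if the originating AMF fails) can be sketched as follows. This is a simplified, single-process illustration under stated assumptions: the class names, method names, and the `fail_before_commit` flag are hypothetical, and the sketch is not the protocol of U.S. Pat. No. 6,148,414.

```python
class SharedResource:
    """Hypothetical lockable resource (e.g., a stripe) in a redundancy group."""

    def __init__(self, name):
        self.name = name
        self.lock_holder = None


class AMF:
    """Hypothetical AMF that shares control of a redundancy group with peers."""

    def __init__(self, name):
        self.name = name
        self.peers = []
        self.pending = {}  # resource name -> (originating AMF name, data)

    def write(self, resource, data, store, fail_before_commit=False):
        if resource.lock_holder is not None:
            raise RuntimeError("resource is locked by " + resource.lock_holder)
        resource.lock_holder = self.name  # obtain the lock
        for peer in self.peers:  # replicate data and state to the peers
            peer.pending[resource.name] = (self.name, data)
        if fail_before_commit:
            return False  # simulate a failure before the write is committed
        store[resource.name] = data
        self._release(resource)
        return True

    def complete_for(self, failed_amf, resource, store):
        """Finish an operation that failed_amf started but did not commit."""
        owner, data = self.pending[resource.name]
        if owner != failed_amf.name:
            raise ValueError("pending operation belongs to another AMF")
        store[resource.name] = data
        self._release(resource)

    def _release(self, resource):
        # Clear replicated state everywhere, then drop the lock.
        for amf in [self] + self.peers:
            amf.pending.pop(resource.name, None)
        resource.lock_holder = None


amf_a, amf_b = AMF("amf_a"), AMF("amf_b")
amf_a.peers, amf_b.peers = [amf_b], [amf_a]
stripe, store = SharedResource("stripe0"), {}
amf_a.write(stripe, b"payload", store, fail_before_commit=True)
amf_b.complete_for(amf_a, stripe, store)
```

In this example `amf_a` fails while holding the lock, and `amf_b` uses the replicated state to commit the write and release the lock, so the stored data remains coherent.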
Another key element of the storage network architecture described by U.S. Pat. No. 6,148,414 is that multiple AMFs not only share access to a given RG, they also share management of it as peers. So, for example, the architecture incorporates an algorithm by which multiple AMFs arbitrate for responsibility to reconstruct the redundancy group when one of its disks fails. Also, the architecture includes an algorithm by which a redundancy group can be expanded to include an additional disk, for example.
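To illustrate what peer arbitration for reconstruction responsibility might involve, the following sketch uses a deliberately simple rule: the healthy AMF with the lowest identifier takes responsibility. This rule, the function name `elect_rebuilder`, and the identifiers are assumptions made for illustration; the arbitration algorithm of U.S. Pat. No. 6,148,414 is not reproduced here.

```python
def elect_rebuilder(amf_ids, failed_ids=frozenset()):
    """Pick one healthy AMF to reconstruct the redundancy group.

    Assumed rule: every peer applies the same deterministic ordering, so all
    peers agree on the winner without a designated master.
    """
    candidates = sorted(set(amf_ids) - set(failed_ids))
    if not candidates:
        raise RuntimeError("no healthy AMF is available to rebuild")
    return candidates[0]


peers = ["amf2", "amf1", "amf3"]
rebuilder = elect_rebuilder(peers)
fallback = elect_rebuilder(peers, failed_ids={"amf1"})
```

Because each peer can evaluate the same rule independently, responsibility passes to another AMF when the first choice is unavailable.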
Such systems, however, tend to be insensitive to the geographic location of the various components of the storage network. It is therefore desirable to provide systems and methods to optimize storage network functionality for cases in which some components of the network may be separated by significant distances and/or which include communication links with relatively limited bandwidth.