Current mass storage systems, henceforth referred to as storage systems, are typically external to host computers and other applications. These storage systems face increasing demands for ever greater storage capacity and access performance, giving rise to issues of scalability in capacity and performance, flexibility in adapting to changing application needs, growth, and ease of management of the storage devices.
A typical storage system has a centralized storage infrastructure that allows multiple servers to access a large volume of centrally managed storage information. Centrally managed storage systems enable automation of the administration of storage services allowing efficient storage management. One approach to centralization is a virtualization technique that provides the storage users with a virtual pool of storage, from which a logical volume of storage can be offered to the application or user.
A typical storage system must be able to meet the numerous and changing needs of its users, such as the ability to scale easily to meet the demand of growing application data. Additionally, the ability of the storage system to allow growth of I/O (input/output) bandwidth is also a concern. For many applications, the information stored by the storage system must be constantly available to avoid unplanned downtime.
As the typical storage system expands capacity, the capability to access the stored information must increase as well. In this manner, growing storage capacity and users increase the need for automated storage management, which typically includes provisioning storage flexibly with different service levels to applications, creating replicated or backed-up copies, recovering from any storage device failures, etc.
The typical mass storage system interacts with a variety of applications, each of which may have different characteristics and priorities in terms of access and I/O performance, in addition to availability, backup, recovery, and archiving needs. This results in management complexity, since the different performance factors of every storage-consuming application must be taken into account when analyzing and provisioning its storage.
A number of storage technologies have emerged to address the above needs of storage systems and storage management. Networked storage systems, such as storage area networks (SANs) and network attached storage (NAS), and their associated storage management software have addressed the requirements of scalability, availability, and performance, as well as managing the complexity of storage that affects the total cost of ownership (TCO) of storing information.
SANs (Storage Area Networks) are targeted at providing scalability and performance to storage infrastructures by creating a separate network to connect servers to storage devices (such as tape drives, automated tape libraries, and disk drive arrays) and transfer block-level data between servers and these devices. The primary advantage of a SAN is scalability of storage capacity and I/O without depending on the local area network (LAN), thereby improving application performance. SANs are generally based on the Fibre Channel (FC) protocol.
A NAS (Network Attached Storage) device sits on the LAN and is managed as a network device that serves files. Unlike SANs, NAS has no special networking requirements, which greatly reduces the complexity of implementing it. NAS is easy to implement but difficult to maintain when multiple devices are deployed, increasing management complexity.
Virtualization software and storage systems logically manage storage devices as if they were a single entity by abstracting the physical connections between the storage devices and the servers. When applied across multiple storage devices, virtualization not only provides simplified storage management through centralized control, but it also enables other functions to support applications such as providing transparent growth of storage independent of physical storage device limits. Other advantages include delivering fault tolerance, and creating transparent replication for data availability, and performance tuning transparent to the application or host computers. U.S. Pat. No. 4,467,421 to White, titled Virtual Storage System And Method granted on Aug. 21, 1984 and IEEE [IEEE94], the Mass Storage Systems Reference Model, Version 5, are two instances of how virtualization of storage can provide these and new storage management functions.
Virtual storage management can be implemented either in software in the host computer, in the external storage system or appliance as described in U.S. Pat. No. 4,467,421, or in an intelligent SAN switch. In terms of scalability, virtualization embedded in an external storage system or in the SAN switch offers a high degree of scalability and flexibility.
Storage Resource Management (SRM) software provides centralized monitoring, alerting, and reporting of the state of specified storage assets within the data center. By monitoring, reporting, and providing event management of storage devices, typically indicating whether predetermined thresholds are exceeded, SRM provides alerts for system administrators to analyze and take necessary actions. It also enables administrators and analysts in the data center to plan the efficient use of storage.
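By way of illustration only, the threshold-based alerting described above can be sketched as follows. The asset names, the capacity figures, and the 80% utilization threshold are hypothetical and are not taken from any particular SRM product.

```python
from dataclasses import dataclass


@dataclass
class StorageAsset:
    """Hypothetical record for one monitored storage device."""
    name: str
    capacity_gb: float
    used_gb: float


def check_thresholds(assets, max_utilization=0.8):
    """Return one alert string per asset whose utilization exceeds the threshold."""
    alerts = []
    for asset in assets:
        utilization = asset.used_gb / asset.capacity_gb
        if utilization > max_utilization:
            alerts.append(
                f"{asset.name}: utilization {utilization:.2f} "
                f"exceeds threshold {max_utilization:.2f}"
            )
    return alerts


# Assumed example data: one nearly full array and one lightly used array.
assets = [
    StorageAsset("array-a", capacity_gb=1000, used_gb=900),
    StorageAsset("array-b", capacity_gb=1000, used_gb=400),
]
alerts = check_thresholds(assets)
for alert in alerts:
    print(alert)
```

In this sketch only array-a is reported, leaving the decision on corrective action to the administrator, as the text describes.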
Storage Area Management (SAM) refers to the management of all storage devices and systems in a networked storage environment to meet the storage management needs of host applications sharing storage across the network. Desired features that need support include: automated allocation or expansion of a storage volume, the unit of storage composed of contiguous blocks, as needed by the host application; reallocation of storage in the event of discrete storage device (drive or array) failures; retuning or migration of physical storage data from one set of disks or arrays to another in the event of performance degradation or failure, without disruption to the host applications; and maximizing the efficiency of storage capacity used across all storage devices.
While the need for scaling access to storage in a networked environment has led to the growing adoption of SAN and NAS, there has been an even greater need for storage management. The motivation for storage management is the lowering of the total cost of ownership (TCO) by increasing data availability, managing allocation of storage for applications, and assuring application performance as it depends on storage. In summary, new storage systems, especially scalable storage systems or appliances, have to support: the centralization of storage resources, especially storage devices, using virtualization so that host applications are not burdened with the details of physical storage limitations; SRM functions of the networked storage devices for efficient storage planning and usage; and SAM functions in the event that the storage devices are part of a SAN infrastructure.
Two basic architectures have been traditionally used to build storage systems. The first basic architecture is a Centralized Storage Controller Model; the central storage controller is shown in FIG. 1. Typical storage controllers in this model follow the RAID controller model described first in D. Patterson, G. Gibson, and R. Katz, “A Case for Redundant Array of Inexpensive Disks (RAID),” International Conference on Management of Data (SIGMOD), June 1988, where the controller module shields the storage users or host servers from the details of the physical disk drives, and makes it appear as a single reliable high-capacity disk drive. In most RAID configurations, application data from the server is distributed across a multiplicity of drives, via disk adapters, together with XOR parity redundancy data, for reliability reasons.
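By way of illustration only, the role of XOR parity in a RAID set can be sketched as follows. The three 4-byte data blocks are hypothetical; the sketch shows only the parity arithmetic, not the layout used by any particular controller.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Assumed example: three data blocks, each stored on a different drive.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is stored on a further drive of the RAID set.
parity = xor_blocks(data)

# If the drive holding data[1] fails, its block is the XOR of the
# surviving data blocks and the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because XOR is its own inverse, any single missing block can be recovered from the remaining blocks of the stripe, which is the reliability property the paragraph above refers to.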
To improve the performance of RAID WRITE operations that require parity computations, most storage controllers use a cache memory. In addition, the cache module can improve READ performance by caching data blocks that exhibit temporal locality, i.e., data blocks that are frequently accessed. Thus, the cache can be used to store data blocks for both READ and WRITE operations.
In addition, the storage controller could use the cache memory to maintain information on how the data blocks accessed by the server are mapped to the segments on the physical drives. Typically, this requires the logical block addresses (LBAs) of storage allocated to applications in the server to be mapped to physical block addresses (PBAs) on physical disk drives. Such mapping of LBAs to PBAs, as well as information on the state of the storage blocks being used, etc., constitutes storage metadata that is also maintained and continuously updated by the controller. Storage metadata for block data can also be cached in the cache memory to accelerate the performance of READ and WRITE operations.
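By way of illustration only, one possible LBA-to-PBA mapping can be sketched as follows. The sketch assumes plain round-robin striping without parity, a fixed stripe unit size, and identical drives; the function name and parameters are hypothetical, and real controllers may use a table-driven or parity-rotating mapping instead.

```python
def lba_to_pba(lba, num_drives, blocks_per_stripe_unit):
    """Map a logical block address to (drive index, physical block address).

    Assumes round-robin striping: consecutive stripe units are placed on
    consecutive drives, wrapping around after the last drive.
    """
    stripe_unit = lba // blocks_per_stripe_unit   # which stripe unit holds the block
    offset = lba % blocks_per_stripe_unit         # position inside that stripe unit
    drive = stripe_unit % num_drives              # drive that holds the stripe unit
    row = stripe_unit // num_drives               # stripe (row) number on that drive
    return drive, row * blocks_per_stripe_unit + offset


# Assumed example geometry: 4 drives, 8 blocks per stripe unit.
for lba in (0, 8, 35):
    print(lba, lba_to_pba(lba, num_drives=4, blocks_per_stripe_unit=8))
```

A computed mapping of this kind needs little metadata, whereas an explicit table allows arbitrary placement at the cost of more metadata to maintain and cache.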
A typical centralized approach is shown in FIG. 1 where a storage processor 4 that performs all the RAID function is shown attached to a shared data bus 8. Servers 1 attach to the storage processor 4 through host adapter interfaces 3, also referred to as host bus adapters, that execute storage connectivity protocols such as SCSI or Fibre Channel. A cache memory 7 used to cache both block data and storage metadata is also attached to bus 8.
We describe how a simple storage operation is executed on this architecture. Storage READ requests, for example, arrive at the storage processor 4, which references metadata information typically held on the cache module. From the metadata, the storage processor 4 can determine both the physical location of the requested data blocks and whether each block is currently cached in the cache memory 7. If cached, the data blocks can then be read from the cache memory 7 and directed by the storage processor 4 to the appropriate host adapter 3 for the requesting server 1. If the data is not cached, the data block is read from the array of physical drives 6, i.e., the stripes of data are read from the drives in the RAID set for the data block. FIG. 1 does not show specialized hardware such as XOR devices that can be used during a WRITE to compute the parity stored in the RAID set. Access to the physical drive 6 is through the disk adapter 5, which communicates through the native disk interface, SCSI, IDE or Fibre Channel, for example.
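By way of illustration only, the READ path just described can be sketched as follows. The class and attribute names are hypothetical, dictionaries stand in for the metadata, the cache memory 7, and the physical drives 6, and the sketch assumes that a block read from the drives is placed in the cache afterwards, which the description above does not state.

```python
class StorageProcessor:
    """Toy model of the storage processor 4 serving READ requests."""

    def __init__(self, metadata, drives):
        self.metadata = metadata   # LBA -> PBA mapping (storage metadata)
        self.drives = drives       # PBA -> data, stands in for the drives 6
        self.cache = {}            # LBA -> data, stands in for cache memory 7
        self.drive_reads = 0       # counts accesses that went to the drives

    def read(self, lba):
        # Cache hit: serve the block directly from the cache memory.
        if lba in self.cache:
            return self.cache[lba]
        # Cache miss: locate the block through the metadata and read the drive.
        pba = self.metadata[lba]
        data = self.drives[pba]
        self.drive_reads += 1
        # Assumption: keep the block cached for later READ requests.
        self.cache[lba] = data
        return data


processor = StorageProcessor(metadata={7: 1042}, drives={1042: b"payload"})
first = processor.read(7)    # miss, goes to the drive
second = processor.read(7)   # hit, served from the cache
print(first, second, processor.drive_reads)
```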
The limitation in the performance of the controller of FIG. 1 is inherent in the serial computational model. The bottleneck is determined by the lowest performing entity, which could be any one of the shared data bus 8, the cache memory 7, or the storage processor 4. While the single bus 8, single processor 4, and cache memory 7 are simpler to manage, the number of input/output (I/O) operations per second (IOPS) as well as the data rate are severely limited.
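By way of illustration only, the bottleneck argument can be expressed as a minimum over the per-component limits. The component limits below are hypothetical figures chosen for the example, not measurements of any controller.

```python
def find_bottleneck(component_limits):
    """Return (component, limit) for the lowest-performing shared component.

    Every request passes through all components in series, so the
    sustainable rate cannot exceed the smallest individual limit.
    """
    name = min(component_limits, key=component_limits.get)
    return name, component_limits[name]


# Assumed I/O operations per second each component could sustain alone.
limits = {
    "shared data bus 8": 40_000,
    "cache memory 7": 60_000,
    "storage processor 4": 25_000,
}
bottleneck = find_bottleneck(limits)
print(bottleneck)
```

With these assumed figures the storage processor caps the whole controller, and upgrading the bus or the cache alone would not raise the sustainable rate.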
The natural alternative to the single controller model is to build a distributed computing model. FIG. 2 shows one typical distributed computing model where multiple single storage controllers 10 are used to increase the number of parallel I/O storage operations. In this case, each controller has a local cache memory 7 for caching both data and metadata that could be requested by the servers 1 directly attached to the storage system through host adapters 3. Since the data block for the local attached server 1 can be located on any disk or array of disks 6 behind the switch 12 as illustrated in FIG. 2, the storage controller 10 and cache memory units 7 are interconnected by a data switch 12 that allows concurrent access to multiple disks or disk arrays 6. The architecture is effectively an MIMD (multiple instruction multiple data) parallel processing scheme.
FIG. 2 illustrates distributed storage controllers 10 and cache memories 7 as known in the prior art. The advantage of this architecture is that in an N-way (N>1) controller scheme there are some obvious scalability gains over the single controller, including: N times the storage processing; N times the total cache capacity; and switched access from a storage controller to the backend disk arrays, which increases the number of possible data paths and the data rate. If an N×M switch is used, any of the N storage controllers can access the M disk arrays via M data paths.
While the system illustrated in FIG. 2 provides an increase in storage processing and data bandwidth, there is also a concomitant increase in the cost of the controller. The advantage in cost performance is in the amortized cost of the switch across N storage controllers and M disk arrays.
However, the distributed computing model suffers from the (metadata and data) cache synchronization problem. Since the logical storage blocks presented to the server 1 can be located in any of the backend disk arrays 6, updates and accesses to any of the disk arrays 6 will change metadata information, as well as the contents of the cache memories 7. The implication of cache synchronization is that cache data and metadata have to be updated frequently when there is any significant WRITE activity to the disk arrays 6. This implies that with low READ/WRITE ratios, cache synchronizations will require data and metadata to be moved across the switch 12 and across the storage controllers 10. This will result in performance penalties.
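By way of illustration only, the effect of the READ/WRITE ratio on cache synchronization traffic can be sketched as follows. The sketch assumes a simple write-invalidate scheme in which a WRITE by one controller sends one message to every other controller holding a copy of the block; the description above does not name a specific synchronization protocol, and the workloads are hypothetical.

```python
def count_sync_messages(operations, num_controllers):
    """Count invalidation messages crossing the switch for a workload.

    operations: list of (controller index, "read" or "write", block id).
    """
    caches = [set() for _ in range(num_controllers)]
    messages = 0
    for controller, op, block in operations:
        if op == "write":
            # Every other controller caching this block must drop its copy.
            for other in range(num_controllers):
                if other != controller and block in caches[other]:
                    caches[other].discard(block)
                    messages += 1
        # After a read or a write, the acting controller caches the block.
        caches[controller].add(block)
    return messages


# Two controllers alternately touching the same block.
read_heavy = [(0, "read", 1), (1, "read", 1), (0, "read", 1), (1, "read", 1)]
write_heavy = [(0, "write", 1), (1, "write", 1), (0, "write", 1), (1, "write", 1)]
print(count_sync_messages(read_heavy, 2), count_sync_messages(write_heavy, 2))
```

The read-only workload generates no synchronization traffic, while the write-only workload generates a message on nearly every operation, which is the penalty at low READ/WRITE ratios noted above.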
A key concern when considering either architecture is that it is difficult to scale performance readily without significantly increasing cost. There have been a number of variations on the distributed storage controller model that have attempted to improve performance, as discussed herein below.
U.S. Pat. No. 5,720,028 describes a two (or multiple) controller storage system that shares a common management memory for storing metadata for the storage controllers, each of which accesses this memory to monitor the operational states of the other storage controllers. In a two-controller model, the controllers can be used in a load balancing mode to improve performance, as well as in a redundant mode where the second controller is used as a backup in the case of a failure of the first. No explicit storage management functions are provided, and no efforts are made to optimize the cost for performance.
U.S. Pat. No. 5,819,054 describes a complex architecture using a bank of cache modules and cache switches interposed between host adapters and disk adapters. Multiple busses interconnect the adapters to the cache units and the cache switches. While this provides significant increase in the number of data paths and bandwidth as well as redundancy, there is a large number of interconnected components and associated cost.
A more recent alternative approach to scaling performance is the use of a central shared cache in a distributed storage controller model. U.S. Pat. No. 6,381,674 specifies a method that allows multiple storage controllers to share access to multiple storage devices (disks/arrays) by using a centralized intelligent cache. The intelligent central cache provides substantial processing for storage management functions, such as RAID management functions, parity generation and checking, as well as RAID geometry (striping) management. The storage controllers, i.e., the RAID controllers, transmit cache requests to the central cache controller. The central cache controller performs all operations related to storing supplied data in the cache memory as well as posting such cached data to the disk array as needed. The storage controllers are significantly simplified because they do not need to duplicate local cache memory on each storage controller. Furthermore, the penalty of inter-storage controller communication for synchronizing local caches is avoided. U.S. Pat. No. 6,381,674 thus provides a variation on the pure distributed computing architecture: to enable efficient storage processing, it modifies the system of FIG. 2 by aggregating all local caches into a central cache with RAID processing.
However, none of the previous approaches describes how storage management functions beyond RAID, such as the SRM and SAM functions described earlier, are supported. We now examine how these features have been supported in other existing approaches.
There are several considerations for supporting storage management in a storage system, including: virtualization, creating local or remote data copies for increased data availability, implementing methods for storage placement for performance optimization, and retuning storage allocation for performance optimization. All of these aim to meet the requirements of scalability, increased I/O capabilities, downtime avoidance, automation of management, and managing (optimizing) performance.
U.S. Pat. No. 4,467,421 defines the first disclosed storage virtualization system that used intermediate storage processing to decompose user data into blocks of a size suitable for the target storage device and to allocate the blocks to physical locations, across multiple heterogeneous storage arrays or devices, without involvement of any host or user. The disclosed virtualization system also relies on cached data to improve performance in the access of data that is frequently accessed. U.S. Pat. No. 6,216,202 describes how multiple logical volumes presented to a host can be mapped to a storage system comprising multiple storage devices, where the logical volumes span the storage devices. It also specifies how the hosts can access a logical volume via multiple paths for purposes of best performance and fail-over in case of path failures. No specific architecture is described to provide the most cost-effective means to support virtualization.
Data availability schemes are generally specified in terms of protocols between multiple storage systems. U.S. Pat. No. 5,544,347 specifies how a primary storage controller coordinates the copying of the primary data to the secondary storage controller, including the acknowledgement of the data copied and the level of synchronization reached. U.S. Pat. No. 6,397,229 discloses how a storage controller manages outboard incremental backup/restore of data, using an indicator that specifies whether a physical sector of the data storage device has been backed up.
Performance optimization is performed by careful selection of the physical storage location at allocation time, or by reallocating physical storage, for the same logical storage, to optimize performance. A number of different yet common-sense approaches have been used. U.S. Pat. No. 6,314,503 specifies a method for managing the placement of data in a storage system to achieve increased system performance by detecting performance problems and correcting them with such techniques as load balancing the hot spots of logical volumes by reallocating them to less-used physical volumes. U.S. Pat. No. 5,956,750 specifies a particular method for reallocating logical to physical disk devices using a storage controller, based on access frequency and sequential access ratio calculations. U.S. Pat. No. 6,209,059 specifies a particular method for the on-line reconfiguration of the logical volumes of a storage system by rearranging the request queue and redefining device queues within the request queue of the storage controller to accommodate new devices or to recapture queue storage associated with deleted devices. U.S. Pat. No. 5,623,598 specifies a method of explicitly sampling a performance metric during operation of the storage system, using the performance history to determine how the storage system is performing, and exploring different factors (such as queue size, memory, upgrading a disk, or rebalancing disks) that can improve performance.
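By way of illustration only, the general idea of relieving a hot spot by reallocating a logical volume to a less-used physical volume can be sketched as follows. This is a generic greedy heuristic with hypothetical names and access counts; it is not the method claimed in any of the patents cited above.

```python
def rebalance_once(placement, access_counts, physical_volumes):
    """Move the hottest logical volume off the busiest physical volume.

    placement: logical volume -> physical volume.
    access_counts: logical volume -> observed accesses.
    Returns a new placement; the input is left unchanged.
    """
    load = {pv: 0 for pv in physical_volumes}
    for lv, pv in placement.items():
        load[pv] += access_counts[lv]

    busiest = max(load, key=load.get)
    idlest = min(load, key=load.get)
    candidates = [lv for lv, pv in placement.items() if pv == busiest]
    if not candidates:
        return dict(placement)

    hottest = max(candidates, key=access_counts.get)
    new_placement = dict(placement)
    # Move only if the target stays below the current peak load.
    if load[idlest] + access_counts[hottest] < load[busiest]:
        new_placement[hottest] = idlest
    return new_placement


# Assumed example: two logical volumes crowd pv1 while pv2 is nearly idle.
placement = {"lv1": "pv1", "lv2": "pv1", "lv3": "pv2"}
access_counts = {"lv1": 500, "lv2": 300, "lv3": 100}
new_placement = rebalance_once(placement, access_counts, ["pv1", "pv2"])
print(new_placement)
```

In this example the peak load per physical volume drops from 800 to 600 accesses after a single move; a practical scheme would also weigh the cost of migrating the data.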
While a number of different algorithmic and procedural techniques have been suggested, there has been no effort to create a cost-effective design of the architecture that would best support the management features discussed above.