1. Field of the Invention
The present invention relates, in general, to mass data storage, and, more particularly, to software, systems and methods for accessing and managing virtualized data storage.
2. Relevant Background
Recent years have seen a proliferation of computers and storage subsystems. Demand for storage capacity grows by over seventy-five percent each year. Early computer systems relied heavily on direct-attached storage (DAS) consisting of one or more disk drives coupled to a system bus. More recently, network-attached storage (NAS) and storage area network (SAN) technology are used to provide storage with greater capacity, higher reliability, and higher availability. The present invention is directed primarily at NAS and SAN systems, collectively referred to as network storage systems, that are designed to provide shared data storage that is beyond the ability of a single host computer to efficiently manage.
To this end, mass data storage systems are implemented in networks or fabrics that provide means for communicating data with the storage systems. Host computers or servers are coupled to the network and configured with several disk drives that cumulatively provide more storage capacity or different storage functions (e.g., data protection) than could be implemented by a DAS system. For example, a server dedicated to data storage can provide various degrees of redundancy and mirroring to improve access performance, availability and reliability of stored data. A large storage system can be formed by collecting storage subsystems, where each sub-system is managed by a separate server.
However, because the physical storage disks are managed by the particular servers to which they are directly attached, many of the limitations of DAS remain present in conventional network storage systems. Specifically, a server has limits on how many drives it can manage as well as limits on the rate at which data can be read from and written to the physical disks that it manages. Accordingly, server-managed network storage provides distinct advantages over DAS, but continues to limit the flexibility of, and impose high management costs on, mass storage implementations.
Some solutions provide a centralized control system that implements management services in a dedicated server. The management services can then span the various sub-systems in the network storage system. However, centralized control creates bottlenecks and vulnerability to failure of the control mechanism. Accordingly, a need exists for a storage system management mechanism that enables storage management from arbitrary points within the storage system. Further, a need exists for storage management systems that implement management processes in a manner that is distributed, yet capable of managing across multiple sub-systems in a network storage system.
A significant difficulty in providing storage is not in providing the quantity of storage, but in providing that storage capacity in a manner that enables ready, reliable access with simple interfaces. Large capacity, high availability, and high reliability storage architectures typically involve complex topologies of physical storage devices and controllers. By “large capacity” it is meant storage systems having greater capacity than a single mass storage device. High reliability and high availability storage systems refer to systems that spread data across multiple physical storage systems to ameliorate risk of data loss in the event of one or more physical storage failures. Both large capacity and high availability/high reliability systems are implemented, for example, by RAID (redundant array of independent disks) systems.
Storage management tasks, which typically fall on an information technology (IT) staff, often extend across multiple systems, multiple rooms within a site, and multiple sites. This physical distribution and interconnection of servers and storage subsystems is complex and expensive to deploy, maintain and manage. Essential tasks such as adding capacity, removing capacity, and backing up and restoring data are often difficult and leave the computer system vulnerable to lengthy outages. Moreover, configuring and applying data protection, such as RAID protection, to storage is complex, and the protection cannot be readily changed once configured.
Storage virtualization generally refers to systems that provide transparent abstraction of storage at the block level. In essence, virtualization separates out logical data access from physical data access, allowing users to create virtual disks from pools of storage that are allocated to network-coupled hosts as logical storage when needed. Virtual storage eliminates the physical one-to-one relationship between servers and storage devices. The physical disk devices and distribution of storage capacity become transparent to servers and applications.
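By way of illustration only, the pooling concept described above can be sketched in a few lines of Python. The names StoragePool, VirtualDisk, add_disk and allocate, and the notion of a fixed-size "extent", are hypothetical and introduced solely for this sketch; they are not taken from any particular product or from the system described herein.

```python
from dataclasses import dataclass, field


@dataclass
class StoragePool:
    """Hypothetical pool of fixed-size extents contributed by physical disks."""
    # Each free extent is recorded as a (disk_id, extent_index) pair.
    free_extents: list = field(default_factory=list)

    def add_disk(self, disk_id: str, extent_count: int) -> None:
        """Contribute all extents of a physical disk to the pool."""
        self.free_extents.extend((disk_id, i) for i in range(extent_count))

    def allocate(self, extent_count: int) -> list:
        """Remove and return the requested number of extents from the pool."""
        if extent_count > len(self.free_extents):
            raise ValueError("pool has insufficient free capacity")
        taken = self.free_extents[:extent_count]
        del self.free_extents[:extent_count]
        return taken


@dataclass
class VirtualDisk:
    """Logical disk presented to a host; its backing extents stay hidden."""
    name: str
    extents: list

    @property
    def size_in_extents(self) -> int:
        return len(self.extents)


# Two physical disks feed one pool; a single virtual disk spans both of them.
pool = StoragePool()
pool.add_disk("disk-A", 4)
pool.add_disk("disk-B", 4)
vdisk = VirtualDisk("vdisk-1", pool.allocate(6))
```

In this sketch a host would refer only to "vdisk-1" and its size. Which physical disks supply the backing extents is a detail held by the pool, which is the sense in which the one-to-one relationship between servers and storage devices is removed.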
Virtualization can be implemented at various levels within a SAN environment. These levels can be used together or independently to maximize the benefits to users. At the server level, virtualization can be implemented through software residing on the server that causes the server to behave as if it is in communication with a device type even though it is actually communicating with a virtual disk. Server-based virtualization has limited interoperability with hardware or software components. As an example of server-based storage virtualization, Compaq offers the Compaq SANworks™ Virtual Replicator.
Compaq VersaStor™ technology is an example of fabric-level virtualization. In fabric-level virtualization, a virtualizing controller is coupled to the SAN fabric such that storage requests made by any host are handled by the controller. The controller maps requests to physical devices coupled to the fabric. Virtualization at the fabric level has the advantage of greater interoperability, but is, by itself, an incomplete solution for virtualized storage. The virtualizing controller must continue to deal with the physical storage resources at a drive level.
Copending U.S. patent application Ser. No. 10/040,194, filed on even date herewith and assigned to the assignee of the present invention, is incorporated herein by reference. That application describes a system-level storage virtualization system that implements highly fluid techniques for binding virtual address space to physical storage capacity. This level of virtualization creates new difficulties in storage management.
Interfaces to storage management systems reflect this complexity. From a user perspective, application and operating system software express requests for storage access in terms of logical block addresses (LBAs). These requests, however, are not valid for the individual physical drives that make up the network storage system. A controller receives the requests and translates them to disk commands expressed in terms of addresses that are valid on the disk drives themselves. The user interfaces of prior network storage controllers implemented this logical-to-physical mapping, and so required knowledge of the physical disks that make up the storage system. As physical disks were added, removed, failed, or otherwise became inaccessible, the logical-to-physical mapping had to be updated to reflect the changes. A need exists, therefore, for a management system for a truly virtualized mass storage system in which a user interfaces only to virtual entities and is isolated from the underlying physical storage implementation.
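By way of illustration only, the translation of a logical block address to a physical disk address performed by a controller can be sketched as follows. The extent-based map, the constant BLOCKS_PER_EXTENT, the function translate, and the disk names are assumptions made for this sketch and do not describe the mapping of any particular controller.

```python
BLOCKS_PER_EXTENT = 8  # assumed extent size in blocks (illustrative value)


def translate(extent_map, lba):
    """Translate a logical block address into (disk_id, physical_block).

    extent_map is a hypothetical list in which entry i gives the disk and
    the first physical block that back the i-th extent of the virtual disk.
    """
    if lba < 0:
        raise IndexError("LBA outside the virtual disk")
    extent_index, offset = divmod(lba, BLOCKS_PER_EXTENT)
    if extent_index >= len(extent_map):
        raise IndexError("LBA outside the virtual disk")
    disk_id, first_block = extent_map[extent_index]
    return disk_id, first_block + offset


# A virtual disk of three extents spread over two physical disks.
extent_map = [("disk-A", 0), ("disk-B", 16), ("disk-A", 8)]
before = translate(extent_map, 10)

# If disk-B is replaced, only the controller's map changes; the host keeps
# issuing the same LBA and never sees the physical relocation.
extent_map[1] = ("disk-C", 0)
after = translate(extent_map, 10)
```

Under these assumptions the same logical block address resolves to a different physical location after the map is updated. A management interface that exposes only the virtual disk leaves this bookkeeping to the controller rather than to the user.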