The ability to store, process, and transmit information is central to the day-to-day operations of most businesses. For businesses that increasingly depend upon data and information for their operations, an inability to store, process, or transmit data can hurt a business's reputation and bottom line. Businesses are therefore taking measures to improve their ability to store, process, and transmit data, and to share the resources that enable these operations more efficiently, while minimizing the associated cost.
To assist businesses in realizing these goals, the storage industry has produced a rich set of data storage options for storing online digital data, each with its own performance, availability, and cost characteristics. Businesses can keep capital costs of data storage low by choosing, for example, disk drives attached directly to server I/O buses, or they can opt for more elaborate and expensive disk arrays that provide packaging, power and cooling, and centralized management for dozens, or even hundreds, of disk drives. Further, businesses can use RAID technology or storage virtualization to improve performance and/or reliability.
Storage virtualization technology (e.g., a volume manager) partitions, concatenates, stripes, mirrors, or otherwise combines several physical disk drives, and presents a resulting logical volume as though it were a single disk drive. A resulting logical volume will often have better performance and/or reliability characteristics than the individual underlying physical disk drives. Storage virtualization technology such as RAID also enhances performance characteristics of underlying physical disk drives. Storage virtualization technology can also aggregate logical volumes to create a higher level logical volume. A “storage object” can be any device, physical or logical, used to store data, including a physical disk drive, a logical volume, or an aggregate of logical volumes.
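By way of illustration only, the following minimal sketch shows how a logical block address on a concatenated or striped logical volume might be translated to a location on an underlying physical disk drive. The function names, the block-based addressing, and the stripe unit parameter are assumptions made for this example and do not describe any particular volume manager.

```python
from typing import List, Tuple


def concat_map(logical_block: int, disk_sizes: List[int]) -> Tuple[int, int]:
    """Map a logical block to (disk index, block on that disk) for a
    concatenated volume, where disks are laid end to end."""
    if logical_block < 0:
        raise ValueError("logical block must be non-negative")
    offset = logical_block
    for disk, size in enumerate(disk_sizes):
        if offset < size:
            return disk, offset
        offset -= size  # skip past this disk and try the next one
    raise ValueError("logical block beyond end of volume")


def stripe_map(logical_block: int, num_disks: int, stripe_unit: int) -> Tuple[int, int]:
    """Map a logical block to (disk index, block on that disk) for a
    striped volume, where consecutive stripe units rotate across disks."""
    if logical_block < 0 or num_disks < 1 or stripe_unit < 1:
        raise ValueError("invalid striping parameters")
    unit_index, within_unit = divmod(logical_block, stripe_unit)
    row, disk = divmod(unit_index, num_disks)
    return disk, row * stripe_unit + within_unit
```

In this sketch, a caller addresses only the logical volume; for example, with two four-block disks, logical block 5 of the concatenated volume falls on the second disk, whereas with two disks and a stripe unit of two blocks the same logical block falls on the first disk.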
Storage virtualization technology can be implemented in several ways. For example, virtualization technology can be implemented in disk arrays, using microprocessors programmed specifically for the task. Another implementation of virtualization technology is in application servers, using volume management software. A further example of implementation of virtualization technology is in a storage network, using switch-resident virtualization software.
Classes of storage can be considered as a combination of hardware technology options (e.g., a collection of disk drives, a low end array of disks, or an enterprise array of disks), storage virtualization technology, and storage virtualization implementation techniques. Each combination has unique advantages and limitations. Storage options may be designed for fast access, high availability, rapid disaster recovery, low impact on resources during data backup, low cost or other factors.
An area of concern to businesses is efficient and cost effective management of data storage resources. Historically, storage resources such as disk drives have been physically located where computing resources have been located. When the bulk of computing resources for a business were centralized in a mainframe computing environment, the bulk of the data storage resources were centrally located in proximity to the mainframe. However, as businesses adopted a distributed computing model, data storage resources also tended to be distributed throughout the network.
Such a distribution of data storage resources creates several problems for information technology (IT) management. IT managers must maintain and support data storage resources physically located over a wide geographic area, which entails costly expenditures of personnel and time. Distribution of data resources among individual workgroups can also be inefficient and costly.
To avoid these problems, businesses have adopted storage virtualization methods that can present all network storage as a pool of storage. Such virtualization has been made possible, in part, by separating physical data storage resources from the computing resources, using network attached storage (NAS) or storage area network (SAN) mechanisms. Once physical data storage resources are physically separate from computing resources (e.g., application servers and database servers), the physical data storage resources can be centrally located as a pool in one or in a few locations that are under direct IT management. Such centralization allows for more efficient support and maintenance of the physical data storage resources.
The shift in storage paradigm from directly attached physical disks to centrally located, virtualized volumes provided over a network has also allowed for a shift in how file systems are maintained on the storage resources. In an environment in which a disk drive is directly attached to computing resources, a file system is formatted onto all or part of the physical disk drive. Typically, such a file system is bounded by the storage space limits of the physical disk drive.
As discussed above, storage virtualization tools can partition, concatenate, stripe and/or mirror several physical disk drives, and present a resulting storage volume as a virtual volume. A virtualized volume is seen by an application server or database server as a single “device.” The application server or database server does not see the individual physical disk drives that comprise the virtualized volume. An application server creates a file system on a virtualized volume in the same manner that a file system would be created on an individual physical disk drive. The file system and its data are spread across all physical disks comprising the virtualized volume in a manner consistent with the virtualization technique being employed for that volume. From the file system's point of view, a virtual volume functions as a disk drive with a couple of extra features (e.g., online capacity expansion); thus, the file system treats all storage objects equally.
Additional flexibility in allocating storage can be provided by a file system that spans multiple volumes. In a multi-volume file system, each volume mounted by the file system can have its own properties (e.g., mirrored storage or RAID), thus allowing for data to be placed in an optimal storage type dependent upon the classification of data.
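As a purely illustrative sketch of such placement, the following example selects a member volume for new data based on the classification of that data. The Volume type, the property names, and the rule table are hypothetical and are introduced here only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Volume:
    name: str
    properties: FrozenSet[str]  # hypothetical labels, e.g. {"mirrored"}


def choose_volume(data_class: str,
                  volumes: List[Volume],
                  rules: Dict[str, str]) -> Volume:
    """Return the first volume whose properties satisfy the rule for the
    given data class; data with no rule falls to the first volume."""
    if not volumes:
        raise LookupError("file system has no volumes")
    wanted = rules.get(data_class)
    if wanted is None:
        return volumes[0]
    for volume in volumes:
        if wanted in volume.properties:
            return volume
    raise LookupError(f"no volume provides {wanted!r} storage")


# Hypothetical configuration: database files go to mirrored storage,
# scratch files go to striped storage.
VOLUMES = [
    Volume("vol1", frozenset({"striped"})),
    Volume("vol2", frozenset({"mirrored"})),
]
RULES = {"database": "mirrored", "scratch": "striped"}
```

Under these assumptions, choose_volume("database", VOLUMES, RULES) selects the mirrored volume, while data having no matching rule is placed on the first volume in the list.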
Volumes mounted by a multi-volume file system must be logically associated with each other so that operations and state changes that affect the file system (e.g., failure of a volume) are dealt with atomically. In one embodiment (provided by VERITAS' Storage Foundation Volume Manager), a “volume set” logical construct associates a multi-volume file system's volumes with one another. Such a volume set construct partially subsumes the volumes' identities (e.g., a multi-volume file system is formatted on the volume set rather than on its individual member volumes), and partially leaves their identities intact (e.g., file allocation and relocation policies can direct files to individual volumes). In a multi-volume file system, files can be distributed randomly across the volumes or can have specified locations as set by rule. This intimate association of member volumes of a multi-volume file system results in a lack of flexibility when one desires to remove a volume from the multi-volume file system. Traditionally, a volume must be emptied of all data before it can be removed from a multi-volume file system.
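The traditional removal constraint can be sketched, by way of example only, as follows. The VolumeSet class and its methods are hypothetical names chosen for this illustration and are not the interface of any actual volume manager; the evacuate step is likewise an assumption about how a volume might be emptied before removal.

```python
from typing import Dict, Iterable, Set


class VolumeSet:
    """Toy model of a volume set: tracks which files reside on which
    member volume of a multi-volume file system."""

    def __init__(self, volume_names: Iterable[str]) -> None:
        self.files: Dict[str, Set[str]] = {name: set() for name in volume_names}

    def add_file(self, volume: str, path: str) -> None:
        self.files[volume].add(path)

    def evacuate(self, source: str, target: str) -> None:
        """Relocate every file from source to another member volume."""
        if source == target:
            raise ValueError("source and target must differ")
        self.files[target] |= self.files[source]
        self.files[source].clear()

    def remove_volume(self, volume: str) -> None:
        """Remove a member volume; refused unless the volume is empty."""
        if self.files[volume]:
            raise RuntimeError(f"volume {volume!r} still holds data")
        del self.files[volume]
```

In this model, a call to remove_volume on a volume that still holds files is refused, so the data must first be relocated to another member volume, which is the copying burden described above.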
On occasion, it can be desirable to extract a portion of a file system so that the portion can be used by another computer or set of computers, merged with another file system on another computer or set of computers, or mounted independently on another computer system. Cross-platform data sharing, however, works only at the level of a complete file system. When only part of a file system (e.g., a directory) needs to be exported to another machine, there is no solution for exporting that part other than copying it, or the entire file system, to the other machine. Copying a file system, or a directory structure within a file system, consumes processor resources and I/O capacity, and can also result in data unavailability while files and directories are being copied from one device to another.
It is desirable to have support for dismounting and mounting file systems or portions thereof on demand. It is also desirable to continue to have access to data being extracted from the file system with minimal unavailability of the data and with minimal impact upon processing. Further, it is desirable to be able to merge a local file system into a global file system and extract a part of a global file system as a local file system in an environment where multiple disk volumes can be shared by multiple computer resources in a wide geographic area. In a multi-device, multi-volume file system, it can be desirable to build a file system on devices including removable media, which can then be extracted from the multi-volume file system and still be usable independently of the multi-volume file system.