The present invention relates to network storage systems and, more particularly, to systems and methods for converting existing traditional storage volumes in network storage systems into flexible storage volumes, without disrupting the network storage systems.
A network storage system typically includes one or more specialized computers (variously referred to as file servers, storage servers, storage appliances or the like, and collectively hereinafter referred to as “filers”). Each filer is connected to one or more storage devices, such as via a storage network or fabric. Exemplary storage devices include individual disk drives, groups of such disks and redundant arrays of independent (or inexpensive) disks (RAID groups). The filer is also connected via a computer network to one or more clients, such as computer workstations, application servers or other computers. Software in the filers and other software in the clients cooperate to make the storage devices, or groups thereof, appear to users of the workstations and to application programs being executed by the application servers, etc., as though the storage devices were locally connected to the clients.
Centralized data storage (such as network storage) enables data stored on the storage devices to be shared by many clients scattered remotely throughout an organization. Network storage also enables information systems (IS) departments to store data on highly reliable and sometimes redundant equipment, so the data remain available, even in the event of a catastrophic failure of one or more of the storage devices. Network storage also facilitates making frequent backup copies of the data and providing access to backed-up data, when necessary.
Filers can also perform other services that are not visible to the clients. For example, a filer can treat all the storage space in a group of storage devices as an “aggregate.” The filer can then treat a subset of the storage space in the aggregate as a “volume.” Each data block of a storage device has an associated block number (a “disk block number” (DBN)), which serves as an address of the block on the storage device. Thus, each storage device can be thought of as providing an address space of blocks extending from DBN 0 (zero) to DBN (N−1), where the storage device has N blocks. The address space of a volume consists of a concatenation of the address spaces of the portions of the aggregate that make up the volume. The blocks of this concatenated address space are consecutively numbered 0 (zero) to (M−1), where the volume has M blocks, and these numbers are referred to as “volume block numbers” (VBNs).
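The relationship between VBNs and DBNs described above can be illustrated with a minimal sketch. The sketch assumes, purely for illustration, that a volume is described by an ordered list of extents, each being a contiguous run of blocks on one storage device; the function name `vbn_to_dbn` and the extent layout are hypothetical and are not drawn from any particular filer.

```python
from bisect import bisect_right
from itertools import accumulate


def vbn_to_dbn(vbn, extents):
    """Translate a volume block number into (device, disk block number).

    extents is an ordered list of (device_name, first_dbn, block_count)
    tuples. This layout is an assumption made for illustration only.
    """
    # starts[i] is the first VBN of extent i; starts[-1] is the volume size M.
    starts = [0] + list(accumulate(count for _, _, count in extents))
    if not 0 <= vbn < starts[-1]:
        raise ValueError(f"VBN {vbn} is outside the volume (0..{starts[-1] - 1})")
    # Find the extent whose VBN range contains the requested block.
    i = bisect_right(starts, vbn) - 1
    device, first_dbn, _ = extents[i]
    return device, first_dbn + (vbn - starts[i])


# A hypothetical volume built from 100 blocks of disk0 and 200 blocks of disk1.
extents = [("disk0", 0, 100), ("disk1", 50, 200)]
print(vbn_to_dbn(99, extents))   # last block contributed by disk0
print(vbn_to_dbn(100, extents))  # first block contributed by disk1
```

In this sketch the volume has M = 300 blocks, so valid VBNs run from 0 to 299, and each VBN resolves to exactly one block on one storage device.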
Filers include software that enables clients to treat each volume as though it were a single disk. The clients issue input/output (I/O) commands to read data from or write data to the volume. The filer accepts these I/O commands; ascertains which storage device(s) are involved; calculates appropriate DBNs; issues I/O commands to the appropriate storage device(s) of the volume to fetch or store the data; and returns status information (and, in the case of a read command, data) to the clients.
Some filers also implement “flexible volumes.” In contrast with traditional volumes (described above), a flexible volume is implemented as a file (a “container file”) on a volume or in an aggregate. Flexible volumes provide several advantages over traditional volumes. For example, storage space on the set of storage devices need not be pre-allocated to the container file. Although each flexible volume is created with a specified size, this size merely indicates a potential storage capacity of the flexible volume. Actual physical blocks on the storage devices are not allocated to the flexible volume until they are needed. For example, when the filer flushes its cache of modified (“dirty”) blocks, actual disk blocks are allocated to the flexible volume, up to the specified size of the flexible volume.
Another advantage of flexible volumes lies in the fact that the size of a flexible volume can be increased. Again, storage space on the storage devices is not necessarily allocated to the flexible volume to support the increased container file size until the storage space is actually needed. Furthermore, additional storage devices can be added to the set of storage devices on which the container file is stored, thus increasing the potential maximum size to which the container file can be extended. Thus, flexible volumes provide greater flexibility and scalability than traditional volumes.
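The deferred allocation and growth behavior described in the two preceding paragraphs can be sketched as follows. This is a simplified model under stated assumptions, not a description of any actual filer: the class `FlexibleVolume`, its methods, the shared free-block list, and the choice to rewrite an already allocated block in place are all hypothetical and chosen only to keep the example short.

```python
class FlexibleVolume:
    """Toy model of a flexible volume whose blocks are allocated lazily."""

    def __init__(self, size_blocks, free_pvbns):
        self.size_blocks = size_blocks  # potential capacity, not reserved space
        self.free_pvbns = free_pvbns    # free physical blocks shared by the aggregate
        self.block_map = {}             # block number in the volume -> physical block
        self.dirty = {}                 # cached modified blocks awaiting a flush

    def write(self, block_no, data):
        # A write only dirties the cache; no physical block is consumed yet.
        if not 0 <= block_no < self.size_blocks:
            raise ValueError("block number is beyond the specified volume size")
        self.dirty[block_no] = data

    def grow(self, new_size_blocks):
        # Growing raises the potential capacity without allocating anything.
        if new_size_blocks < self.size_blocks:
            raise ValueError("this sketch only supports increasing the size")
        self.size_blocks = new_size_blocks

    def flush(self, storage):
        # Physical blocks are allocated only now, when dirty data is written out.
        for block_no, data in sorted(self.dirty.items()):
            if block_no not in self.block_map:
                if not self.free_pvbns:
                    raise RuntimeError("aggregate has no free physical blocks")
                self.block_map[block_no] = self.free_pvbns.pop(0)
            storage[self.block_map[block_no]] = data
        self.dirty.clear()


free = list(range(10, 14))           # four free physical blocks in the aggregate
storage = {}                         # stands in for the storage devices
vol = FlexibleVolume(1000, free)
vol.write(5, b"a")
vol.write(700, b"b")
print(len(vol.block_map))            # nothing allocated before the flush
vol.flush(storage)
print(vol.block_map, len(free))      # two blocks allocated, two still free
```

A volume with a specified size of 1000 blocks here consumes only two physical blocks after the flush, and calling `grow` changes the potential capacity without touching the free list.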
Filers enable clients to treat flexible volumes in the same way the clients treat traditional volumes, i.e., the clients can treat each flexible volume as though it were a single disk and issue I/O commands to the flexible volume, and each flexible volume presents an address space of numbered (addressed) blocks. Because physical storage space is not necessarily allocated to all the blocks of a flexible volume, the flexible volume is a virtual entity. That is, from the perspective of the clients, a flexible volume exists as a single disk drive, but this appearance is an illusion created by the filer. Accordingly, the block numbers of a flexible volume are referred to as “virtual volume block numbers” (VVBNs).
When a client issues an I/O command to read data from or write data to a flexible volume, the filer accepts the I/O command. The filer then issues one or more I/O commands to the appropriate block(s) of the container file to fetch or store the data. That is, the filer calculates the “physical volume block numbers” (PVBNs) on the storage devices that correspond to the VVBNs of the flexible volume that were referenced by the I/O command. The filer issues one or more I/O commands to the underlying storage devices and then returns status information (and, in the case of a read command, the requested data) to the client.
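The VVBN-to-PVBN translation on the read path can be sketched as follows. The sketch assumes that the container file's block map is available as a simple dictionary and that a VVBN with no physical block behind it reads back as zeros, as with a sparse file; both are illustrative assumptions, and the names `read_flexible`, `container_map`, `read_physical` and the 4096-byte block size are hypothetical.

```python
BLOCK_SIZE = 4096  # assumed block size, for illustration only


def read_flexible(vvbns, container_map, read_physical):
    """Read the given VVBNs of a flexible volume and return their data.

    container_map maps each allocated VVBN to its PVBN.
    read_physical is a callable that returns the data stored at a PVBN.
    """
    blocks = []
    for vvbn in vvbns:
        pvbn = container_map.get(vvbn)
        if pvbn is None:
            # No physical block has been allocated: assume a zero-filled block.
            blocks.append(bytes(BLOCK_SIZE))
        else:
            # Issue the read against the underlying storage at the PVBN.
            blocks.append(read_physical(pvbn))
    return b"".join(blocks)


# Hypothetical state: VVBN 0 is backed by PVBN 7; VVBN 1 is unallocated.
physical = {7: b"x" * BLOCK_SIZE}
container_map = {0: 7}
data = read_flexible([0, 1], container_map, physical.__getitem__)
print(len(data))  # two blocks' worth of bytes
```

A write path would follow the same lookup, with the additional step of allocating a PVBN when the map has no entry for the referenced VVBN.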
Due to the way a container file is typically distributed across its underlying storage devices, I/O performance when accessing data on a flexible volume can be better than if the data access requests were directed to a traditional volume.
The recognized advantages of flexible volumes over traditional volumes have sparked interest in converting existing traditional volumes into flexible volumes. However, existing methods of converting traditional volumes to flexible volumes have attendant problems. By way of background, a “snapshot” is a persistent, read-only, space-conservative point-in-time image of an active file system, captured at the instant the snapshot is created. A “brute force” conversion method copies all data from a single traditional volume snapshot into a flexible volume in a new aggregate having at least the same size as the traditional volume. However, this technique has several shortcomings. First, only a single snapshot of data is preserved. Second, because the new aggregate must be the same size as, or larger than, the traditional volume, this technique doubles the number of disks required during the conversion. Finally, the copying process is disruptive, because no modifications to the data can be made while the copying operation is taking place.
A second method of converting traditional volumes to flexible volumes involves copying traditional volume snapshots on a per-Qtree basis. A Qtree is an entire volume or a subset of a volume that can be treated somewhat like a volume. Once each snapshot has been copied, a snapshot is taken in each new flexible volume. The advantages of such a method include preserving all snapshots. In addition, each Qtree in the traditional volume becomes an independent flexible volume in the new aggregate. However, this method also requires twice the number of disks, and the process is disruptive, because no modifications to the data can be made while the copying operation is taking place.
A third method of converting traditional volumes to flexible volumes involves an inode-by-inode copying of a traditional volume into dual volume block number (VBN) buffer trees in the flexible volume. (U.S. Pat. No. 5,819,292 to Hitz, et al., which is incorporated in its entirety herein by reference, describes various embodiments of the operational association between inodes (index nodes), buffer trees, indirect buffers, direct buffers, data blocks, and the like.) Once the dual VBN buffer trees have been created in the flexible volume, the data in the traditional volume are converted into the flexible volume in the aggregate. The third method requires fewer additional storage disks than the first two approaches, thereby reducing cost. However, the conversion is more complex and no modifications to the data can be made during the copying operation.
Therefore, methods of converting existing traditional volumes into flexible volumes, while preserving snapshots, and without requiring a large number of additional disks, would be desirable.