1. Field of the Invention
The present invention relates generally to data storage systems, and more particularly to striping of data across a group of data storage devices in the data storage system.
2. Description of the Related Art
For superior performance and load balancing, it is known to stripe data for a data set across a group of data storage devices such as disk drives. Such striping permits access to the data set at a high, sustained throughput by simultaneously accessing multiple disk drives. This is particularly desirable for streaming backup data and for isochronous access to continuous media datasets for video-on-demand applications, for example, as described in Duso et al. U.S. Pat. No. 5,892,915, issued Apr. 6, 1999.
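By way of illustration only, the striping described above can be sketched as a round-robin mapping from a logical block number to a storage device and a row on that device. The function name `locate`, the device labels, and the round-robin layout are assumptions made for this sketch and are not taken from the cited reference.

```python
def locate(block, devices):
    """Map a logical block number to (device, row) under round-robin striping.

    `devices` is the ordered list of devices in the stripe (hypothetical labels).
    """
    n = len(devices)
    return devices[block % n], block // n


if __name__ == "__main__":
    devices = ["d0", "d1", "d2", "d3"]
    # Consecutive blocks land on different devices, so a sequential read of
    # four blocks can be serviced by four drives at once.
    for block in range(8):
        print(block, locate(block, devices))
```

Under this assumed layout, consecutive blocks fall on distinct devices, which is what permits the simultaneous access noted above.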
Data storage systems are often sold with a storage capacity less than the maximum capacity that can be installed. The customer can install additional storage adapters and storage devices on an as-needed basis until the maximum capacity is reached. Once the storage is added, the customer may expand existing file systems to permit storage of new files in the file systems in the newly added storage. File system implementations including UFS (the conventional Unix File System) and VxFS (the Veritas File System) allow offline expansion. VxFS describes the expansion of a file system while it is on-line, but this on-line expansion is feasible only when a single host accesses the file system. Moreover, neither UFS nor VxFS allows the expansion of striped file systems.
When storage is added to a system having a relatively small number of data storage devices, it is often possible to increase substantially the sustained throughput for access to a striped dataset by reorganizing the striped dataset so that it becomes striped across the newly added data storage devices as well as the initially installed data storage devices. Such a reorganization has been achieved by backing any existing data onto secondary storage (such as tape), rebuilding a new stripe volume, and restoring the content from the backup. However, in situations where a secondary backup system is not readily available, storage expansion must be facilitated by reorganizing content on the fly. The expansion scheme must also ensure that the original data is not corrupted during the reorganization.
The present invention provides a method of operating a data storage system for on-line expansion of a file system so that the file system uses additional data storage added to original data storage of the data storage system. The data of the file system resides in the original data storage prior to the expansion. The method includes reorganizing at least a portion of the data of the file system by moving some of the data of the file system from the original data storage to the additional data storage so that the data of the file system is distributed over the original data storage and the additional data storage. The method further includes repetitively updating metadata of the file system during the movement of data of the file system to permit a host processor to access concurrently the data of the file system during the reorganization of the file system.
In accordance with another aspect, the invention provides a method of reorganizing striped data initially striped over original storage devices. The data is reorganized by striping over an array of storage devices including the original storage devices and additional storage. An order of striping of the data over the original data storage prior to the reorganization is not preserved during the striping of the data across the array of storage devices. The method includes sequentially moving data from storage locations in the original storage devices to temporary storage, and from the temporary storage to storage locations in the array of storage devices, until a pivot point is reached. After the pivot point is reached, data is moved sequentially from storage locations in the original storage devices to storage locations in the array of storage devices, without using temporary storage for the data that is moved.
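The two-phase movement described above can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the claimed implementation: storage is modeled as a dictionary keyed by (device, row), striping is round-robin, and the pivot is chosen conservatively as the first row at which every block originally stored in that row has already been moved. The function name `reorganize` and the pivot formula are assumptions introduced for illustration.

```python
def reorganize(old_devs, new_devs, n_blocks):
    """Restripe blocks from old_devs onto new_devs (a superset, any order)."""
    n_old, n_new = len(old_devs), len(new_devs)
    if n_new <= n_old:
        raise ValueError("the new array must contain additional devices")

    # Initial layout: block b sits at (old_devs[b % n_old], b // n_old).
    storage = {(old_devs[b % n_old], b // n_old): f"blk{b}"
               for b in range(n_blocks)}

    # Assumed pivot: smallest row r with (r + 1) * n_old <= r * n_new, so
    # every block that originally lived in row r has already been moved.
    pivot_row = -(-n_old // (n_new - n_old))  # ceiling division

    for start in range(0, n_blocks, n_new):
        row = start // n_new
        stripe = range(start, min(start + n_new, n_blocks))
        if row < pivot_row:
            # Before the pivot: read the whole stripe into temporary
            # storage, then write it out to the new locations.
            temp = [storage.pop((old_devs[b % n_old], b // n_old))
                    for b in stripe]
            for b, data in zip(stripe, temp):
                storage[(new_devs[b % n_new], row)] = data
        else:
            # After the pivot: move each block directly, no temporary copy.
            for b in stripe:
                data = storage.pop((old_devs[b % n_old], b // n_old))
                storage[(new_devs[b % n_new], row)] = data
    return storage, pivot_row


if __name__ == "__main__":
    layout, pivot = reorganize(["a", "b"], ["c", "b", "a"], 12)
    print("pivot row:", pivot)
    print(sorted(layout.items()))
```

In this model, once the write position has fallen far enough behind the read position, a direct move can no longer overwrite a block that has not yet been read, which is why temporary storage becomes unnecessary past the pivot.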
In accordance with yet another aspect, the invention provides a method of reorganizing striped data initially striped over original storage devices. The data is reorganized by striping over arrays of storage devices including the original storage devices and additional storage. The method includes receiving specifications of orders of striping of data over arrays of storage devices. Upon receiving each specification of an order of striping of data over an array of storage devices, the specification is inspected to determine whether or not an order of striping of data over the original storage devices is preserved in the specified order of striping of data over the array of storage devices. When an order of striping of data over the original storage devices is preserved in the specified order of striping of data over the array of storage devices, a linear reorganization is performed by sequentially moving blocks of data from original storage locations in the original storage devices to new storage locations in the array of storage devices. When an order of striping of data over the original storage devices is not preserved in the specified order of striping of data over the array of storage devices, a random reorganization is performed by sequentially moving stripes of the blocks of data from original storage locations to a set of temporary storage locations, and from the set of temporary storage locations to storage locations in the array of storage devices.
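The inspection step described above can be sketched as follows. The sketch assumes that a striping order is specified as an ordered list of device labels, and that the original order counts as "preserved" when the original devices appear in the same relative order within the new list. That interpretation, and the name `choose_reorganization`, are assumptions made for illustration.

```python
def choose_reorganization(old_order, new_order):
    """Return "linear" if old_order is preserved in new_order, else "random"."""
    if not set(old_order) <= set(new_order):
        raise ValueError("the new array must include every original device")
    # Subsequence test: consume new_order left to right, looking for each
    # original device in turn. It succeeds only if relative order is kept.
    remaining = iter(new_order)
    preserved = all(dev in remaining for dev in old_order)
    return "linear" if preserved else "random"


if __name__ == "__main__":
    print(choose_reorganization(["d0", "d1"], ["d0", "d2", "d1"]))
    print(choose_reorganization(["d0", "d1"], ["d1", "d0", "d2"]))
```

Under these assumptions the first call selects the linear reorganization and the second selects the random reorganization, which stages each stripe through temporary storage.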
In accordance with still another aspect, the invention provides a method of permitting a processor to access striped data of a file while the striped data is being reorganized. The striped data of the file is initially striped over original storage devices. The data is reorganized by striping over an array of storage devices including the original storage devices and additional storage. The method includes reorganizing the striped data of the file by sequentially moving blocks of data from respective storage locations in the original storage devices to respective storage locations in the array of storage devices. There is an increasing offset between the storage location from which each block is read and the storage location to which said each block is written. The method further includes repetitively updating metadata of the file at a decreasing rate as the offset increases between the storage location from which said each block is read and the storage location to which said each block is written. Therefore, the processor concurrently accesses the striped data of the file by accessing the metadata of the file to determine a current storage location of the file data to be accessed.
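The decreasing rate of metadata updates can be illustrated with a simplified model. This is a sketch under stated assumptions rather than the claimed procedure: striping is round-robin, the original devices occupy the first columns of the new array, and the metadata is assumed to record a single count of blocks already moved. In this model the metadata must be updated just before a move would overwrite the original location of a block that the metadata still reports as unmoved. The name `metadata_sync_points` is hypothetical.

```python
def metadata_sync_points(n_old, n_new, n_blocks):
    """List the block numbers before which the metadata must be updated."""
    committed = 0  # blocks below this number are recorded as moved
    points = []
    for b in range(n_blocks):
        row, col = divmod(b, n_new)
        if col >= n_old:
            continue  # written to newly added storage, nothing overwritten
        victim = row * n_old + col  # block whose original slot is overwritten
        if victim == b:
            continue  # block stays in place, nothing overwritten
        if victim >= committed:
            # The metadata still points readers at the victim's old slot,
            # so record every block moved so far before overwriting it.
            committed = b
            points.append(b)
    return points


if __name__ == "__main__":
    points = metadata_sync_points(2, 3, 20)
    print(points)
    print([later - earlier for earlier, later in zip(points, points[1:])])
```

For two original devices expanded to three, this model yields update points at blocks 3, 4, 6, 9, 13 and 19. The gaps between successive updates (1, 2, 3, 4, 6) grow as the read and write positions drift apart, so the metadata is updated less and less often as the reorganization proceeds.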
In accordance with still another aspect, the invention provides a data storage system. The data storage system includes original data storage containing a file system, and a data processor for accessing the file system in the data storage. The data processor is programmed for on-line expansion of the file system so that the file system may use additional data storage added to the original data storage. The data processor is programmed to perform the on-line expansion by moving some of the data of the file system from the original data storage to the additional data storage so that the data of the file system is distributed over the original data storage and the additional data storage, and repetitively updating metadata of the file system during the movement of data of the file system to permit a host processor to access concurrently the data of the file system during the reorganization of the file system.
In accordance with a final aspect, the invention provides a program storage device containing a program for a data processor of a data storage system. The program is executable by the data processor for on-line expansion of a file system so that the file system may use additional data storage added to original data storage. The program is executable by the data processor to perform the on-line expansion by moving some of the data of the file system from the original data storage to the additional data storage so that the data of the file system is distributed over the original data storage and the additional data storage, and repetitively updating metadata of the file system during the movement of data of the file system to permit a host processor to access concurrently the data of the file system during the reorganization of the file system.