The present invention relates generally to storage systems and, more particularly, to a network file sharing system with storage thin provisioning management.
Block storage systems that use thin provisioning can reduce storage cost. Thin provisioning provides virtual volumes whose capacity is virtualized. Initially, a virtual volume has no real storage area allocated to it. When a data write reaches a region of the virtual volume, the storage system allocates a region of real storage area to hold the data. The unit in which real storage area is allocated is called a page. Thin provisioning does not allocate a page to a region of the virtual volume until that region is used. A user can therefore define a volume that is large enough for future use without owning the full amount of real storage. A file storage system provides file sharing and file storage services over a network. It manages files, each of which has a name, data, and properties.
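The on-demand page allocation described above can be illustrated with a minimal sketch. The class names `Pool` and `VirtualVolume`, the tiny `PAGE_SIZE`, and the first-free allocation policy are assumptions made for illustration only and do not describe any particular storage product.

```python
PAGE_SIZE = 4  # bytes per page; deliberately tiny for illustration (assumption)


class Pool:
    """Hypothetical shared pool of real pages."""

    def __init__(self, total_pages):
        self.free = list(range(total_pages))  # ids of unallocated real pages
        self.data = {}                        # real page id -> page contents

    def allocate(self):
        if not self.free:
            raise RuntimeError("storage pool exhausted")
        page_id = self.free.pop(0)
        self.data[page_id] = bytearray(PAGE_SIZE)
        return page_id


class VirtualVolume:
    """Hypothetical virtual volume: real pages are allocated on first write."""

    def __init__(self, pool, virtual_pages):
        self.pool = pool
        self.virtual_pages = virtual_pages  # advertised capacity, in pages
        self.page_map = {}                  # virtual page index -> real page id

    def write(self, offset, payload):
        for i, byte in enumerate(payload):
            index, within = divmod(offset + i, PAGE_SIZE)
            if index >= self.virtual_pages:
                raise ValueError("write beyond virtual capacity")
            if index not in self.page_map:
                # First write to this region: allocate a real page now.
                self.page_map[index] = self.pool.allocate()
            self.pool.data[self.page_map[index]][within] = byte

    def read(self, offset, length):
        out = bytearray()
        for pos in range(offset, offset + length):
            index, within = divmod(pos, PAGE_SIZE)
            page_id = self.page_map.get(index)
            # Regions that were never written have no page and read as zeros.
            out.append(self.pool.data[page_id][within] if page_id is not None else 0)
        return bytes(out)

    def allocated_pages(self):
        return len(self.page_map)
```

In this sketch a volume may advertise far more pages than the pool holds, for example `VirtualVolume(Pool(8), 1000)`, and real pages are consumed only as regions are written.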
A file storage system can use thin provisioning as its storage; the file storage system defines a file system on a virtual volume. A newer file sharing protocol known as pNFS (parallel NFS) is standardized by RFC 5661, and its block layout by RFC 5663. Under pNFS, a file storage system is divided into two portions: a metadata management portion and a storage portion. The metadata management portion, the MDS (MetaData Server), manages the file namespace and the metadata of each file, including the layout of its data in the storage. There can be multiple storages, which work in parallel to enhance data transmission performance. With real storages, it is possible to accelerate data transmission by partitioning large data into stripes and storing the stripes on multiple storages. When a client reads the data, it issues read requests to the storages simultaneously. The data transfer rate can then be a multiple of that of a single storage.
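The striping of large data across multiple storages can be sketched as follows. The functions `stripe` and `read_parallel`, the round-robin placement, and the use of in-memory lists in place of real storages are all assumptions for illustration; threads stand in for simultaneous read requests.

```python
from concurrent.futures import ThreadPoolExecutor


def stripe(data, stripe_size, num_storages):
    """Split data into stripes and place them round-robin on the storages."""
    storages = [[] for _ in range(num_storages)]
    for n, start in enumerate(range(0, len(data), stripe_size)):
        storages[n % num_storages].append(data[start:start + stripe_size])
    return storages


def read_parallel(storages):
    """Read every storage concurrently, then interleave stripes in file order."""
    with ThreadPoolExecutor(max_workers=len(storages)) as executor:
        # Each task stands in for one read request sent to one storage.
        per_storage = list(executor.map(list, storages))
    out = bytearray()
    longest = max(len(stripes) for stripes in per_storage)
    for row in range(longest):
        for stripes in per_storage:
            if row < len(stripes):
                out += stripes[row]
    return bytes(out)
```

With three storages and a stripe size of two bytes, `stripe(b"ABCDEFGHIJ", 2, 3)` places the first and fourth stripes on the first storage, the second and fifth on the second, and the third on the last; `read_parallel` then restores the original byte order.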
However, it is difficult to use a thin provisioning system that shares a pool volume among its virtual volumes as the storage for such a file storage system. A thin provisioning system aims to reduce storage cost, and its page allocation mechanism focuses on cost efficiency. It is not aware that two pages may be read simultaneously. If the pages allocated to the virtual volumes belong to one LU (logical unit), the data will be read sequentially even if the client requests that the data be read in parallel.
US2008/0126734A1 discloses a storage extent allocation method for thin provisioning storage. It shows how to allocate pages from multiple RAID groups to improve page data access performance. In parallel NFS, a client sends a file read request to the metadata server. The metadata server returns the layout of the requested file to the client. The client then issues read requests to the volumes designated by the layout. This mechanism allows the client to read or write data on multiple volumes in parallel as one file. Read and write performance for large files is accelerated by the parallel data processing.
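The layout-driven read flow described above can be sketched in simplified form. The `Extent` record, the `MetadataServer` class with its `layout_get` method, and the dictionary of in-memory volumes are hypothetical stand-ins; they loosely mirror the role of a layout request in pNFS but are not the wire protocol or any real client API.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """Hypothetical layout entry: where one piece of the file lives."""
    file_offset: int
    volume: str
    volume_offset: int
    length: int


class MetadataServer:
    """Hypothetical metadata server holding one layout per file name."""

    def __init__(self):
        self.layouts = {}

    def layout_get(self, name):
        return self.layouts[name]


def read_file(mds, volumes, name):
    """Fetch the layout, read all extents in parallel, reassemble the file."""
    layout = mds.layout_get(name)

    def read_extent(extent):
        blob = volumes[extent.volume]
        end = extent.volume_offset + extent.length
        return extent.file_offset, blob[extent.volume_offset:end]

    with ThreadPoolExecutor(max_workers=max(1, len(layout))) as executor:
        parts = list(executor.map(read_extent, layout))
    # Order the pieces by their position in the file before joining.
    return b"".join(chunk for _, chunk in sorted(parts, key=lambda p: p[0]))
```

The client in this sketch contacts the metadata server only once per file; all data then moves directly between the client and the volumes named in the layout.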
As an example of data creation in parallel computing, each node computes a part of the result data and writes it out to the corresponding part of the result file. The nodes may write at arbitrary times, so the parts of the file are not written in sequential order. The layout of the file is striped across several thin provisioning virtual volumes that share a storage pool, and the thin provisioning system allocates pages to the virtual volumes on demand. The page allocation is not aware of the file layout. Each node then reads the whole file as source data to compute the next result data, and the nodes issue read requests to the virtual volumes in parallel. The pages to be read from the requested volumes may have been allocated on the same physical storage device, in which case the storage device has to read one page at a time. The read performance is then limited to the performance of that physical storage device and is the same as for a sequential read request, even if the storage system has enough physical storage devices and physical storage device control interfaces to read the data in parallel.
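The effect of page placement on a parallel read can be illustrated with a simple timing model. The function `parallel_read_time` and its assumptions are hypothetical: every page read is taken to cost the same time, each physical device serves one page at a time, and different devices operate fully in parallel.

```python
from collections import Counter


def parallel_read_time(page_to_device, page_read_time=1.0):
    """Estimate the time to read all pages when requests are issued at once.

    page_to_device maps (virtual volume, page index) to a physical device.
    Under the model's assumptions, the busiest device determines the total.
    """
    if not page_to_device:
        return 0.0
    pages_per_device = Counter(page_to_device.values())
    return max(pages_per_device.values()) * page_read_time


# Four virtual volumes whose pages all landed on one physical device.
colocated = {("vv0", 0): "disk0", ("vv1", 0): "disk0",
             ("vv2", 0): "disk0", ("vv3", 0): "disk0"}

# The same four pages placed on four different physical devices.
spread = {("vv0", 0): "disk0", ("vv1", 0): "disk1",
          ("vv2", 0): "disk2", ("vv3", 0): "disk3"}
```

Under this model the colocated placement takes four page-read times, the same as a sequential read, while the spread placement takes one, which is the gap the passage above describes.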