Along with the advance made in network technology in recent years, a new practice has become widespread. The practice is that storages are separated from computers and attached to networks, and access thereto is made through the networks. Such a storage attached to a network is referred to as network storage.
Typical examples of network storage are the SAN storage and NAS (Network Attached Storage). The SAN storage uses SAN (Storage Area Network). The SAN storage has high reliability and performance. However, the cost of SAN storage is extremely high and is mainly used by enterprising businesses. The NAS on the other hand uses the IP network which is most widespread. The performance of NAS is lower than SAN storage but the cost is lower than SAN and it is also easier to use.
In recent years, instead of expensive large-scale storages, an inexpensive, small-scale network storage has been introduce and thereafter added another one as required. However, this method involves a problem. If a new network storage is added to an existing system, data must be moved from the existing network storage to the new network storage. Also, clients (including computers) and the network storages must be reconnected with each other. As the result, the management cost of the system increases.
One of methods for reducing the management cost is visualization of network storages. This is a technique for virtualizing multiple network storages as a single storage unit for clients.
A number of methods have been developed for virtualizing multiple network storages. A method is disclosed at the website <www.maxtor.com/products/maxattach/products/applicationSpotlights/OTG_solutionsSpotlight.htm> (Document 1), for example. The method is that a control server called primary storage that also functions as a network storage unit manages the file location information in a centralized manner. In the method, a network storage in which a file is to be stored is determined by time when the file is accessed last. Newly created files are stored in primary storage, and files not accessed for a certain period of time are then moved to secondary storage. The primary storage receives a file access request from a client. If the file currently does not exist in the primary storage, the file of the secondary storage is accessed. Thus, the network storages look to the client as if they were a single unit.
Another method for virtualization is described in DiFFS: a Scalable Distributed File System, Christos Karamanolis et. al., HP Laboratories Palo Alto, HPL-2001-19, Jan. 24, 2001 (Document 2). The method is that files and directories are managed by logical volume basis. The logical volume identifiers are recorded in directory entries for managing directories and files. The directory entries are distributed and placed in individual logical volumes. Each network storage has a table that correlates between logical volume identifiers and the network storage identifiers for the storage locations thereof. The network storage specifies the network storage identifier which stores a file by the table and a directory entry concerned. When a new network storage is added to the system, the logical volume concerned is moved from the existing network storage to the new network storage. At this time, the mirroring function of LVM (Logical Volume Manager) which is a virtualizing technique is used.
A further method for virtualization is disclosed in U.S. Pat. No. 6,029,168 (Document 3). The method is that one file is partly distributed and placed in multiple network storages. The method involves file management information on the range and order of distribution in network storages in which files are located. If a new network storage is added, the file management information is updated. New files created after the update are placed in a new range of distribution. In this method, however, the file management information on existing files is not updated, and any existing file or any part thereof is not moved to the new network storage.
Japanese Patent Laid-Open No. H6(1994)-59982 (Document 4) discloses a control method for virtual storage in a computer. The method is that it is judged based on the free disk space in a high-speed external storage unit whether data should be moved to a low-speed external storage unit. The method involves a high-speed external storage unit faster than magnetic disks and a low-speed external magnetic disk storage unit which is slower but has a large capacity. If the free disk space in the high-speed external storage unit is reduced to a threshold or below, data is moved to the low-speed external storage unit. When the free disk space in the high-speed external storage unit exceeds the threshold, the data is returned from the low-speed external storage unit to the high-speed external storage unit. Thus, the two external storage unit look to the computer as if they were a single virtual storage.
In the method described in Document 1, a storage which stores a file is determined by time when the file is accessed last. Therefore, the occupied disk space of the primary storage and that of the secondary storage becomes steadily imbalance storage. In the method disclosed in U.S. Pat. No. 6,029,168, files are uniformly distributed to multiple network storages. Consequently, the occupied disk space is balanced between network storages which are added to the system around the same time. However, there is a steady imbalance in the occupied disk space between network storages which are added to the system at different times. This is because files are not moved between them. If a network storage is filled with capacity due to such a steady imbalance, files cannot be written even if there are some free disk spaces in the other network storages.
This problem can be solved by adding a function of leveling the disk usage rates of the individual network storages to the method described in non-patent Document 2. However, the free disk spaces in the individual network storages are uneven in a system wherein the disk spaces in the individual network storages are uneven even if the disk usage rates are equal. If a large file is written to a network storage lowest in free disk space here, the storage is filled with capacity, and files cannot be written.
In the method described in Document 2, access requests from clients are buffered while a logical volume is being moved. Therefore, if access requests from clients frequently occur during the logical volume migration, the buffer can become full. In case of the buffer full, the system cannot process access requests anymore, and then access seems to be stopped from clients.
The method disclosed in Document 3 provides a hint to solving the problem which is caused by writing large files in that attention is paid to the free disk space in the external storage unit. However, the method is predicated on a system comprising only two storages, high-speed external storage and low-speed external storage. The method as it is cannot be applied to a storage system composed of multiple network storages.
A first object of the present invention is to provide a method for rebalancing the free disk spaces in a network storage system virtualized into a single file system view with the disk spaces in the network storages thereof uneven, wherein a steady imbalance of the free disk spaces among the network storages is prevented so that clients can always use the system and even if client writes large files, a maximum quantity of data can be written to disks managed by the virtualized network storage system.
A second object of the present invention is to provide a method for rebalancing the free disk spaces in a network storage virtualized into a single file system view involving file migration between network storages thereof, wherein access requests from clients are not stopped while a file is being moved between network storages.