1. Field of the Invention
The present invention relates to a method for forming a virtual network storage by uniting a plurality of network storages into one. More particularly, the invention relates to a method for forming a virtual network storage that transfers a request from a client to each of the plurality of network storages, combines the responses from those network storages into one response, and then sends the response to the client.
2. Description of Related Art
In conventional information processing systems, information has been stored in a storage connected directly to a computer system. The information stored in the storage can be accessed directly only from that computer system; any other computer system must access it through the directly connected computer system. In recent years, however, with the progress of network technology and the rapid increase in the amount of information to be stored, information storages have come to be separated from the computer systems that process the information. Under such circumstances, storage systems have appeared that enable information to be shared among a plurality of computer systems connected to a network. A storage connected to a network in this way is referred to as a network storage.
Examples of network storages include the storage area network (SAN) storage, which is connected through a SAN and enables block accesses; the network attached storage (NAS), which is connected through an IP network, InfiniBand, or the like and enables file accesses; and the Web storage, which enables accesses using HTTP, the interface for Web accesses, or a protocol extended from HTTP.
As network storages have come into wide use, each system manager has come to be required to manage a plurality of network storages connected to a network. For example, when the amount of data to be stored exceeds the maximum capacity of an existing network storage, the system manager must rebuild the system by adding new network storages and rearranging the data. This has resulted in a significant increase in the system management cost.
In order to suppress the system management cost, a technique for forming a virtual storage is indispensable. The technique unites a plurality of network storages so that they virtually look like one network storage to the computer systems, with the result that the addition of a new storage does not affect the system as a whole. Many methods for forming such a virtual storage system have been developed and proposed.
The Zebra System (The Zebra Striped Network File System, Hartman et al., ACM Transactions on Computer Systems, vol. 13, No. 3, 1995, pp. 274–310), disclosed by Hartman et al. in 1995, is one of those methods. According to this method, one file is distributed to and stored in a plurality of network storages. Concretely, a file is striped into units of a certain length, and the units are stored in the plurality of network storages sequentially in round-robin order. The Zebra System employs a centralized management server that controls the order in which the striped and distributed blocks of a file are stored in those network storages. A computer system, before accessing the file, queries the management server for the locations of the stored file blocks. In other words, a file is distributed to and stored in a plurality of servers, and a single server centrally manages the block locations. In this way, a plurality of network storages virtually look like one network storage.
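The round-robin placement and the central location lookup described above can be illustrated by the following minimal sketch. It is a simplification written for this description, not the actual Zebra implementation; the names ManagementServer, write_striped, read_striped and the stripe unit size STRIPE_SIZE are hypothetical, and each network storage is modeled as an in-memory dictionary.

```python
from typing import Dict, List, Tuple

STRIPE_SIZE = 4  # hypothetical stripe unit length in bytes (tiny, for illustration)

# Each network storage is modeled as a dict keyed by (file name, block number).
Storage = Dict[Tuple[str, int], bytes]


class ManagementServer:
    """Hypothetical centralized server that records where each block is stored."""

    def __init__(self) -> None:
        self.locations: Dict[str, List[Tuple[int, int]]] = {}

    def record(self, name: str, placement: List[Tuple[int, int]]) -> None:
        self.locations[name] = placement

    def lookup(self, name: str) -> List[Tuple[int, int]]:
        return self.locations[name]


def write_striped(name: str, data: bytes, storages: List[Storage],
                  manager: ManagementServer) -> None:
    """Split data into stripe units and store them in round-robin order."""
    placement: List[Tuple[int, int]] = []
    for offset in range(0, len(data), STRIPE_SIZE):
        block_no = offset // STRIPE_SIZE
        storage_idx = block_no % len(storages)  # round-robin choice of storage
        storages[storage_idx][(name, block_no)] = data[offset:offset + STRIPE_SIZE]
        placement.append((storage_idx, block_no))
    # The central server must be updated on every write.
    manager.record(name, placement)


def read_striped(name: str, storages: List[Storage],
                 manager: ManagementServer) -> bytes:
    """Query the management server first, then fetch the blocks in order."""
    return b"".join(storages[idx][(name, block_no)]
                    for idx, block_no in manager.lookup(name))
```

In this sketch every read and every write passes through the single ManagementServer object, which mirrors the dependence of the client on the centralized management server.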
Alexander H. Fray et al. have also proposed a method (U.S. Pat. No. 6,029,168, filed Jan. 23, 1998) for striping a file with use of non-centralized management servers, in contrast to the centralized management server of the Zebra System. According to this method, information identifying a starter node, in which the file striping information is stored, is embedded in the file identifier, so that the striping information, which the Zebra System manages in one centralized server, is distributed to and managed by a plurality of network storages. Because the starter node is identified by the file identifier, the starter node is accessed first to determine the file block locations before the file is accessed. The access request is then transferred, as needed, to the servers that store the target file blocks, where the file data is processed. The client only has to issue its request to the starter node to access the target file; it does not need to take into account the file block locations resulting from the file striping.
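The role of the starter node can be illustrated by the following minimal sketch. It is written for this description under simplifying assumptions and does not reproduce the method of U.S. Pat. No. 6,029,168; the names FileId, Node, and read_file, and the layout of the striping information, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FileId:
    """Hypothetical file identifier with the starter node embedded in it."""
    starter_node: int  # index of the node that holds the striping information
    local_id: int      # identifier of the file within the starter node


@dataclass
class Node:
    """Hypothetical network storage node."""
    # Striping information for files whose starter node is this node:
    # local_id -> list of node indices, one entry per block, in block order.
    striping_info: Dict[int, List[int]] = field(default_factory=dict)
    # File blocks stored on this node, keyed by (file identifier, block number).
    blocks: Dict[Tuple[FileId, int], bytes] = field(default_factory=dict)


def read_file(file_id: FileId, nodes: List[Node]) -> bytes:
    """Access the starter node first, then fetch each block from its node."""
    starter = nodes[file_id.starter_node]            # taken from the identifier
    layout = starter.striping_info[file_id.local_id]  # block locations
    return b"".join(nodes[node_idx].blocks[(file_id, block_no)]
                    for block_no, node_idx in enumerate(layout))
```

No central server is consulted in this sketch: the identifier alone tells the caller which node holds the striping information, so that information is spread over the nodes rather than held in one place.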
However, the Zebra System requires the client to query the centralized management server for the network storages that store the target file blocks before accessing the file. In particular, when a file is updated, not only the data in the network storages but also the data in the centralized management server must be updated. Consequently, as the number of network storages increases, the centralized management server becomes a bottleneck that may hinder the scalability of the system.
The method proposed by Fray et al., which uses non-centralized management servers, solves this problem of the Zebra System, since the management function is distributed among a plurality of servers. Nevertheless, the method presupposes that a server such as a distributed directory server converts the file name or directory name that uniquely identifies the target file into a file identifier in which the starter node is embedded. Accordingly, information for managing the correspondence between the file name and the server (starter node) that manages the file locations is also required. In the embodiment disclosed in U.S. Pat. No. 6,029,168, a “well-known” distributed directory file is used as the means for storing this information. Because the location of the starter node is embedded directly in the file identifier, the information denoting the correspondence between the file name stored in the distributed directory and the file identifier must be updated whenever the starter node of an existing file is moved to a new network storage as a result of the addition of that network storage.
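The update burden described above can be illustrated by the following minimal sketch. The distributed directory is modeled as a plain dictionary from file names to file identifiers, and a file identifier as a pair of starter node and local identifier; both models and the name relocate_starter are hypothetical and are not taken from U.S. Pat. No. 6,029,168.

```python
from typing import Dict, Tuple

# Hypothetical file identifier: (starter node index, local identifier).
FileId = Tuple[int, int]


def relocate_starter(directory: Dict[str, FileId], name: str,
                     new_node: int) -> FileId:
    """Move the starter node of a file and rewrite its directory entry.

    Because the starter node is part of the identifier, moving it yields a
    different identifier, and the old directory entry becomes stale.
    """
    _old_node, local_id = directory[name]
    new_id = (new_node, local_id)
    directory[name] = new_id  # the directory must be updated on every move
    return new_id
```

In this sketch, any holder of the old identifier still points at the former starter node after the move, which is why the correspondence stored in the directory has to be rewritten for each relocated file.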
In addition, each of the conventional methods uses only dedicated network storages to form a virtual network storage; none can use different types of network storages. In other words, none of the conventional methods can unite different types of network storages into one virtual network storage so as to reduce the system management cost or the operation cost of existing network storages. This has been a problem of the conventional techniques.
According to any of the conventional methods, when a management server is used, the file management information and the file data body are stored in two different servers. Access to a file is thus disabled when a failure occurs in either of the servers. This has been another problem of the conventional methods.