In recent years, a distributed object storage system made by connecting multiple servers via a network has been realized in order to accumulate a large amount of data.
In general, such distributed object storage generates an index for an object while the object is written.
The object referred to herein includes a combination of data and meta data accompanying the data. The index is made by collecting meta data of each object and holding the collected meta data in a data structure suitable for searching.
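As a rough illustration of the object and index just described, the following Python sketch models an object as data plus meta data and an index as a lookup structure built from the collected meta data. The class names `StoredObject` and `MetaIndex`, the attribute names, and the sample meta data are hypothetical and are not taken from the described system.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object: the data plus the meta data that accompanies it."""
    data: bytes
    meta: dict


@dataclass
class MetaIndex:
    """Collected meta data held as: attribute -> value -> set of object IDs."""
    entries: dict = field(default_factory=dict)

    def add(self, object_id: str, meta: dict) -> None:
        # Register every meta data attribute of the object for later searching.
        for key, value in meta.items():
            self.entries.setdefault(key, {}).setdefault(value, set()).add(object_id)

    def search(self, key: str, value) -> set:
        # Return a copy so callers cannot modify the index by accident.
        return set(self.entries.get(key, {}).get(value, set()))


# Hypothetical sample objects.
index = MetaIndex()
obj = StoredObject(data=b"sensor readings", meta={"owner": "alice", "type": "log"})
index.add("obj-1", obj.meta)
index.add("obj-2", {"owner": "bob", "type": "log"})
```

With this structure, a search by meta data attribute does not need to scan the stored objects themselves.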
FIG. 9 is a figure schematically illustrating a configuration of a conventional distributed object storage system.
The distributed object storage system 500 includes a proxy server 501 and multiple servers 510. In the example illustrated in FIG. 9, the reference numeral 510 denotes any one of the servers, whereas the reference signs A to F are used to identify a particular server among the multiple servers.
The proxy server 501 is connected to multiple servers 510 via a network, not illustrated, and is also connected to a client computer, not illustrated. The proxy server 501 performs data access to the server 510 on behalf of the client computer. The proxy server 501 is an information processing apparatus such as a computer having a server function.
The proxy server 501 has management information which is structured by associating the storage position of a data file with information for identifying the data file. When the proxy server 501 receives a read/write request from a client to a data file, the proxy server 501 refers to the management information on the basis of the received file name, and accesses the data file of the access target.
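The lookup performed by the proxy server can be pictured with a minimal sketch such as the following. It assumes that the management information is a simple mapping from a file name to a storage position; the names `StoragePosition` and `Proxy`, the fields `server` and `path`, and the sample values are all hypothetical.

```python
from typing import NamedTuple


class StoragePosition(NamedTuple):
    server: str  # hypothetical: which server 510 holds the data file
    path: str    # hypothetical: location of the data file on that server


class Proxy:
    """Holds management information: file name -> storage position."""

    def __init__(self):
        self.management_info = {}

    def register(self, file_name, position):
        self.management_info[file_name] = position

    def resolve(self, file_name):
        # Refer to the management information on the basis of the file name.
        try:
            return self.management_info[file_name]
        except KeyError:
            raise FileNotFoundError(file_name) from None


proxy = Proxy()
proxy.register("report.dat", StoragePosition(server="A", path="/objects/0001"))
target = proxy.resolve("report.dat")
```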
The proxy server 501 causes the server 510 to write an object, and causes any one of the servers 510 to generate an index.
For example, the proxy server 501 transmits an object (meta data and data) to the server A to cause the server A to write the object, and transmits meta data of the object to the server B to cause the server B to generate an index.
Subsequently, the server A, which has written the object in accordance with the command given by the proxy server 501, transmits the object to the server C and commands the server C to write the object. As a result, redundancy of the object is realized. Likewise, the server C, which has written the object, transmits the object to the server E and commands the server E to write the object. The servers A, C, and E, which store the object, may be referred to as object processing servers.
On the other hand, the server B, which has received a generation command of an index from the proxy server 501, generates the index using the meta data received together with the generation command. Subsequently, the server B, which has generated the index, transmits the meta data to the server D and commands the server D to generate an index. As a result, redundancy of the index is realized. The server D, which has generated the index, transmits the meta data to the server F and commands the server F to generate an index. The servers B, D, and F, which generate the indexes, may be referred to as index generation servers.
FIG. 10 is a sequence diagram illustrating processing during writing of objects in a conventional distributed object storage system.
After the object is written to the server A in accordance with the command given by the proxy server 501, the object is written to the server C in accordance with the command given by the server A. Thereafter, the object is written to the server E in accordance with the command given by the server C.
On the other hand, in the server B, an index is generated in accordance with a command given by the proxy server 501, and thereafter, the server D generates an index in accordance with the command given by the server B. Thereafter, the server F generates an index in accordance with the command given by the server D.
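The two relay chains described for FIG. 10 can be sketched as follows. This is only a simplified, single-process simulation under the assumption that every server succeeds; the function names `relay` and `conventional_write` and the tuple layout of the payloads are hypothetical.

```python
OBJECT_CHAIN = ["A", "C", "E"]  # object processing servers
INDEX_CHAIN = ["B", "D", "F"]   # index generation servers


def relay(chain, payload, stores, log):
    """Each server handles the payload, then commands the next server in the chain."""
    sender = "proxy"
    for server in chain:
        stores[server].append(payload)
        log.append((sender, server))  # (who gave the command, who executed it)
        sender = server


def conventional_write(object_id, meta, data):
    stores = {name: [] for name in OBJECT_CHAIN + INDEX_CHAIN}
    log = []
    # The proxy sends the whole object (meta data and data) down the object chain.
    relay(OBJECT_CHAIN, (object_id, meta, data), stores, log)
    # The proxy sends only the meta data down the index chain.
    relay(INDEX_CHAIN, (object_id, meta), stores, log)
    return stores, log


stores, log = conventional_write("obj-1", {"owner": "alice"}, b"payload")
```

The log records the command order proxy to A, A to C, C to E for the object, and proxy to B, B to D, D to F for the index.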
However, in the conventional distributed object storage system described above, when a failure occurs in any one of these processing steps, it may not be possible to maintain consistency between “an object is accumulated” and “an index for the object is generated”.
FIG. 11 is a sequence diagram illustrating processing during writing of objects in a conventional distributed object storage system, and indicates the state when the proxy server 501 fails.
In the example as illustrated in FIG. 11, the proxy server 501 fails after the proxy server 501 gives the server A a write command of an object but before the proxy server 501 gives the server B a generation command of an index.
As a result, each of the servers A, C, E accumulates the object, but the servers B, D, F do not generate any index. In other words, an inconsistency arises between the accumulation state of the object and the index.
FIG. 12 is a sequence diagram illustrating processing during writing of objects in a conventional distributed object storage system, and illustrates an example where the server A fails.
In the example illustrated in FIG. 12, the server A fails after the proxy server 501 gives the server A a write command of an object. The proxy server 501 then gives the server B a generation command of an index.
As a result, each of the servers B, D, F generates an index, but because the server A has failed, the servers C, E do not write the object. In this case as well, an inconsistency arises between the accumulation state of the object and the index.
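The two failure cases of FIG. 11 and FIG. 12 can be reproduced with a small simulation. The sketch below is a deliberate simplification: it assumes that a failed server A neither keeps the object nor relays it, and that consistency means "either the object is stored and indexed, or neither". The function names and flags are hypothetical.

```python
def simulate_conventional_write(proxy_fails_before_index=False, server_a_fails=False):
    """Return (servers holding the object, servers holding the index)."""
    objects_on = set()
    indexes_on = set()

    # Step 1: the proxy gives A the write command; A relays to C, and C to E.
    # Assumption: if A fails, it neither keeps the object nor relays it.
    if not server_a_fails:
        objects_on.update(["A", "C", "E"])

    # Step 2: the proxy gives B the index generation command; B relays to D, and D to F.
    if not proxy_fails_before_index:
        indexes_on.update(["B", "D", "F"])

    return objects_on, indexes_on


def is_consistent(objects_on, indexes_on):
    # Consistent when the object and its index either both exist or both do not.
    return bool(objects_on) == bool(indexes_on)


normal = simulate_conventional_write()
proxy_failure = simulate_conventional_write(proxy_fails_before_index=True)  # FIG. 11
server_a_failure = simulate_conventional_write(server_a_fails=True)         # FIG. 12
```

In this model the normal run is consistent, while both failure runs leave either objects without an index or an index without objects.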
In order to prevent such inconsistency between the accumulation state of the object and the index in the distributed object storage system, each server that has written an object may transmit a generation command of an index.
FIG. 13 is a figure illustrating processing during writing of objects in an improved example of a conventional distributed object storage system.
In the example illustrated in FIG. 13, the servers A, C, and E of the conventional distributed object storage system 500 illustrated in FIG. 9 each transmit the meta data of the object to the server B after writing the object, thereby causing the server B to generate the index.
As a result, even if the proxy server 501 fails after the proxy server 501 gives the server A a write command of an object but before the proxy server 501 gives the server B a generation command of an index, the index is generated. In other words, in the distributed object storage system 500, the consistency between the accumulation state of the object and the index can be maintained.
The server A which has received the write command of the object from the proxy server 501 transmits the write command of the object to the server C, and thereafter gives the server B a generation command of an index.
As a result, when the server A fails before the server A transmits the write command of the object to the server C, the object is not written and the index is not generated in the distributed object storage system 500, and therefore, the consistency between the accumulation state of the object and the index is maintained. When the server A fails after the server A transmits the write command of the object to the server C but before the server A transmits the generation command of the index to the server B, the server C and the server E, which have written the object, give the server B the generation command of the index. As a result, the consistency between the accumulation state of the object and the index is maintained in this case as well.
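The failure cases for the server A in the improved example can be checked with a sketch along the following lines. It assumes that only the object processing servers issue index generation commands, that a failed server A no longer counts as holding the object, and that the arrival order of commands at the server B does not matter. The enum `AFailure` and the function `improved_write` are hypothetical names.

```python
from enum import Enum


class AFailure(Enum):
    NONE = "no failure"
    BEFORE_RELAY = "A fails before sending the write command to C"
    AFTER_RELAY = "A fails after sending the write command to C, before commanding B"


def improved_write(a_failure=AFailure.NONE):
    """Return (live servers holding the object, servers whose index command reached B)."""
    holders = []
    index_senders = []

    if a_failure is AFailure.BEFORE_RELAY:
        # Nothing is written downstream and nobody commands B: consistent, but empty.
        return holders, index_senders

    if a_failure is AFailure.NONE:
        # A relays the write command to C and then commands B itself.
        holders.append("A")
        index_senders.append("A")

    # C and E each write the object and then command B to generate the index.
    for server in ("C", "E"):
        holders.append(server)
        index_senders.append(server)

    return holders, index_senders


def is_consistent(holders, index_senders):
    # The index is generated if at least one command reached B.
    return bool(holders) == bool(index_senders)
```

In all three cases of this model the object is stored if and only if at least one index generation command reaches the server B.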
It should be noted that the server B which has received the generation command of the index uses the meta data received together with the generation command of the index to generate the index.
Subsequently, the server B which has generated the index transmits the meta data to the server D to command the server D to generate the index, and further, the server D which has generated the index commands the server F to generate the index.
[Patent Literature 1] Japanese Laid-open Patent Publication No. 2005-321922
[Patent Literature 2] Japanese Laid-open Patent Publication No. 2004-342042
In the conventional distributed object storage system illustrated in FIG. 13, however, the index generation servers B, D, F generate the index each time a generation command of the index is received from one of the object processing servers. In other words, the index generation servers B, D, F repeat the index generation processing for the same object, and therefore, there is a problem in that unnecessary index processing is performed, which is inefficient.
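The amount of repeated work can be estimated with a simple count. The sketch below assumes that each of the three object processing servers sends one generation command per object and that the server B relays every command it receives to the servers D and F; the function name `count_index_generations` is hypothetical.

```python
from collections import Counter


def count_index_generations(object_servers=("A", "C", "E"), index_chain=("B", "D", "F")):
    """Count how often each index generation server builds the index for one object."""
    generations = Counter()
    for _sender in object_servers:
        # Assumption: every command received by B is relayed down the whole chain.
        for index_server in index_chain:
            generations[index_server] += 1
    return generations


counts = count_index_generations()
# One generation per index server would be enough; the rest is repeated work.
redundant = sum(counts.values()) - len(counts)
```

Under these assumptions each index generation server builds the same index three times for a single object, so six of the nine generation operations are redundant.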