In recent years, systems that implement eventual consistency have become known as one type of distributed processing system. The concept of eventual consistency is that, although data may be inconsistent at a given point in time, the data eventually becomes consistent.
FIG. 1 illustrates one example of a distributed processing system that implements eventual consistency. In the example of FIG. 1, the distributed processing system includes nodes A to D that serve to execute processing. Each node is provided with a distributed processing framework (hereinafter simply referred to as a “framework”) for achieving eventual consistency. In the example of FIG. 1, the nodes A to C are adapted to hold replicas of data. For example, when the node A receives a command (a set command) for setting a data value “a”, it determines a node that is to serve as a storage location of the data, by using a distributed hash table (DHT) or the like. For example, when the nodes A to C are determined as storage locations, the node A stores the data value “a” in its own database and also transfers the data value “a” to the nodes B and C. As a result, the data value “a” propagates to the nodes B and C through cooperation of the frameworks and is stored in the databases of the nodes B and C. Arrows indicated by the solid lines in FIG. 1 represent flows of data transferred in response to the set command. A node that has received a command may be referred to as a “receptor”. A node that serves to store data may be called a “container”.
After the data value “a” is stored in the databases of the nodes B and C, when the node D receives a command (a get command) for obtaining the data, the frameworks cooperate with each other to obtain the data value “a” from any of the databases of the nodes A to C in which the data is stored and to output the obtained data value “a” to a request source of the get command. FIG. 1 illustrates an example in which the data is obtained from the database of the node C. Arrows indicated by the long dashed double-short dashed line in FIG. 1 represent a flow of the data transferred in response to the get command.
For example, in a state in which the data value “a” is stored in the nodes A to C, when the node A further receives a set command (for a data value “b”), the node A first rewrites the data value “a” stored in its own database to “b”, as illustrated in FIG. 2. In addition, the data value “b” propagates to the nodes B and C through cooperation of the frameworks. In this case, for example, when the node D receives a get command and the data is obtained from the node C before the data value “b” has propagated to the node C, the node D obtains the old data value “a”, since the database of the node C has not yet been rewritten to the data value “b”. After a certain period of time passes, attempting to re-obtain the data makes it possible to obtain the updated data value “b”.
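The stale read described above can be illustrated with a minimal sketch. It is not the framework of FIG. 1 or FIG. 2; the names `Cluster`, `set`, `get`, and `propagate`, and the use of a queue to model delayed transfers, are assumptions made only for illustration.

```python
from collections import deque


class Cluster:
    """Toy model: each node has a local database; transfers are delayed."""

    def __init__(self, names):
        self.db = {name: {} for name in names}  # per-node key-value store
        self.pending = deque()                  # (target, key, value) not yet delivered

    def set(self, receptor, containers, key, value):
        # The receptor rewrites its own database at once (if it is a container)
        # and queues the value for transfer to the other containers.
        for name in containers:
            if name == receptor:
                self.db[name][key] = value
            else:
                self.pending.append((name, key, value))

    def get(self, container, key):
        # Read from one container without waiting for pending transfers.
        return self.db[container].get(key)

    def propagate(self):
        # Deliver every queued transfer (models "after a certain period of time").
        while self.pending:
            target, key, value = self.pending.popleft()
            self.db[target][key] = value


cluster = Cluster(["A", "B", "C", "D"])
containers = ["A", "B", "C"]

cluster.set("A", containers, "k", "a")
cluster.propagate()                      # "a" is now stored in A, B, and C

cluster.set("A", containers, "k", "b")   # A holds "b"; B and C still hold "a"
stale = cluster.get("C", "k")            # read before propagation
cluster.propagate()
fresh = cluster.get("C", "k")            # read after propagation

print(stale, fresh)
```

In this sketch no database is locked while the update is in flight, which is why the first read from the node C can still return the old value.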
As described above, in the distributed processing system that implements eventual consistency, there may be a case in which an updated data value cannot be obtained at a certain point in time; after a certain period of time passes, however, the updated data value can be obtained unless another data update is performed. In the distributed processing system that implements eventual consistency, since the databases are not locked during data update, for example, the scalability of the system can be enhanced.
In the distributed processing system, a scheme (e.g., the Lamport algorithm) in which a logical clock is used to represent the order relationship of processing between the nodes has been known. For example, as illustrated in FIG. 3, a transmitting node attaches, as a timestamp, its logical clock value at the time of transmission to a message and transmits the resulting message. A receiving node determines, as a new logical clock value, a value obtained by adding a predetermined number (“1” in FIG. 3) to the timestamp attached to the message or, when its own logical clock value is already larger than the timestamp, to its own logical clock value. Thus, with the logical clock, time just proceeds and is not reversed (i.e., the logical clock value just increases and does not decrease). In FIG. 3, a numeric value at the starting point of each arrow represents the logical clock value of the transmitting node and a numeric value at the end point of each arrow represents the logical clock value of the receiving node. A numeric value indicated above each arrow represents the timestamp set by the transmitting node (i.e., the logical clock value at the time of transmission).
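The update rule of the logical clock can be sketched as follows. This is a minimal illustration rather than a definitive implementation; the class name `LogicalClock` and its methods are hypothetical, and it is assumed that a local event preceding a transmission advances the clock by the same predetermined number.

```python
class LogicalClock:
    """Lamport-style logical clock for one node (illustrative sketch)."""

    def __init__(self, step=1):
        self.value = 0    # logical clock value before any event
        self.step = step  # the "predetermined number" ("1" in FIG. 3)

    def send(self):
        # Assumption: the event that triggers a transmission advances the clock.
        # The returned value is attached to the message as its timestamp.
        self.value += self.step
        return self.value

    def receive(self, timestamp):
        # Take the larger of the local value and the received timestamp,
        # then add the predetermined number, so the clock never decreases.
        self.value = max(self.value, timestamp) + self.step
        return self.value


a = LogicalClock()
b = LogicalClock()
ts = a.send()    # node A transmits with timestamp 1
b.receive(ts)    # node B moves from 0 to 2
print(ts, b.value)
```

Taking the maximum before adding the step is what keeps the clock value from decreasing when a message with an older timestamp arrives.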
For example, in FIG. 3, when an event occurs at the node A and the logical clock value of the node A is 1, a message (a timestamp indicating 1) is transmitted from the node A to the node B. The logical clock value at the node B is 0 before reception of the message. After reception of the message, however, the node B determines that the logical clock value has increased to 1 since the timestamp included in the received message is 1 and thus uses, as a new logical clock value, a value (=2) obtained by adding 1 to the timestamp.
Subsequently, when an event occurs at the node A and the logical clock value of the node A is 2, a message (a timestamp indicating 2) is transmitted from the node A to the node C. The logical clock value at the node C is 0 before reception of the message. After reception of the message, however, the node C determines that the logical clock value has increased to 2 since the timestamp included in the received message is 2 and thus uses, as a new logical clock value, a value (=3) obtained by adding 1 to the timestamp.
Subsequently, when an event occurs at the node C and the logical clock value of the node C is 4, a message (a timestamp indicating 4) is transmitted from the node C to the node D. The logical clock value at the node D is 0 before reception of the message. After reception of the message, however, the node D determines that the logical clock value has increased to 4 since the timestamp included in the received message is 4 and thus uses, as a new logical clock value, a value (=5) obtained by adding 1 to the timestamp.
When an event occurs at the node B and the logical clock value of the node B is 3, a message (a timestamp indicating 3) is transmitted from the node B to the node C. Although the timestamp included in the message received by the node C is 3, the logical clock value of the node C has already increased to 4. Thus, the node C uses, as a new logical clock value, a value (=5) obtained by adding 1 to the logical clock value of the node C.
Subsequently, when an event occurs at the node D and the logical clock value of the node D is 6, a message (a timestamp indicating 6) is transmitted from the node D to the node A. The logical clock value at the node A is 2 before reception of the message. After reception of the message, however, the node A determines that the logical clock value has increased to 6 since the timestamp included in the received message is 6 and thus uses, as a new logical clock value, a value (=7) obtained by adding 1 to the timestamp.
Subsequently, when an event occurs at the node A and the logical clock value of the node A is 8, a message (a timestamp indicating 8) is transmitted from the node A to the node C. The logical clock value at the node C is 5 before reception of the message. After reception of the message, however, the node C determines that the logical clock value has increased to 8 since the timestamp included in the received message is 8 and thus uses, as a new logical clock value, a value (=9) obtained by adding 1 to the timestamp.
When an event occurs at the node D and the logical clock value of the node D is 7, a message (a timestamp indicating 7) is transmitted from the node D to the node C. Although the timestamp included in the message received by the node C is 7, the logical clock value of the node C has already increased to 9. Thus, the node C uses, as a new logical clock value, a value (=10) obtained by adding 1 to the logical clock value of the node C.
As described above, each node performs the processing while changing the logical clock value. However, even though the processing associated with one logical clock value is completed at one node, processing associated with the same logical clock value is not necessarily completed at other nodes.
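The sequence of FIG. 3 described above can be replayed with a short sketch. The message order is taken from the description; the variable names, and the assumption that the event preceding each transmission advances the sender's clock by 1, are illustrative only.

```python
# Logical clock value of each node, all starting at 0.
clocks = {name: 0 for name in "ABCD"}

# (sender, receiver) pairs in the order described for FIG. 3.
messages = [
    ("A", "B"), ("A", "C"), ("C", "D"), ("B", "C"),
    ("D", "A"), ("A", "C"), ("D", "C"),
]

trace = []
for sender, receiver in messages:
    clocks[sender] += 1              # event at the sender (assumed step of 1)
    timestamp = clocks[sender]       # value attached to the message
    # The receiver adds 1 to the larger of its own value and the timestamp.
    clocks[receiver] = max(clocks[receiver], timestamp) + 1
    trace.append((sender, receiver, timestamp, clocks[receiver]))

for sender, receiver, timestamp, new_value in trace:
    print(f"{sender} -> {receiver}: timestamp {timestamp}, receiver clock {new_value}")
```

Under these assumptions the timestamps come out as 1, 2, 4, 3, 6, 8, and 7, and the receiving nodes take the values 2, 3, 5, 5, 7, 9, and 10, matching the values given in the description.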