This invention relates to a technique for securing the consistency of data in a data base system of a master-slave configuration in which a duplicate of the original data base held in a master DB (data base) computer (hereinafter sometimes referred to simply as “the master”) is held in a slave DB computer (hereinafter sometimes referred to simply as “the slave”).
In a large-scale system in which data is accessed and updated, the data base often forms a bottleneck against the performance. This is caused by the fact that a great number of requests generated from a multiplicity of applications are concentrated on a single DB computer to such a degree that all the requests cannot be processed, that is, the DB computer is overloaded. In such a case, the common practice is to add a DB computer to hold the same data as the original DB computer and thus to balance the load by distributing the requests from the applications.
In a system having a plurality of DB computers, a method is required in which, upon reception of an update request, the update is reflected in all the DB computers. For this purpose, an ordinary DBMS (Data Base Management System) has the function called the replication. The “replication” is the function by which an update generated in a given DBMS is reflected in other DBMSs by transmitting an update log, etc. storing the update information.
Two methods described below are conceivable to update the data of the data base system utilizing the replication. In the first method, an update request is accepted by all the DB computers, and the update log is transmitted between the DB computers to reflect the update by each other. In the second method, an update request is accepted only by one predetermined DB computer (master), and the data of the other DB computers (slaves) is updated only by the update log transmitted from the master.
The first method can balance the load of the update process, and is therefore superior in the processing performance. This method, however, may cause an inconsistency in the data base.
The situation in which this inconsistency is generated is explained with reference to FIG. 25. Assume that two DB computers 2500, 2510 initially hold the data of the same value “300” (reference numerals 2501, 2511). In the case where an update request 2502 to add “100” to the data is transmitted to the DB computer 2500, the data held by the DB computer 2500 takes the value “400” (reference numeral 2504). Assuming that the update request 2512 to add “150” to the data is transmitted to the DB computer 2510 immediately after that, the data held by the DB computer 2510 assumes the value “450” (reference numeral 2514).
After that, the update log for the data value “400” is transmitted from the DB computer 2500 to the DB computer 2510 (reference numeral 2505), while the update log for the data value “450” is transmitted from the DB computer 2510 to the DB computer 2500 (reference numeral 2515). As a result, the DB computer 2500 reflects the update log accepted and assumes the data value “450” (reference numeral 2506), while the DB computer 2510 reflects the update log accepted and assumes the data value “400” (reference numeral 2516). In this way, an inconsistency occurs in which the data held in the two DB computers 2500, 2510 have different values from each other.
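The sequence of FIG. 25 can be traced with the following minimal Python sketch. The class `DBComputer`, its method names and the dictionary layout of the update log are hypothetical and introduced only for illustration; the sketch assumes, as in the description above, that each update log carries the resulting data value rather than the added amount.

```python
class DBComputer:
    """Toy stand-in for one DB computer holding a single data value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def apply_update_request(self, delta):
        """Process a local update request and return an update log.

        Assumption: the log carries the resulting value, as in FIG. 25.
        """
        self.value += delta
        return {"source": self.name, "value": self.value}

    def reflect_update_log(self, log):
        """Reflect an update log received from the other DB computer."""
        self.value = log["value"]


def simulate_fig25():
    db_2500 = DBComputer("2500", 300)
    db_2510 = DBComputer("2510", 300)

    log_2505 = db_2500.apply_update_request(100)  # DB computer 2500 holds 400
    log_2515 = db_2510.apply_update_request(150)  # DB computer 2510 holds 450

    # The two logs cross, and each computer overwrites its own value
    # with the value received from the other.
    db_2510.reflect_update_log(log_2505)
    db_2500.reflect_update_log(log_2515)
    return db_2500.value, db_2510.value
```

Under these assumptions, `simulate_fig25()` returns `(450, 400)`: the two DB computers end with different values, which is the inconsistency described above.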
JP-A-11-7403 discloses a technique to determine which update is given priority by utilizing the header information attached to the update log, thereby securing that all the DB computers finally hold the same content of the data. According to this technique, the two DB computers both hold the same one of the data values “400” and “450”, and thus no inconsistency is generated in the aforementioned case shown in FIG. 25. From the viewpoint of the user of the data base system configured of the two DB computers 2500, 2510, however, the result of the transmission of the update requests “+100” and “+150” to the original data “300” is required to be “550”. The value “400” or “450”, whichever is the result, has lost one of the two update requests. This problem is called “the lost update”, and cannot be avoided by the method disclosed in JP-A-11-7403.
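The lost update can be illustrated with the following Python sketch. It does not reproduce the actual header format or priority rule of JP-A-11-7403, which are not given here; the `priority` field and the rule "the higher priority wins" are assumptions made purely for illustration, as are the function names.

```python
def resolve_by_priority(local_log, remote_log):
    """Select the update log to keep (hypothetical rule: higher priority wins)."""
    return max(local_log, remote_log, key=lambda log: log["priority"])


def converge(initial, delta_a, delta_b):
    """Return the final value on each DB computer and the value the user expects."""
    # Each DB computer applies its own update request to the same initial data.
    log_a = {"priority": 1, "value": initial + delta_a}
    log_b = {"priority": 2, "value": initial + delta_b}

    # Both computers apply the same rule to the same pair of logs,
    # so they finally hold the same value.
    value_a = resolve_by_priority(log_a, log_b)["value"]
    value_b = resolve_by_priority(log_b, log_a)["value"]

    # The user expects both update requests to take effect.
    expected = initial + delta_a + delta_b
    return value_a, value_b, expected
```

With the values of FIG. 25, `converge(300, 100, 150)` returns `(450, 450, 550)`: the two DB computers agree, but the update request “+100” has been lost.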
According to the second method described above, on the other hand, the update is always processed by the master, and therefore, the “lost update” problem is avoided. This method is disclosed in Cal Henderson: “Building Scalable Web Sites”, published by O'Reilly Media, Inc., May 2006, pp. 232-234. In this method, however, the slave reflects the update only by receiving the update log from the master, and therefore, it may take considerable time from the completion of the update process until the update is reflected in all the slaves. Also, since the load of the update process cannot be balanced, the update process performance is not improved even by increasing the number of the DB computers.
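The reason why the second method avoids the lost update can be seen in the following Python sketch, in which every update request is processed in sequence by the master and the slave merely reflects the resulting update log. The function name and the representation of the update log as a list of resulting values are hypothetical.

```python
def master_only_updates(initial, deltas):
    """Process all update requests on the master, then reflect the log on a slave."""
    master_value = initial
    update_log = []
    for delta in deltas:
        # Each update request is applied to the latest value held by the master,
        # so no earlier update is overwritten.
        master_value += delta
        update_log.append(master_value)

    # The slave reflects the update log in the order it was generated.
    slave_value = initial
    for logged_value in update_log:
        slave_value = logged_value
    return master_value, slave_value
```

For the update requests “+100” and “+150” to the original data “300”, `master_only_updates(300, [100, 150])` returns `(550, 550)`.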
Each of the two methods described above has both merits and demerits in respect of the data consistency and the process performance, and the appropriate one of the methods is required to be selected in accordance with the conditions to be met. In a system providing the Web service such as SNS (Social Network Service) or the e-commerce, an increased number of users generates a great amount of requests to the data base, and therefore, the requirement to construct a data base system of a plurality of DB computers is increased. In many of these systems, an update conflict, if generated, is not permitted to cause a lost update, and therefore, a method is often employed in which an update is accepted only by the master and reflected asynchronously.
In the aforementioned data base system of master-slave configuration in which the update is accepted only by the master and reflected asynchronously by transmitting the update log to the slaves, however, it takes considerable time before the update is reflected in the slave after the update request is issued. This poses the problem that despite the normal completion of the update request, the old data before the update may be returned in response to an immediately subsequent access request to the slave. In a service such as an on-line auction, for example, the inconvenience may occur in which a user who has successfully placed a bid is immediately shown the data before the bid.
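The access to the old data can be reproduced with the following Python sketch of asynchronous reflection. The classes `Master` and `Slave`, the queue of pending update logs and the function `bid_then_read` are hypothetical and serve only to illustrate the timing; the delay of the log transmission is modeled simply by reading from the slave before the log is reflected.

```python
from collections import deque


class Master:
    """Accepts update requests and queues update logs for later transmission."""

    def __init__(self, value):
        self.value = value
        self.pending_logs = deque()

    def update(self, delta):
        self.value += delta
        self.pending_logs.append(self.value)
        # The update request completes normally before any slave is updated.
        return "OK"


class Slave:
    """Serves access requests and reflects update logs received from the master."""

    def __init__(self, value):
        self.value = value

    def read(self):
        return self.value

    def reflect(self, master):
        # Reflect all update logs that have not yet been applied.
        while master.pending_logs:
            self.value = master.pending_logs.popleft()


def bid_then_read():
    master = Master(300)
    slave = Slave(300)

    status = master.update(100)  # the bid is accepted by the master
    stale = slave.read()         # access request arrives before the log
    slave.reflect(master)        # the update log finally reaches the slave
    fresh = slave.read()
    return status, stale, fresh
```

Here `bid_then_read()` returns `("OK", 300, 400)`: the update request has completed normally, yet the immediately subsequent access to the slave still obtains the value “300” before the update.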
Also, transmitting the access request to the master makes it possible to access the latest update result, but poses the problem that a part of the access requests, in addition to the update requests, are centrally processed by the master, and therefore, the effect of the load balance which otherwise might be achieved by distributing the requests is reduced.