1. Field of the Invention
The present invention relates to a method for adjusting current clock counts for each computer site in a distributed computer system to effectively execute timestamped transactions.
2. Description of the Background
A distributed computer system comprises a number of computer sites, each of which has a processor and stores a database on a disk. The computer sites are interconnected by a network of communication lines. When a site receives a data processing request from a user, a processor in the site determines which database in the system includes the necessary data for processing by referring to a system catalog that stores data location information. If the necessary data is stored in the database of the same computer site as the processor, the processor reads the necessary data from its database, executes data processing operations on the data, and writes the result into the database. A series of these processes is referred to as a transaction. If the necessary data is stored in a database at another computer site, the processor sends a request to read the necessary data from the database to the other computer site using the communication lines of the network. A processor of the other computer site reads the data from its database and sends the data to the computer site which sent the request using the communication lines. Then the processor of the requesting computer site executes data processing operations on this data.
FIG. 1 shows an example of two transactions which access the same data entry. It is assumed that the computer site corresponds to a branch of a bank, and the database stores data entries (records) representing the balance deposit for each of the bank's customers. When money is deposited or withdrawn from an account at any branch, a transaction must be performed by the computer system. During the transaction, the balance in the customer's account is read from a database, the money that is newly deposited or withdrawn by the customer or the bank is added to or subtracted from the balance for that customer, and the arithmetic result is written back into the data entry (record) in the database that corresponds to the account for that customer.
In this example, a data entry in the database at computer site A stores a customer's balance. It is assumed that the customer's beginning balance is $50. At computer site A, an additional $100 is deposited in the customer's account by a transaction that will be designated Ta. Immediately after, at another computer site B, $30 is deposited in that same account by another transaction that will be designated Tb. Under these circumstances, a processor at computer site A will execute a READ operation for transaction Ta. The processor will add $100 to the beginning balance of $50, and then write $150 into the database as the new balance in the account during a WRITE operation for transaction Ta. However, a processor at computer site B may send a request for a READ operation for transaction Tb to computer site A, and the processor at computer site A may execute that READ operation before execution of the WRITE operation for transaction Ta. As a result, the processor at computer site B will read the customer's beginning balance of $50. Then, the processor at computer site B adds $30 to the beginning balance of $50, and sends $80 as a request for a WRITE operation to computer site A. The processor at computer site A writes $80 into the database as the customer's new balance during a WRITE operation for transaction Tb. Therefore, the data entry in the database for this customer's account will indicate an ending balance of $80. However, the correct ending balance in this customer's account is $180.
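The interleaving described above can be reproduced with a minimal sketch. The dictionary and the helper names are illustrative assumptions and are not part of any disclosed system; only the amounts ($50, $100, $30) are taken from the example.

```python
# Shared record at computer site A: the customer's balance (value from the text).
database = {"balance": 50}


def read_balance():
    return database["balance"]


def write_balance(value):
    database["balance"] = value


# Interleaving from the example: both READs are executed before either WRITE.
ta_seen = read_balance()        # Ta reads 50
tb_seen = read_balance()        # Tb reads 50, before Ta's WRITE
write_balance(ta_seen + 100)    # Ta writes 150
write_balance(tb_seen + 30)     # Tb writes 80 and overwrites Ta's update
lost_update_balance = database["balance"]

# Serial execution for comparison: Tb reads only after Ta has written.
database["balance"] = 50
write_balance(read_balance() + 100)   # Ta: 50 -> 150
write_balance(read_balance() + 30)    # Tb: 150 -> 180
serial_balance = database["balance"]

print(lost_update_balance, serial_balance)  # 80 180
```

The first result is the erroneous balance of the example, and the second is the balance obtained when the two transactions do not overlap.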
To prevent such errors, a concurrency control mechanism such as a timestamp method is used. According to this method, each computer site has its own clock. When a request is made to initiate a transaction, the computer site assigns a timestamp to the new transaction. Preferably, the timestamp includes a transaction clock count that corresponds to a current clock count for the computer site at that time. For example, if a customer deposits $100 at an automatic teller machine (ATM), a transaction request is generated by the computer site where the ATM is located in order to add $100 to the customer's account. This computer site determines the computer site corresponding to the customer's account by examining a key recorded on the customer's cash card. The computer site also examines a current clock count for the site that is provided by its clock and clock register. The computer site will generate a transaction request consisting of an operation (READ), a record key (100), and a table name (ACCOUNT), as shown in FIG. 2.
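A transaction request of this kind might be represented as follows. This is only a sketch: the field names, the helper functions, and the encoding of a clock count as minutes since 0:00 are assumptions made for illustration, and the pairing of the clock count with a site identifier follows the example of FIGS. 3A and 3B.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed encoding: (clock count in minutes since 0:00, site identifier).
Timestamp = Tuple[int, str]


@dataclass(frozen=True)
class TransactionRequest:
    operation: str        # e.g. "READ"
    record_key: int       # e.g. 100
    table_name: str       # e.g. "ACCOUNT"
    timestamp: Timestamp  # assigned from the site's current clock count


def make_request(site_id, clock_count, operation, record_key, table_name):
    """Stamp a new request with the requesting site's current clock count."""
    return TransactionRequest(operation, record_key, table_name,
                              (clock_count, site_id))


def format_timestamp(ts):
    minutes, site = ts
    return f"{minutes // 60}:{minutes % 60:02d} ({site})"


ta = make_request("A", 10 * 60, "READ", 100, "ACCOUNT")
tb = make_request("B", 10 * 60 + 1, "READ", 100, "ACCOUNT")

print(format_timestamp(ta.timestamp))   # 10:00 (A)
print(ta.timestamp < tb.timestamp)      # True
```

Because Python compares tuples element by element, the clock count decides the order and the site identifier only breaks ties between equal clock counts.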
FIGS. 3A and 3B show an example of two transactions that access the same data entry, together with the corresponding timestamp table, according to a timestamp method. It is assumed that computer site A and computer site B each have a clock and clock register that generate an accurate current clock count. When a transaction request is made by a customer or computer user at a branch corresponding to computer site A, computer site A assigns a transaction clock count and a computer site identifier to the transaction Ta. In this example, the current clock count is 10:00 and the identifier of the computer site is (A). As a result, computer site A assigns a timestamp 10:00 (A) to the transaction Ta, and will subsequently write this timestamp into the read timestamp field of the timestamp table corresponding to the customer's account when the transaction Ta reads data from the record. In this example, the write timestamp 0:00 is read out from the timestamp table, as shown in the upper part of FIG. 3B. The timestamp 10:00 (A) is larger than the write timestamp 0:00. Therefore, the transaction Ta reads data from the record and writes the timestamp 10:00 (A) in the read timestamp field of the timestamp table, as shown in the middle part of FIG. 3B.
FIG. 4 shows the data structure of a timestamp table and the corresponding database (the ACCOUNT table). The timestamp table is stored in advance in a memory separate from the database. Each entry of the timestamp table consists of a record key, a read timestamp, and a write timestamp. Each entry of the database consists of a key, a name, and a data entry. In this example, the data entry is the balance deposit because the database is the ACCOUNT table. Each entry of the timestamp table corresponds to the entry of the database that has the same key. For example, Mike's balance deposit was read and written by a transaction generated by computer site A at 10:00.
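The two structures described above can be sketched as two mappings that share the record key. The key value 100 and the balance of 150 assigned to Mike are hypothetical, since the contents of FIG. 4 are not reproduced here; timestamps are encoded as an assumed pair of minutes since 0:00 and site identifier.

```python
# Timestamp table, kept separately from the database and indexed by record key.
timestamp_table = {
    100: {"read_ts": (600, "A"), "write_ts": (600, "A")},  # 10:00 (A)
}

# Database (ACCOUNT table): key -> name and data entry (balance deposit).
# The key 100 and the balance 150 are hypothetical illustration values.
account_table = {
    100: {"name": "Mike", "balance": 150},
}


def describe(key):
    """Join a database entry with its timestamp entry through the shared key."""
    row = account_table[key]
    stamps = timestamp_table[key]
    return (row["name"], row["balance"], stamps["read_ts"], stamps["write_ts"])


print(describe(100))  # ('Mike', 150, (600, 'A'), (600, 'A'))
```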
Immediately after computer site A requests a transaction, another transaction request may be made at the branch corresponding to computer site B. Computer site B assigns the timestamp 10:01 (B) to the new transaction Tb if the current clock count at computer site B is 10:01 when the transaction request is generated. When computer site B sends a request for a READ operation to computer site A and computer site A reads the data entry at record key 100, the write timestamp 0:00 corresponding to record key 100 is read out. The timestamp 10:01 (B) is larger than the write timestamp 0:00. Therefore, the transaction Tb reads data from the record and writes the timestamp 10:01 (B) in the read timestamp field of the timestamp table, as shown in the lower part of FIG. 3B. When transaction Ta writes its execution result to the record, the read timestamp 10:01 (B) is read out from the timestamp table. The timestamp 10:00 (A) of transaction Ta is smaller than the read timestamp 10:01 (B). Therefore, the write processing of transaction Ta is rejected, and transaction Ta is aborted at computer site A. When transaction Tb writes its execution result to the record, the read timestamp 10:01 (B) is read out from the timestamp table. The timestamp 10:01 (B) of transaction Tb is not smaller than the read timestamp 10:01 (B). Therefore, the write processing of transaction Tb is accepted, and transaction Tb is committed (succeeds) at computer site B. (Transaction Ta is then assigned a larger timestamp and executed again.)
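The READ and WRITE checks walked through above can be sketched as follows. This is an illustrative reading of the example, not a definitive implementation: the class and exception names are hypothetical, timestamps are encoded as an assumed pair of minutes since 0:00 and site identifier, and the retry timestamp 10:02 (A) is an assumed value, since the text only states that Ta receives a larger timestamp. As in the example, an accepted READ simply overwrites the read timestamp field.

```python
class Rejected(Exception):
    """Raised when an operation carries a timestamp that is too small."""


class Record:
    def __init__(self, value):
        self.value = value
        self.read_ts = (0, "")    # 0:00, no READ yet
        self.write_ts = (0, "")   # 0:00, no WRITE yet

    def read(self, ts):
        # A READ is accepted unless a larger timestamp has already written.
        if ts < self.write_ts:
            raise Rejected("READ is older than the last WRITE")
        self.read_ts = ts         # overwritten on each accepted READ
        return self.value

    def write(self, ts, value):
        # A WRITE is rejected if a larger timestamp has already read or written.
        if ts < self.read_ts or ts < self.write_ts:
            raise Rejected("WRITE is older than the last READ or WRITE")
        self.write_ts = ts
        self.value = value


record = Record(50)
ta, tb = (600, "A"), (601, "B")    # 10:00 (A) and 10:01 (B)

ta_seen = record.read(ta)          # read timestamp becomes 10:00 (A)
tb_seen = record.read(tb)          # read timestamp becomes 10:01 (B)

try:
    record.write(ta, ta_seen + 100)
    ta_committed = True
except Rejected:
    ta_committed = False           # 10:00 (A) < 10:01 (B): Ta is aborted

record.write(tb, tb_seen + 30)     # accepted: balance becomes 80

ta_retry = (602, "A")              # assumed larger timestamp for the retry
record.write(ta_retry, record.read(ta_retry) + 100)

print(ta_committed, record.value)  # False 180
```

With the retry included, the record ends at the correct balance of $180, at the cost of one aborted execution of Ta.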
Unfortunately, the current clock counts in different computer sites are not always accurate and are not always the same. Even if the current clock counts in different computer sites are identical during an initialization mode, the difference in the current clock counts of two computer sites will become larger and larger as time progresses.
FIGS. 5A and 5B show an example of the attempted execution of two transactions by two computer sites that have different current clock counts, together with the corresponding timestamp table. In this example, the current clock count of computer site A is early when compared to the current clock count of computer site B, and the current clock count of computer site B is late or delayed when compared to the current clock count of computer site A.
In this example, when a transaction request Ta is made at the branch corresponding to computer site A, computer site A assigns a transaction clock count 10:00 and an identifier (A) to the transaction request. Preferably, transaction clock count 10:00 corresponds to the current clock count at computer site A. Immediately thereafter, a transaction request Tb is made at the branch corresponding to computer site B, and computer site B assigns a transaction clock count 9:55 and an identifier (B) to the transaction request. The transaction clock count 9:55 corresponds to the current clock count at computer site B, which is late when compared with the current clock count at computer site A.
When computer site A executes a READ operation according to the transaction Ta, the timestamp 10:00 (A) is written in the read timestamp field of the timestamp table, as shown in the upper part of FIG. 5B. This processing is the same as that shown in FIG. 3A. When computer site A executes a READ operation according to the transaction Tb, the write timestamp 0:00 is read out from the timestamp table. The timestamp 9:55 (B) is larger than the write timestamp 0:00. Therefore, computer site A reads the data entry at record key 100 and writes the timestamp 9:55 (B) in the read timestamp field, as shown in the middle part of FIG. 5B. When computer site A writes the execution result of transaction Ta to the record, the read timestamp 9:55 (B) is read out from the timestamp table. The timestamp 10:00 (A) is larger than the read timestamp 9:55 (B). Therefore, computer site A executes a WRITE operation and writes the timestamp 10:00 (A) in the write timestamp field of the timestamp table, as shown in the lower part of FIG. 5B. Then, transaction Ta is committed at computer site A, as shown in FIG. 5A. When computer site A writes the execution result of transaction Tb to the record, the read timestamp 9:55 (B) and the write timestamp 10:00 (A) are read out from the timestamp table. The timestamp 9:55 (B) is not smaller than the read timestamp 9:55 (B), but it is smaller than the write timestamp 10:00 (A). Therefore, the write processing of transaction Tb is rejected, and transaction Tb is aborted. After transaction Tb is aborted, the current clock count for computer site B may be 10:01. In this situation, it is preferable for computer site B to write a new transaction clock count of 10:01 into the data structure corresponding to transaction request Tb. As a result, a new timestamp 10:01 (B) is assigned to transaction Tb.
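The sequence of FIGS. 5A and 5B can be traced with a minimal sketch. The function names and the dictionary layout are assumptions for illustration; timestamps are encoded as an assumed pair of minutes since 0:00 and site identifier, and the checks follow the comparisons described above, including the overwriting of the read timestamp field by each accepted READ.

```python
def try_read(rec, ts):
    """Accept a READ unless a larger timestamp has already written the record."""
    if ts < rec["write_ts"]:
        return False
    rec["read_ts"] = ts           # the read timestamp field is overwritten
    return True


def try_write(rec, ts, value):
    """Accept a WRITE unless a larger timestamp has already read or written."""
    if ts < rec["read_ts"] or ts < rec["write_ts"]:
        return False
    rec["write_ts"], rec["value"] = ts, value
    return True


rec = {"value": 50, "read_ts": (0, ""), "write_ts": (0, "")}
ta, tb = (600, "A"), (595, "B")   # 10:00 (A); site B's late clock gives 9:55 (B)
log = []

log.append(("Ta READ", try_read(rec, ta)))
ta_seen = rec["value"]
log.append(("Tb READ", try_read(rec, tb)))       # read_ts becomes 9:55 (B)
tb_seen = rec["value"]
log.append(("Ta WRITE", try_write(rec, ta, ta_seen + 100)))  # accepted: 150
log.append(("Tb WRITE", try_write(rec, tb, tb_seen + 30)))   # 9:55 < 10:00

tb_retry = (601, "B")             # new timestamp 10:01 (B) after the abort
log.append(("Tb READ retry", try_read(rec, tb_retry)))
log.append(("Tb WRITE retry", try_write(rec, tb_retry, rec["value"] + 30)))

for step, accepted in log:
    print(step, "accepted" if accepted else "rejected")
print(rec["value"])  # 180
```

Only the first WRITE of Tb is rejected. The rejection is caused solely by the late clock at computer site B, since Tb was in fact requested after Ta.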
Now, if computer site A again attempts to execute a READ operation according to transaction Tb, transaction Tb will not be aborted because the write timestamp 10:00 (A) is smaller than timestamp 10:01 (B) assigned to transaction Tb. See FIG. 3.
Thus, if one computer site executing a new transaction attempts to execute a READ operation on the database at a particular record key, but the read/write timestamp corresponding to the data entry is later than the timestamp assigned to the new transaction, then the computer site must abort the new transaction. As the difference in current clock counts between different computer sites becomes larger, the number of transactions that must be aborted by the computer site with the late current clock count increases.
To address this problem, a method for adjusting current clock counts is disclosed in Japanese Patent Disclosure (Kokai) P59-55553. FIG. 6 shows a distributed computer system taught by this reference. In FIG. 6, three computer sites 60, 62, and 64 are already interconnected by a network, and a new computer site 66 is being connected to the network.
As disclosed in Japanese Patent Disclosure (Kokai) P59-55553, new computer site 66 inquires about the current clock counts provided by the clocks and clock registers in existing computer sites 60, 62, and 64. When new computer site 66 receives information indicating the current clock counts of each of the existing computer sites, a clock offset OFS is calculated. The clock offset OFS is the amount of time that the current clock count corresponding to new computer site 66 will be advanced (i.e., the clock is made early rather than late).
In a distributed computer system of the type disclosed in Japanese Patent Disclosure (Kokai) P59-55553, the clock offset OFS is calculated by new computer site 66 by first selecting the current clock count from all of the computer sites that corresponds to the clock that is the fastest (e.g., 10:05 is selected instead of 10:00). Then, a clock offset OFS is added to the current clock count of new computer site 66 that will make its current clock count identical to the current clock count of the computer site with the fastest (i.e., earliest) clock. This process of calculating a clock offset OFS is repeated whenever a new computer site is connected to the network.
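The offset calculation described above can be sketched as follows. This is a minimal illustration and not the implementation of the cited reference: the function name, the encoding of clock counts as minutes since 0:00, and the reported values other than 10:00 and 10:05 are assumptions. The sketch also assumes that the new site's own clock count is included when the fastest clock is selected, so that OFS is never negative.

```python
def clock_offset(own_count, reported_counts):
    """OFS: how far the new site's clock must be advanced to match the fastest clock."""
    fastest = max([own_count, *reported_counts])
    return fastest - own_count


# Clock counts in minutes since 0:00, reported by existing sites 60, 62, and 64.
reported = {60: 600, 62: 605, 64: 602}   # 10:00, 10:05, 10:02 (10:02 is hypothetical)
own = 598                                # hypothetical clock count 9:58 at new site 66

ofs = clock_offset(own, reported.values())
adjusted = own + ofs

print(ofs, adjusted)  # 7 605
```

After the adjustment, the clock count of new computer site 66 equals 10:05, the clock count of the fastest existing site.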
However, when using the method disclosed in Japanese Patent Disclosure (Kokai) P59-55553, it is difficult to accurately adjust the current clock count of the new computer site. Information indicating the current clock counts of each of the existing computer sites is received by new computer site 66 only after a communication delay, and this communication delay must be estimated correctly to determine the amount of time that the current clock count corresponding to new computer site 66 should be advanced.
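The effect of the communication delay can be illustrated with simple arithmetic. All values below are hypothetical, clock counts are expressed in seconds since 0:00, and the function name is an assumption for illustration.

```python
def clock_from_report(reported_count, estimated_delay):
    """Clock count the new site adopts after compensating for the assumed delay."""
    return reported_count + estimated_delay


reported = 10 * 3600 + 5 * 60   # remote clock count 10:05:00 when the reply is sent
true_delay = 3                  # seconds actually spent on the communication line
estimated_delay = 1             # seconds assumed by the new computer site

remote_now = reported + true_delay                        # remote clock on arrival
local_now = clock_from_report(reported, estimated_delay)  # clock set at the new site
residual_error = remote_now - local_now

print(residual_error)  # 2
```

Any error in the delay estimate is carried directly into the adjusted clock count; in this case the new site remains 2 seconds late.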
Moreover, the method disclosed in Japanese Patent Disclosure (Kokai) P59-55553 does not provide any method for adjusting the current clock counts of existing computer sites. As time passes, the differences between the current clock counts of these computer sites will increase.