1. Field of the Invention
This invention relates to a method, system, and program for automatically monitoring data consistency across a database and, more particularly, to a method for auditing, asynchronously, transactions between application servers and a database server.
2. Description of the Related Art
Computers and the worldwide web are now a major backbone of businesses of all kinds, including traditional “bricks and mortar” businesses as well as web-based or E-businesses. E-commerce sites perform thousands of financial transactions on a daily basis. Banks and other financial institutions have both walk-in customers and customers utilizing electronic transaction methods (computer banking, ATM machines and the like); all of these transactions are executed via computer systems accessing and updating one or more databases via a database server.
Obviously, accuracy is vital to the performance of these -massive numbers of calculations and transactions. A single error can compound into an unlimited number of subsequent errors if it is considered that that single error generates an incorrect number that is then used by all further transactions occurring thereafter. The impact of such errors can be devastating to anyone or any organization relying on the accuracy of the numbers.
It is a significant challenge, in web-based or E-business-based applications developed to run on application server middleware, to ensure and verify application data integrity in the database. In these scenarios, multiple execution threads from one or more application servers may simultaneously read and update the same data. It is common in these scenarios for two or more “execution threads” to attempt to simultaneously read, update and then write back a new value to the same database row. This can result in a subtle, but very serious, silent error, causing application data to be usable, but invalid. This is referred to herein as data corruption.
As an example, consider a banking scenario, where two threads may attempt to update the balance of a user's account simultaneously. This could easily happen in a real-life scenario where a husband and wife each access their bank account via separate ATM machines at about the same time. The following table illustrates the two separate threads. In Table 1 below, data corruption in the form of an incorrect user account balance will occur if Thread 1 and Thread 2 process the account credit method in the alphabetically-ordered sequence. This example assumes an initial user balance of $1,000.
TABLE 1Thread 1Thread 2a) Account.credit ($200)d) Account.credit ($100)b) currentBalance = $1,000e) currentBalance = $1,000(getBalanceFromDB( ))(getBalanceFromDB( ))c) newBalance = $1,200f) newBalance = $1,100($1,000 + 200)($1,000 + 100)g) updateBalanceInDB ($1,200)h) updateBalanceInDB ($1,100)
In this example, the user's account is credited $200 (Thread 1) and $100 (Thread 2). This is equivalent to the husband depositing $200 and the wife depositing $100. Therefore, the ending balance should be $1,300. However, the final update in Statement H leaves the user's balance at $1,100. This may go unnoticed for an extended period of time, obviously, a very serious problem. The reason for this error is simple and inconspicuous; Thread 2 commences accessing the data before Thread 1 has completed the updating step, statement G. Thus, it begins the transaction without taking into consideration the $200 credit of Thread 1. The updating step of Statement H simply overwrites the updating step of Statement G.
Verifying application data integrity is currently a very difficult problem to solve. This is due to several challenges. First, data corruption is a subtle and silent error. That is, data corruption, such as in the example given in Table 1, does not produce an application error of any kind. Data can be corrupted very subtly and is impossible to detect without a serialized account of the transaction execution.
In addition, very small timing windows in relation to the length of transactions can cause data corruption. These timing windows will not generally be encountered in a simple test environment, but will show up in a production, high-concurrency transaction processing environment. These timing windows may be relatively rare and therefore data validation must be able to be enabled easily and for long periods of time without affecting normal application flow and processing.
Data corruption problems have been attempted to be solved, e.g., using leading database software programs, by increasing the default “isolation level” of transactions. Isolation level is a database-specific configuration setting. By increasing the isolation level, transactions can be serialized to guarantee data integrity. That is, each transaction must be completed before the next transaction is commenced. This course grain approach diminishes the number of data errors. However, performance is severely affected, because transactions are serialized and not processed concurrently. Accordingly, this option is rarely chosen except in a test environment.
The second and more common option for solving this problem involves the application (or application server) correctly allocating locks on individual database rows to enforce data integrity. In the scenario of Table 1, if Statements B and E both acquire an “update” lock on the database row for the user's account balance as the row is read, the first thread to reach the statement will be permitted to read the row data. The second thread to execute the statement also attempts to allocate an update lock and will not be permitted to read the row data until the first thread's transaction completes. This provides data integrity and a correct account balance.
This fine grain application locking method provides the best solution for data integrity. However, the key to using the application locking method to achieve data integrity successfully is to minimize row locking to avoid deadlocks and transaction failures, along with the ability to verify data integrity at run time. Thus, locking provides a limited solution, since less locking means less data integrity. In addition, databases do not support update locking in various scenarios such as “join selects”, aggregated queries, etc. In these scenarios, it is impossible for the database to ensure data integrity with standard isolation settings.
Another approach is to integrate application level trace code to write out full transaction details in order to detect the problem. This highly intrusive approach can severely slow an application runtime or otherwise change runtime characteristics such that data corruption is not encountered. This type of tracing is not generally acceptable to a production environment.
Accordingly, there is a need for technology that provides production level, high performance, integrated, “real-time” and automatic/visual data integrity verification.