This invention relates to main-memory database systems and, in particular to a serial protocol for improved transaction execution in main-memory database systems under certain workloads.
Many applications, particularly in telecommunications, have response-time and throughput requirements which cannot be met by conventional, disk-based database systems. For example, number translation, fraud prevention and pre-pay billing all require database accesses during the critical set-up phase of a telephone call (between the time when the caller completes dialing and when they receive the ring-back tone). These database accesses must meet response-time goals on the order of tens of milliseconds, which is an order of magnitude faster than conventional, disk-based systems can deliver.
Consider, for instance, the problem of detecting and preventing fraudulent use of a telephone network. One approach to this problem is to maintain summaries, termed xe2x80x98fraud signaturesxe2x80x99, for each customer. Fraud signatures capture each customer""s typically usage pattern, and are updated with each call. They can be used to detect irregular usage patternsxe2x80x94with similarity to known fraudulent patternsxe2x80x94when and if they occur. If a call is identified as potentially fraudulent, then it is rerouted to a voice recognition system for authentication, or to an operator. This application requires a fraud database to be accessed during call set-up, and updated in a timely manner upon call completion. The database workload consists of rifleshot transactions, i.e. short, response-time critical transactions, mainly keyed by customer identifier, with a roughly 50xe2x80x9450 mix of read-only transactions to update transactions.
Another network application requiring database processing during call set-up is toll-free number mapping. In the US, xe2x80x98toll-free numbersxe2x80x99 are those where the charge for a call is assigned to the callee (rather than to the caller). However, a toll-free number is also mapped (within the network) to different terminating numbers based on criteria such as the day, the time of day, and the load-balancing policies in effect. For example, calls using the same toll-free number may be mapped to an eastern location during the morning, and to a western location in a different time zone during the evening and at night. During the day, a load-balancing algorithm may be used to distribute calls between these two locations. For this application, the database workload again consists of response-time critical, rifleshot transactions, mainly keyed by the toll-free number. While there may be tens of millions of read-only transactions per day, there are typically only a few thousand updates.
These applicationsxe2x80x94and other similar network applicationsxe2x80x94exhibit the following characteristics and requirements:
The predominance of short, xe2x80x98rifleshotxe2x80x99 transactions, where each transaction consists of one or a small number of keyed table accesses using existing indexes that are frequently hashed;
In many cases (including those mentioned above), predictable response timesxe2x80x94on the order of only tens of millisecondsxe2x80x94are required such that transaction processing can take place during the critical set-up phase of a telephone call; and
Transactions do not block while processing for factors such as user input or disk I/O (other than for logging).
To meet the demands of such applications, a main-memory database system (xe2x80x9cMMDBSxe2x80x9d) is often used. A main-memory database system is a database system in which all database data is normally resident in high-speed main memory (e.g. random access memory). Secondary memory, which is slower than main memory (e.g. disk storage), is used only for logs, checkpoints, and archive copies of the database. During normal processing, the only I/O operations to secondary memory are those for logging and checkpointing.
To illustrate the performance of MMDBSxe2x80x2 consider the existing MMDBS described in Bohannon, The Architecture of the Dali Main Memory Storage Manager, The Bell Technical Journal, 2(1):36-47, Winter 1997). In a competitive performance evaluation comparing such a MMDBS to a commercial, disk-based relational Data Base Management System (RDBMS) (using an update-intensive telecommunications application), the MMDBS demonstrated both throughput and response times an order of magnitude better than its disk-based counterpart. With falling memory prices, general-purpose main-memory database systems have become a cost-effective alternative to custom systems, and are increasingly being used for applications with stringent performance requirements.
Many architectural issues must be reconsidered in the design of an MMDBS, one of which is concurrency. Disk-based systems execute transactions concurrently in order to improve both throughput and response time. Concurrency allows one transaction to continue processing while another is blocked waiting for an I/O operation to complete. If several transactions generate I/O requests at roughly the same time, then those requests can be scheduled in an order which optimizes some metric such as throughput, or CPU utilization. In order to prevent anomalies due to interference between concurrent transactions, many database systems use locking protocols such as strict two-phase locking (or xe2x80x98strict 2PLxe2x80x99). (See e.g. Bernstein, concurrency Control and Recovery in Database Systems, Addison Wesley, 1987). Under strict 2PL, locks are acquired before database operations are executed, and released only when the transaction either commits or aborts. Therefore, a read transaction will have access to a database file only after a previous update transaction commits and the read transaction will never view the modifications resulting from an update transaction that were uncommitted and later unwound.
While strict 2PL ensures correctness, the acquiring and releasing locks generate a system overhead that has an adverse impact on both throughput and response time. Also, locking protocols are generally prone to deadlock and unpredictable response times which further affects system performance. Nevertheless, in disk-based database systems, these effects are marginalized by the performance advantages of concurrent execution.
In MMDBSs, however, the locking overhead resulting from the use of locking protocols has a serious impact on system performance. Thus, there is an unmet need in the prior art to provide a transaction execution protocol for MMDBSs that assures that only committed data is accessible and that also eliminates the overhead associated with the prior art locking techniques.
The present invention is directed to overcoming the shortcomings of the prior art. The present invention provides a method for improving the performance of a system having a main-memory database which has at least one database containing at least one data item, and having a permanent storage device. The system executes at least one update transaction for modifying the data item to produce update results, and at least one read transaction for reading the data item. The method of the present inventions includes the steps of requiring the update transaction to acquire a database mutex before the update transaction modifies the data item. As is known in the art, a mutex is generally defined as a synchronization object that allows multiple threads to serialize their access to shared data. Next a timestamp is assigned to the update transaction and to the data item to be updated by the update transaction. After the update transaction updates the data item and before it release the database mutex, the update transaction acquires a mutex associated with the timestamp. Before the update results are stored to the permanent storage device, the update transaction releases the database mutex. Finally, the update transaction releases the mutex associated with the timestamp after the update results are stored to the permanent storage device. Before a read transaction is allowed to read the data item, the read transaction must acquire the database mutex. To ensure that the read transaction read data that was committed by the predecessor update transaction, the read transaction must, before terminating, acquire and release the mutex associated with the timestamp. Accordingly, by using the method of the present invention, a transaction execution protocol for MMDBSs is provided that assures only committed data is accessible. Also, because the method of the present invention forces transactions that are vying for the same data item to execute serially, the overhead associated with the prior art locking techniques is eliminated.
Other objects and features of the present invention will become apparent from the following detailed description, considered in conjunction with the accompanying drawing figures. It is to be understood, however, that the drawings, which are not to scale, are designed solely for the purpose of illustration and not as a definition of the limits of the invention, for which reference should be made to the appended claims.