The present invention relates in general to replicated database systems, and, more particularly, to a system and method for improving the reliability of replicated database systems.
Many industry applications, such as airline reservation systems, banking applications and telecommunications, require reliable database transaction processing. A transaction is a sequence of actions or operations that must be performed in its entirety or not at all. Any degradation in reliability has a negative impact on the customer (e.g., the individual making an airline reservation) as well as the service provider (e.g., the airline). For example, if airline transactions are not reliably processed, customers may experience flight overbooking and, as a result, may choose to fly a different airline.
Historically, systems requiring highly reliable transaction processing have been built using highly fault tolerant computing hardware. Typically these systems are guaranteed to function up to 99.9999% of the time, which equates to only about 30 seconds of downtime/failure time per year. Such a highly fault tolerant computing system is also service reliable as it is nearly always available. Service reliability is a primary objective in designing such computing hardware.
Fault tolerant systems are very expensive and tend to lag behind the “technology innovation curve” by several years. For example, a fault tolerant system may only support on the order of megabytes of memory while a non-fault tolerant system may support on the order of gigabytes of memory. The fault tolerant system providers must transform commercially available hardware into reliability hardened hardware for fault tolerant systems, which typically takes two to three years. This transformation process typically entails providing hardware component redundancy and sophisticated hardware failure detection. The underlying objective is to provide a system that rarely fails and is thus nearly always available.
In recent years, non-fault tolerant computing systems have become very inexpensive and provide orders of magnitude greater performance in both memory capacities and processor speeds. While the performance of the computing systems has increased, the cost of such computing systems has decreased significantly, such that a number of replicated non-fault tolerant computing systems is inexpensive compared to the corresponding fault tolerant computing system. Service reliability may be achieved utilizing less expensive but less reliable computing systems. Instead of utilizing expensive fault tolerant systems, service reliability can be achieved by utilizing redundant non-fault tolerant systems. The idea is to provide an appropriate number of fully replicated systems so that individual system failures do not negatively affect service reliability. Since these systems provide orders of magnitude more performance and capacity, the end result is higher performance and lower cost computing networks with service reliability that is as good as, and often better than, existing networks utilizing fault tolerant systems.
However, there are a number of factors that must be addressed so that a transaction is reliably processed in the midst of system failures. Transaction processing is typically divided into two distinct categories: executing the intent of the transaction and updating/reading the database; and, maintaining the transaction state data required to execute the transaction. There are well understood methods of reliably performing the first category of transaction processing and assuring that data already in a database is stored reliably, e.g., two-phase-commit protocols, redundant array of independent disks (RAID) and shared disk systems.
Transaction state data (TSD) is transient data associated with a single transaction. It is data that is maintained for the life of a transaction to assure that all the parts/messages of the transaction are executed correctly. There is a difference between the data corresponding to the records in a database, e.g., customer specific data, and TSD. Each record in a database includes data unique to that record while TSD is created by the computing system to assure that multiple messages for a particular transaction are correlated appropriately. TSD is also created to store intermediate data needed for subsequent message processing for the transaction.
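The distinction between database records and TSD can be sketched as follows. This is only an illustrative sketch: the class names, field names and sample values are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerRecord:
    """Persistent data unique to one record in the database (hypothetical fields)."""
    customer_id: str
    pin: str
    balance_cents: int


@dataclass
class TransactionStateData:
    """Transient data kept only for the life of one transaction (hypothetical fields)."""
    transaction_id: str   # correlates the messages of one transaction
    customer_id: str      # identifies the record the transaction touches
    intermediate: dict = field(default_factory=dict)  # data for later messages


# Records and TSD are kept in separate tables: records persist, TSD does not.
records = {"cust-1": CustomerRecord("cust-1", "1234", 5000)}
tsd_table = {}

# The first message of a transaction creates the TSD.
tsd_table["txn-1"] = TransactionStateData("txn-1", "cust-1", {"origin": "555-0100"})

# When the transaction concludes, its TSD is discarded; the record remains.
finished = tsd_table.pop("txn-1")
```

In this sketch the TSD table is keyed by a transaction identifier, which is one plausible way to correlate multiple messages of the same transaction.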
Referring now to FIG. 1, a typical replicated database system 10 is illustrated. The replicated database system 10 comprises a querying system 20 and a logical database 30. The querying system 20 is the system from which all transactions originate. The querying system 20 is configured to generate transactions accessing records in the logical database 30 in response to requests by one of a number of database users 60 accessing the database system 10. For example, the querying system 20 could be part of an airline reservation system with users accessing and entering reservation data via terminals or a telecommunications switch that is sending transactions to a calling card database for card and personal identification number (PIN) validation. It is the system that requires reliable transaction processing and data storage.
The logical database 30 is the system to which the querying system 20 sends transaction messages for reliable processing and data storage. It is referred to as a logical database because it is actually comprised of multiple physical databases. In the illustrated system, the logical database 30 comprises two physical databases, an active database 40 and a backup database 50. The logical database 30 also comprises a reliable transaction distributor 70 which receives transaction messages from the querying system 20 and transmits them to one of the databases 40, 50. The reliable transaction distributor 70 also receives response messages from one of the databases 40, 50 and transmits them to the querying system 20. It will be appreciated by those skilled in the art that the reliable transaction distributor 70 may be located within the querying system 20. It will be further appreciated by those skilled in the art that the querying system 20 and the logical database 30 may each comprise a reliable transaction distributor. It will be even further appreciated by those skilled in the art that the reliable transaction distributor 70 is viewed logically as one system but could be comprised of a plurality of physical systems.
The exact number of physical databases within the logical database 30 is completely transparent to the querying system 20. The reliable transaction distributor 70 is the only component that needs to keep track of the number of physical databases. The databases 40, 50 are fully replicated with each database comprising an identical set of records. The active database 40 is the database that receives transaction messages from the reliable transaction distributor 70 during normal operations, i.e., absent any failures in the active database 40. The backup database 50 is available for use whenever the active database 40 is experiencing problems.
A typical transaction is described below. Message 1 of the transaction is sent from the querying system 20 to the logical database 30 for processing (step 1). The reliable transaction distributor 70 receives message 1 and transmits it to the active database 40 for processing (step 2). The active database 40 retrieves data from an appropriate record in the database and creates internal TSD for subsequent message processing of the transaction. The active database 40 then creates and sends a response message based on the data retrieved from the database and message 1 to the reliable transaction distributor 70 (step 3) for transmission to the querying system 20 (step 4). The querying system 20 then monitors the transaction and transmits a termination message to the logical database 30 (step 5) at the conclusion of the transaction from the view of the querying system 20. The reliable transaction distributor 70 receives the termination message and transmits it to the active database 40 for processing (step 6). The active database 40 processes the termination message using the TSD. The active database 40 also updates the appropriate records in the database as necessary. The active database 40 transmits the updated data to the backup database 50 (step 7). The backup database 50 updates the appropriate records in the database so that the databases 40, 50 remain fully replicated.
The above illustration could be used in a telecommunication system where a customer uses a pre-paid calling card to place a call. The querying system 20 transmits customer identification data and call originating data in the form of message 1 to the logical database 30 for customer and PIN identification. The active database 40 confirms the customer information from the database along with the balance available on the card. The TSD created by the active database 40 includes message correlation data and call originating data for deriving billing information. The response message instructs the querying system 20 that the customer identification data has been verified and that the call may proceed. The querying system 20 connects the call and monitors the call to determine when the call is completed and disconnected. The querying system 20 determines the duration of the call and transmits this information in the form of the termination message to the logical database 30. The active database 40 derives a charge for the call and subtracts the charge from the balance in the database. The new balance is then transmitted to the backup database 50 so that the databases 40, 50 remain fully replicated.
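The pre-paid calling card transaction described above can be sketched in a few lines. This is a simplified illustration only: the class and method names, the per-minute rate and the sample balance are assumptions made for the example, and the reliable transaction distributor is omitted for brevity.

```python
class Database:
    """One physical database: persistent records plus transient TSD (sketch)."""

    def __init__(self, balances):
        self.balances = dict(balances)  # records: card id -> balance in cents
        self.tsd = {}                   # transaction id -> transaction state data

    def handle_first(self, txn_id, card_id):
        # Read the record, create TSD and build a response (steps 2 and 3).
        self.tsd[txn_id] = {"card_id": card_id}
        return {"txn_id": txn_id, "ok": self.balances[card_id] > 0}

    def handle_termination(self, txn_id, minutes, rate_cents=10):
        # Correlate via the TSD, derive the charge and update the record (step 6).
        state = self.tsd.pop(txn_id)    # raises KeyError if the TSD is missing
        card_id = state["card_id"]
        self.balances[card_id] -= minutes * rate_cents  # hypothetical flat rate
        return card_id, self.balances[card_id]


active = Database({"card-1": 500})
backup = Database({"card-1": 500})

response = active.handle_first("txn-1", "card-1")               # steps 1 to 4
card_id, new_balance = active.handle_termination("txn-1", 12)   # steps 5 and 6
backup.balances[card_id] = new_balance                          # step 7
```

Note that in this flow only the updated record reaches the backup database; the TSD exists solely in the active database for the duration of the call.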
Referring now to FIG. 2, the above transaction will be described assuming that the active database 40 fails after the response message is created and transmitted to the reliable transaction distributor 70 (step 3). The reliable transaction distributor 70 receives the response message and transmits it to the querying system 20 (step 4). Using the calling card example, the call is then connected. The querying system 20 transmits the termination message to the logical database 30 (step 5). Since the active database 40 has failed, the reliable transaction distributor 70 transmits the termination message to the backup database 50 (step 6′). Unfortunately, the backup database 50 does not contain any TSD for the transaction such that message correlation fails and the message is not processed. The transaction itself fails since all of the steps of the transaction could not be completed.
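The failure mode just described can be reproduced with a small sketch. The names below are hypothetical, and availability is modeled as a simple flag rather than real failure detection.

```python
class Database:
    """Physical database that keeps TSD only for messages it has processed."""

    def __init__(self):
        self.tsd = {}

    def handle_first(self, txn_id, payload):
        self.tsd[txn_id] = payload      # TSD is created only where message 1 runs
        return "response"

    def handle_termination(self, txn_id):
        if txn_id not in self.tsd:
            return "correlation failed"  # no TSD, so the message cannot be processed
        del self.tsd[txn_id]
        return "completed"


active, backup = Database(), Database()
active_available = True

active.handle_first("txn-1", {"origin": "555-0100"})   # steps 1 to 3
active_available = False                                # the active database fails

# The distributor reroutes the termination message to the backup database.
target = active if active_available else backup
outcome = target.handle_termination("txn-1")
```

Here `outcome` is `"correlation failed"` because the backup database never saw the first message and therefore holds no TSD for the transaction.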
Accordingly, there is a need for a replicated database system and a method for processing a transaction in such a system that allows a transaction to be completed even after the active database fails. There is another need for such a replicated database system that utilizes non-fault tolerant components but maintains service reliability. Preferably, such a system is relatively easy to implement and cost effective.
The present invention meets these needs by providing a database system comprising a querying system and a logical database having an active database and a backup database. The querying system transmits a message for a transaction to the logical database for processing. The message is transmitted to the active database where the message is processed. The active database creates transaction state data based, in part, on the message. The active database transmits a response message to the querying system and forwards the original message to the backup database. The backup database processes the original message and creates its own transaction state data. The transaction state data in the backup database matches the transaction state data in the active database so that if the active database fails, the backup database includes the requisite transaction state data necessary to complete the transaction.
The querying system processes the response message and transmits a termination message to the logical database for processing. The termination message is transmitted to the active database and processed. The termination message is forwarded to the backup database from the active database and similarly processed. The transaction is then complete and both databases are fully replicated.
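A minimal sketch of this forwarding scheme follows, assuming hypothetical class and method names. Each message is processed by the active database and then forwarded unchanged to the backup database, which builds its own TSD; the backup's response is simply discarded in this sketch.

```python
class Database:
    """Physical database that forwards every message to its backup, if any."""

    def __init__(self, name, backup=None):
        self.name = name
        self.backup = backup   # only the active database holds a backup reference
        self.tsd = {}

    def handle_first(self, txn_id, payload):
        self.tsd[txn_id] = payload                  # create TSD locally
        response = {"txn_id": txn_id, "from": self.name}
        if self.backup is not None:
            # Forward the original message; the backup's response is not sent on.
            self.backup.handle_first(txn_id, payload)
        return response

    def handle_termination(self, txn_id):
        self.tsd.pop(txn_id)                        # use and release the TSD
        if self.backup is not None:
            self.backup.handle_termination(txn_id)
        return "completed"


backup = Database("backup")
active = Database("active", backup=backup)

response = active.handle_first("txn-1", {"origin": "555-0100"})
tsd_matches = active.tsd == backup.tsd   # both databases now hold matching TSD
outcome = active.handle_termination("txn-1")
```

Forwarding the original message, rather than copying the TSD itself, lets the backup database derive its state through the same processing path as the active database.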
According to a first aspect of the present invention, a method for processing a transaction in a replicated database system is provided. The database system comprises a plurality of replicated databases, an active database and at least one backup database. Transaction messages for the transaction are received from a querying system in the active database. The transaction messages are processed for the transaction in the active database. The active database creates response messages based on the transaction messages. The active database also creates active database transaction state data representative of the transaction. The response messages are transmitted to the querying system. The transaction messages for the transaction are forwarded from the active database to the backup database. The transaction messages of the transaction are processed in the backup database with the backup database creating backup database transaction state data representative of the transaction.
The step of processing the transaction messages of the transaction in the backup database may further comprise the step of creating a suppressed response message in the backup database based on the transaction messages. Alternatively, the step of processing the transaction messages of the transaction in the backup database does not comprise the step of creating a response message in the backup database based on the transaction messages forwarded by the active database. The step of forwarding the transaction messages for the transaction from the active database to the backup database may comprise the step of transmitting control header data from the active database to the backup database with the control header data enabling the backup database to process the transaction messages consistent with the active database. The control header data may comprise unique transaction identification data representative of the transaction or time stamp data. The steps of the method may be repeated for a plurality of transactions. The method may further comprise the step of serializing the plurality of transactions so that the plurality of transactions are processed in the same order in both the backup database and the active database.
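One way the control header data and serialization could work together is sketched below. This is an assumption-laden illustration, not the method of the specification: the header carries a transaction identifier plus a monotonically increasing sequence number (used here in place of time stamp data), and the backup holds back any message that arrives ahead of its turn. All names are hypothetical.

```python
import heapq
import itertools


class Active:
    """Active database side: stamps each forwarded message with a header."""

    def __init__(self):
        self._seq = itertools.count(1)
        self.order = []                  # order in which transactions were processed

    def process(self, txn_id, body):
        self.order.append(txn_id)
        header = {"txn_id": txn_id, "seq": next(self._seq)}
        return header, body              # what gets forwarded to the backup


class Backup:
    """Backup database side: processes messages strictly in sequence order."""

    def __init__(self):
        self._pending = []               # min-heap ordered by sequence number
        self._next_seq = 1
        self.order = []

    def receive(self, header, body):
        heapq.heappush(self._pending, (header["seq"], header["txn_id"], body))
        # Process only while the next expected sequence number is available.
        while self._pending and self._pending[0][0] == self._next_seq:
            _, txn_id, _ = heapq.heappop(self._pending)
            self.order.append(txn_id)
            self._next_seq += 1


active, backup = Active(), Backup()
forwarded = [active.process(t, "msg") for t in ("txn-a", "txn-b", "txn-c")]

# Deliver the forwarded messages out of order to exercise the reordering.
for header, body in (forwarded[1], forwarded[0], forwarded[2]):
    backup.receive(header, body)
```

After delivery, `backup.order` matches `active.order` even though the second message arrived first, which is the property serialization is meant to guarantee.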
According to another aspect of the present invention, a method for processing a transaction in a replicated database system is provided. The database system comprises a plurality of replicated databases, an active database and at least one backup database. A first message for the transaction is transmitted from a querying system to the database system. The first message for the transaction is processed in the active database with the active database creating a response message based on the first message. The active database also creates active database transaction state data representative of the transaction. The response message is transmitted to the querying system. The first message for the transaction is forwarded from the active database to the backup database. The first message for the transaction is processed in the backup database with the backup database creating backup database transaction state data representative of the transaction. A second message for the transaction is transmitted from the querying system to the database system. The second message for the transaction is processed in the active database using the active database transaction state data. The second message for the transaction is forwarded from the active database to the backup database. The second message for the transaction is processed in the backup database using the backup database transaction state data.
The step of processing the first transaction message of the transaction in the backup database may further comprise the step of creating a suppressed response message in the backup database based on the first transaction message. Alternatively, the step of processing the first transaction message of the transaction in the backup database does not comprise the step of creating a response message in the backup database based on the first transaction message forwarded by the active database. The step of forwarding the first message for the transaction from the active database to the backup database may further comprise the step of transmitting control header data from the active database to the backup database with the control header data enabling the backup database to process the first message consistent with the active database. The control header data may comprise unique transaction identification data representative of the transaction or time stamp data. The steps of the method may be repeated for a plurality of transactions. The method may further comprise the step of serializing the plurality of transactions so that the plurality of transactions are processed in the same order in both the backup database and the active database.
The step of processing the first message for the transaction in the active database may comprise the step of storing the active database transaction state data in the active database. The step of processing the first message of the transaction in the backup database may comprise the step of storing the backup database transaction state data in the backup database. The step of processing the first message for the transaction in the active database may comprise the step of accessing at least one of a plurality of records in the active database. The step of processing the first message of the transaction in the backup database may comprise the step of accessing at least one of a plurality of records in the backup database.
According to yet another aspect of the present invention, a method for processing a transaction in a replicated database system is provided. The database system comprises a plurality of replicated databases, an active database and at least one backup database. A first message for the transaction is transmitted from a querying system to the database system. The first message for the transaction is processed in the active database with the active database creating a response message based on the first message. The active database also creates active database transaction state data representative of the transaction. The response message is transmitted to the querying system while the first message for the transaction is forwarded from the active database to the backup database. The first message of the transaction is processed in the backup database with the backup database creating backup database transaction state data representative of the transaction. A second message for the transaction is transmitted from the querying system to the database system. If the active database is available, then the second message is processed by the active database using the active database transaction state data. The second message for the transaction is also forwarded from the active database to the backup database and processed using the backup database transaction state data. Otherwise, the second message for the transaction is processed by the backup database using the backup database transaction state data.
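The conditional handling of the second message can be sketched as follows. The names are hypothetical, and availability is represented by a boolean flag as a stand-in for whatever failure detection an actual system would use.

```python
class Database:
    """Physical database that forwards messages to its backup, if any."""

    def __init__(self, backup=None):
        self.backup = backup
        self.tsd = {}
        self.available = True

    def handle_first(self, txn_id, payload):
        self.tsd[txn_id] = payload
        if self.backup is not None:
            self.backup.handle_first(txn_id, payload)   # backup builds its own TSD
        return "response"

    def handle_second(self, txn_id):
        self.tsd.pop(txn_id)               # a KeyError here would mean missing TSD
        if self.backup is not None and self.backup.available:
            self.backup.handle_second(txn_id)
        return "completed"


def route_second(active, backup, txn_id):
    """Send the second message to the active database if available, else the backup."""
    if active.available:
        return "active", active.handle_second(txn_id)
    return "backup", backup.handle_second(txn_id)


backup = Database()
active = Database(backup=backup)

active.handle_first("txn-1", {"origin": "555-0100"})
active.available = False                   # the active database fails after responding
handled_by, outcome = route_second(active, backup, "txn-1")
```

Because the first message was forwarded before the failure, the backup database already holds the TSD it needs and the transaction completes.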
The step of forwarding the first message for the transaction from the active database to the backup database may comprise the step of transmitting control header data from the active database to the backup database with the control header data enabling the backup database to process the first message consistent with the active database. The control header data may comprise unique transaction identification data representative of the transaction or time stamp data. The steps of the method may be repeated for a plurality of transactions. The method may further comprise the step of serializing the plurality of transactions so that the plurality of transactions are processed in the same order in both the backup database and the active database.
According to a further aspect of the present invention, a transaction processing system comprises a database system and a database querying system. The database system comprises a plurality of replicated databases including an active database and at least one backup database. The active database comprises an active database processor and the backup database comprises a backup database processor. The database querying system comprises a database querying system processor configured to access the database system for processing transactions. The database querying system processor is programmed to transmit a first message for one of the transactions to the database system for processing. The active database processor is programmed to process the first message thereby creating a response message and active database transaction state data representative of the transaction. The active database processor is programmed to transmit the response message to the querying system and to forward the first message to the backup database for processing by the backup database processor. The backup database processor is programmed to process the first message thereby creating backup database transaction state data representative of the transaction. The database querying system is further programmed to transmit a second message for the transaction to the database system for processing. The active database processor is further programmed to process the second message using the active database transaction state data, and to forward the second message to the backup database. The backup database processor is further programmed to process the second message using the backup database transaction state data.
The backup database processor may be programmed to create a suppressed response message in the backup database based on the first transaction message. Alternatively, the backup database processor may be programmed not to create a response message in the backup database based on the first transaction message forwarded by the active database. The active database processor may be further programmed to transmit control header data along with the first message to the backup database so that the backup database processor can process the first message consistent with the active database processor. The control header data may comprise unique transaction identification data representative of the transaction or time stamp data. The database querying system processor, the active database processor and the backup database processor may be programmed to process a plurality of transactions. Preferably, the active database processor is programmed to serialize the plurality of transactions so that the plurality of transactions are processed in the same order in the backup database as in the active database.
Preferably, the active database comprises an active database memory for storing the active database transaction state data. The backup database may comprise a backup database memory for storing the backup database transaction state data. The database system may further comprise a plurality of backup databases. The active database processor may be further programmed to access at least one of a plurality of records in the active database to process the first message. Similarly, the backup database processor may be further programmed to access at least one of a plurality of records in the backup database to process the first message.
According to a still further aspect of the present invention, a transaction processing system comprises a database system and a database querying system. The database system comprises a plurality of replicated databases including an active database and at least one backup database. The active database comprises an active database processor and the backup database comprises a backup database processor. The database querying system comprises a database querying system processor configured to access the database system for processing transactions. The database querying system processor is programmed to transmit a first message for one of the transactions to the database system for processing. The active database processor is programmed to process the first message thereby creating a response message and active database transaction state data representative of the one transaction. The active database processor is further programmed to transmit the response message to the querying system for processing, and to forward the first message for the one transaction to the backup database for processing. The backup database processor is programmed to process the first message thereby creating backup database transaction state data representative of the transaction. The database querying system processor is further programmed to transmit a second message for the one transaction to the database system for processing. If the active database is available, then the second message is processed by the active database processor. The active database processor is further programmed to process the second message using the active database transaction state data, and to forward the second message to the backup database. The backup database processor is further programmed to process the second message using the backup database transaction state data. Otherwise, the second message for the transaction is processed by the backup database processor. The backup database processor is programmed to process the second message using the backup database transaction state data.
Accordingly, it is an object of the present invention to provide an improved database system and a method of processing a transaction in such a system that allows a transaction to be completed even after an active database fails. It is another object of the present invention to provide such a replicated database system that utilizes non-fault tolerant components but maintains required service reliability. It is yet another object of the present invention to provide such a system that is relatively easy to implement and cost effective. Other features and advantages of the invention will be apparent from the following description, the accompanying drawings and the appended claims.