Some embodiments of the present disclosure are directed to an improved approach for implementing failover and resume when using ordered sequences in a multi-instance database environment. More particularly, disclosed herein according to some embodiments are a method and system for implementing failover and resume when using ordered sequences in a multi-instance database environment.
Overview of Ordered Sequences
In a modern database system for processing transactions (e.g., commercial transactions such as purchase orders, debits, credits, etc.), many users can use the system at the same time, and many users may have the same sorts of operations to perform. For example, if a user, say User A, desires to process a batch of purchase orders, he or she might want to assign each purchase order in the batch a unique number. Moreover, it might be desired that those unique numbers fall within a contiguous sequence (e.g., PO-0001, PO-0002, PO-0003, etc.).
One technique is to maintain a large range of contiguous values from which any user can ‘check out’ a contiguous sequence. For example, if User A desired to process a batch of, say, 20 purchase orders, he or she might request a sequence comprising 20 contiguous values (e.g., 0001, 0002, 0003, . . . 0020). However, a different user, say User B, might at the same time also desire to process a batch of purchase orders, and could at the same time request a sequence comprising 20 contiguous values. One legacy technique for ensuring that User A and User B do not receive the same sequence of 20 contiguous values is to force all requests to be serialized. There are various techniques for serializing requests, often involving a flag or latch (or any implementation of a semaphore). When a flag or latch is used, a first user (say User A) is granted access to the range of contiguous values, while any subsequent users must wait. The first user is then given the requested sequence (in this example, numbers 0001-0020), and the next waiting user's request is processed. Given that the first user's request was satisfied (thus, the next available value would be 0021), the first waiting user's request (e.g., for a sequence of 20 contiguous values) can be satisfied by returning the sequence 0021, 0022, 0023 through 0040, and so on.
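The serialized check-out described above can be illustrated with a minimal single-process sketch. The class name `SequenceAllocator`, the method `check_out`, and the use of a Python `threading.Lock` as the latch are assumptions made for illustration only; they are not taken from the disclosure, and a multi-instance database would need a latch that is visible across instances rather than an in-process lock.

```python
import threading


class SequenceAllocator:
    """Hypothetical allocator that hands out contiguous, non-overlapping ranges."""

    def __init__(self, start=1):
        self._next = start              # next value that has not been handed out
        self._latch = threading.Lock()  # stands in for the flag/latch that serializes requests

    def check_out(self, count):
        """Return `count` contiguous values; other callers wait while the latch is held."""
        if count <= 0:
            raise ValueError("count must be positive")
        with self._latch:
            first = self._next
            self._next += count         # advance past the values just granted
        return range(first, first + count)


allocator = SequenceAllocator()
batch_a = allocator.check_out(20)   # User A: values 1 through 20
batch_b = allocator.check_out(20)   # User B: values 21 through 40
labels_a = [f"PO-{n:04d}" for n in batch_a]
```

Because every request must acquire the same latch, no two users can receive overlapping values, but each request also waits behind all earlier ones, which is the serialization cost noted above.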
Many application environments operate on mission-critical data that might need the aforementioned ordered sequences. In such environments, resilience and redundancy are provided by implementing database environments comprising multiple instances of a database, each of which might share at least some of the components provided in the environment. When a failure does occur, some mission-critical applications need a “graceful” failover from one database instance to another database instance. Similarly, after a failed component has been repaired or replaced, the mission-critical applications need a “graceful” resume. The sense of “graceful” here includes both a satisfactory restoration after failure/resume and satisfactory performance before, during, and after a failure.
Some techniques have been tried in which entire checkpoints are taken periodically and saved so that processing can be resumed after a failure. However, such techniques do not perform “gracefully” in general, nor are they capable of implementing “graceful” failover and resume operations.
Therefore, there is a need for an improved approach for implementing failover and resume when using ordered sequences in a multi-instance database environment.