The present invention relates generally to the field of database failover, and more particularly to database failover in systems with database management system (DBMS) clusters.
In the field of computer science and computing, failover is defined as switching to a standby or redundant computer server, computer system, hardware component, or computer network upon the failure or abnormal termination of a previously active application, server, system, network, or hardware component. Failover is typically applied automatically and usually operates without warning. Designers of computer systems typically provide failover capability in servers, systems, or networks that require continuous availability. Failback, conversely, is the process of restoring a system, network, service, or component that is in a state of failover to the state it was in before the failure occurred.
At a server level, failover automation typically uses a physical connection between two (2) servers. As long as the connection between the first (main) server and the second server remains intact, the second server will not initiate, or turn on, its systems. The second server takes over the work of the first server as soon as it detects an alteration in its connection to the first server. There may also be a third server that has running spare components for “hot switching” to prevent downtime. In addition, some systems have the ability to send a notification of failover.
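By way of illustration only, the connection-monitoring behavior described above might be sketched as follows. The sketch assumes that the second server judges the connection by periodic heartbeat messages and a timeout; the class name `StandbyServer`, its methods, and the timeout value are hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass
class StandbyServer:
    """Second server that stays idle while heartbeats from the first server arrive."""
    timeout_s: float               # hypothetical: silence tolerated before takeover
    last_heartbeat_s: float = 0.0  # time at which the last heartbeat was seen
    active: bool = False           # True once this server has taken over

    def on_heartbeat(self, now_s: float) -> None:
        # A heartbeat shows the connection is intact; record when it arrived.
        self.last_heartbeat_s = now_s

    def check(self, now_s: float) -> bool:
        # Take over once the first server has been silent longer than the timeout.
        if not self.active and now_s - self.last_heartbeat_s > self.timeout_s:
            self.active = True
        return self.active


standby = StandbyServer(timeout_s=3.0)
standby.on_heartbeat(now_s=1.0)
print(standby.check(now_s=2.0))  # False: heartbeat is recent, standby stays idle
print(standby.check(now_s=5.0))  # True: 4 s of silence exceeds the 3 s timeout
```

Time is passed in explicitly rather than read from a clock so that the takeover decision can be followed step by step; a deployed monitor would more likely run `check` on a timer.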
Clustering is one of the common technologies adopted by DBMS companies to obtain continuous database availability. Each cluster (herein also referred to as a group) consists of multiple database servers (also known as members). An advanced clustering configuration involves multiple clusters, where one cluster is active (called the primary) and the members within that cluster are responsible for servicing all applications, with active transactions distributed among the members according to different workload balancing algorithms. The remaining clusters are on standby and take over the role of the primary cluster only in the event that the primary cluster goes down.
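By way of illustration only, the primary and standby arrangement described above might be sketched as follows. The sketch assumes round-robin distribution as the workload balancing algorithm and promotes the first available standby when the primary is down; the names `Cluster`, `ClusterSet`, and `route` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    members: list[str]  # database servers belonging to this cluster
    up: bool = True     # False once the whole cluster has gone down


@dataclass
class ClusterSet:
    clusters: list[Cluster]
    primary_index: int = 0  # the cluster currently acting as primary
    _next: int = 0          # round-robin position within the primary

    def primary(self) -> Cluster:
        current = self.clusters[self.primary_index]
        if current.up:
            return current
        # The primary is down: promote the first standby cluster that is up.
        for i, candidate in enumerate(self.clusters):
            if candidate.up:
                self.primary_index, self._next = i, 0
                return candidate
        raise RuntimeError("no cluster is available")

    def route(self) -> str:
        # Assumed balancing policy: plain round-robin over the primary's members.
        cluster = self.primary()
        member = cluster.members[self._next % len(cluster.members)]
        self._next += 1
        return f"{cluster.name}/{member}"


group = ClusterSet([Cluster("A", ["a1", "a2"]), Cluster("B", ["b1", "b2"])])
print([group.route() for _ in range(3)])  # ['A/a1', 'A/a2', 'A/a1']
group.clusters[0].up = False              # the primary cluster goes down
print(group.route())                      # B/b1
```

In this sketch a standby cluster receives no transactions until it is promoted, which matches the configuration described above, where only the members of the primary cluster service applications.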