1. Field of the Invention
This invention pertains to software-based fault tolerant computer systems, computer networks, telecommunications systems, embedded computer systems, wireless devices such as cell phones and PDAs, and more particularly to methods, systems and procedures (i.e., programming) for live migration of applications both in response to an external event and in response to a system fault. Generally, live migration is triggered by an event, which allows for orderly live application migration between two operational systems. The present invention furthermore provides live migration as an element of fault recovery, i.e., live migration is used to let an application continue execution on a backup server in the event that the primary server crashes.
2. Description of Related Art
In many environments one of the most important features is to ensure that a running application continues to run even in the event of one or more system or software faults. Mission critical systems in telecommunications, military, financial and embedded applications must continue to provide their service even in the event of hardware or software faults. The auto-pilot on an airplane is designed to continue to operate even if some of the computers and instrumentation are damaged; the 911 emergency phone system is designed to operate even if the main phone system is severely damaged; and stock exchanges deploy software that keeps the exchange running even if some of the routers and servers go down. Today, the same expectations of “fault-free” operations are being placed on commodity computer systems and standard applications.
Fault tolerant systems are based on the use of redundancy (replication) to mask faults. For hardware fault tolerance, servers, networking or subsystems are replicated. For application fault tolerance, the applications are replicated. Faults on the primary system or application are masked by having the backup system or application (the replica) take over and continue to provide the service. The take-over after a fault at the primary system is delicate and often very system or application specific.
Several approaches have been developed addressing the fundamental problem of providing fault tolerance. Tandem Computers (http://en.wikipedia.org/wiki/Tandem_computer) is an example of a computer system with custom hardware, custom operating system and custom applications, offering transaction-level fault tolerance. In this closed environment, with custom applications, operating system and hardware, a fault on the primary system can be masked down to the transaction boundary and the backup system and application take over seamlessly. Fault detection and failover are performed in real-time.
In many telecommunication systems fault tolerance is built in. Redundant line cards are provided within the switch chassis, and if one line card goes down, the switching fabric automatically re-routes traffic and live connections to a backup line card. As with the Tandem systems, many telecommunications systems are essentially closed systems with custom hardware, custom operating systems and custom applications. Fault detection and failover are performed in real-time.
In enterprise software systems the general approach taken is the combined use of databases and high availability. By custom programming the applications with hooks for high-availability it is generally possible to detect and recover from many, but not all, types of faults. In enterprise systems, it is typically considered “good enough” to recover the application's transactional state, and there are often no hard requirements that the recovery be performed in real-time. In general, rebuilding the transactional state for an application server can take as much as 30 minutes or longer. During this time, the application service, an e-commerce website for instance, is unavailable and cannot service customers. The very slow fault recovery can to some extent be alleviated by extensive use of clustering and highly customized applications, as evidenced by Amazon.com and ebay.com, but that is generally not a viable choice for most deployments.
In U.S. Pat. No. 7,228,452 Moser et al teach “transparent consistent semi-active and passive replication of multithreaded application programs”. Moser et al disclose a technique to replicate running applications across two or more servers. The teachings are limited to single process applications and only address replica consistency as it relates to mutex operations and multi-threading. Moser's invention does not require any modification to the applications and works on commodity operating systems and hardware. Moser is incorporated herein in its entirety by reference.
The present invention builds on the teachings in U.S. patent application Ser. No. 12/877,144 titled SYSTEM AND METHOD FOR TRANSPARENT CONSISTENT APPLICATION-REPLICATION OF MULTI-PROCESS MULTI-THREADED APPLICATIONS, U.S. patent application Ser. No. 12/851,706 filed Aug. 6, 2010 titled SYSTEM AND METHOD FOR TRANSPARENT CONSISTENT APPLICATION-REPLICATION OF MULTI-PROCESS MULTI-THREADED APPLICATIONS, U.S. patent application Ser. No. 12/877,598 titled SYSTEM AND METHOD FOR RELIABLE NON-BLOCKING MESSAGING FOR MULTI-PROCESS APPLICATION REPLICATION, and U.S. patent application Ser. No. 12/877,651 titled SYSTEM AND METHOD FOR RELIABLE NON-BLOCKING MESSAGING FOR MULTI-PROCESS APPLICATION REPLICATION, wherein systems and methods for application replication and non-blocking messaging are disclosed.
The present invention also builds on the teachings of U.S. patent application Ser. No. 11/213,678 filed Aug. 25, 2005 titled METHOD AND SYSTEM FOR PROVIDING HIGH AVAILABILITY TO COMPUTER APPLICATIONS, U.S. patent application Ser. No. 12/334,660 filed Dec. 15, 2008 titled METHOD AND SYSTEM FOR PROVIDING CHECKPOINTING TO WINDOWS APPLICATION GROUPS, and U.S. patent application Ser. No. 12/334,651 filed on Dec. 15, 2008 titled METHOD AND SYSTEM FOR PROVIDING STORAGE CHECKPOINTING TO A GROUP OF INDEPENDENT COMPUTER APPLICATIONS, wherein systems and methods for checkpointing of Windows and Linux applications and fault detection are disclosed.
Replication relies on communicating information between servers. The communication often relies on one of the core networking protocols, such as UDP or TCP. UDP, for instance, transmits messages without implicit handshaking and thus does not guarantee delivery, ordering or data integrity. TCP uses a more rigorous protocol to ensure some level of reliable, ordered delivery of messages. In the event of faults, however, such as network or server faults, TCP cannot guarantee delivery, ordering or integrity. The present invention provides a reliable messaging protocol built on either TCP or UDP which ensures ordered, reliable delivery of messages.
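By way of illustration only, the general idea of layering ordered, reliable delivery on top of an unreliable transport can be sketched with sequence numbers, cumulative acknowledgements and retransmission. The sketch below is not the messaging protocol of the present invention or of the referenced applications; the names Sender, Receiver and demo are hypothetical, and the transport is simulated in memory rather than carried over actual UDP or TCP sockets.

```python
from dataclasses import dataclass, field


@dataclass
class Sender:
    """Hypothetical sender: numbers each message and keeps it until acknowledged."""
    next_seq: int = 0
    unacked: dict = field(default_factory=dict)

    def send(self, payload):
        packet = (self.next_seq, payload)
        self.unacked[self.next_seq] = payload  # retained for possible retransmission
        self.next_seq += 1
        return packet

    def ack(self, cumulative_seq):
        # Cumulative ack: every message up to and including cumulative_seq arrived.
        for seq in [s for s in self.unacked if s <= cumulative_seq]:
            del self.unacked[seq]

    def pending(self):
        # Messages that still need to be retransmitted, in sequence order.
        return sorted(self.unacked.items())


@dataclass
class Receiver:
    """Hypothetical receiver: drops duplicates and delivers strictly in order."""
    expected: int = 0
    buffer: dict = field(default_factory=dict)
    delivered: list = field(default_factory=list)

    def receive(self, packet):
        seq, payload = packet
        if seq >= self.expected:
            self.buffer.setdefault(seq, payload)  # ignore duplicates of buffered packets
        # Deliver every message that is now contiguous with what was delivered before.
        while self.expected in self.buffer:
            self.delivered.append(self.buffer.pop(self.expected))
            self.expected += 1
        return self.expected - 1  # cumulative ack; -1 means nothing delivered yet


def demo():
    sender, receiver = Sender(), Receiver()
    packets = [sender.send(m) for m in ["a", "b", "c", "d"]]
    # Simulated faulty transport: packet 1 is lost, packet 3 overtakes packet 2,
    # and packet 0 is duplicated.
    for p in [packets[0], packets[3], packets[0], packets[2]]:
        sender.ack(receiver.receive(p))
    # Retransmit whatever has not been acknowledged.
    for p in sender.pending():
        sender.ack(receiver.receive(p))
    return receiver.delivered, sender.pending()
```

In this sketch the receiver hands messages to the application only once every earlier sequence number has arrived, so loss, duplication and reordering in the underlying transport are hidden from the application. A practical implementation would additionally need timers, flow control and handling of server restarts, none of which are shown here.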
Live migration is a technique generally used to move a running application or virtual machine from a primary server to a backup server in response to an operator command or a programmatic event. Live migration thus happens in response to an external event, which allows for a deterministic migration process. Both the primary and the backup server must stay operational during the live migration.
Conversely, if the primary crashes there is no ability to migrate the application or virtual machine to the backup, as the primary no longer is operational. So even though the primary application or VM could have migrated at an earlier time, now that the primary server is gone, the ability to migrate it is gone, too.
Therefore, a need exists for systems and methods for providing live migration of applications and virtual machines in response to both external events and faults. The live migration must ensure non-stop operation of the application and transparently switch from the primary to the backup. In the event of a fault, the fault recovery must furthermore continue to service clients even though clients were disconnected from the primary at the time of the fault. Finally, the live migration must work on commodity operating systems, such as Windows and Linux, and commodity hardware with standard applications.