Field
The present disclosure relates to service availability. More specifically, the present disclosure relates to a method and system for transparently providing high availability to services.
Related Art
High availability enables a system to provide continuous services with minimal or no disruption in a failure scenario. However, supporting high availability in an existing or new system can be complex, error-prone, and costly. Consequently, deployment of critical high availability features to systems often faces delays and causes unwanted disruption to services. Technology vendors usually provide fault resilient services using an active-standby model. In this model, all services (also referred to as applications) run in an active system, and all services requiring high availability (which can be referred to as fault resilient services or applications) replicate and synchronize their critical states in a standby system. The active or standby system can be a physical or virtual device. If the active system suffers a hardware or software failure, the replicated fault resilient services in the standby system take over and resume the operations without disruption.
Fault resilient applications running on an active system usually use the synchronization infrastructure provided in an operating system (OS) to replicate state changes to the corresponding standby system. However, in this approach, a respective application is responsible for managing and synchronizing the application states. These states are known only to the application and the application is required to serialize and de-serialize the states. The application is also responsible for sending the states via the operating system synchronization services to the standby system.
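The application-managed approach described above can be illustrated with a minimal sketch. All names here (SyncChannel, RoutingApp, add_route) are hypothetical and chosen only for illustration; the sketch assumes a synchronization service that transports opaque bytes and has no knowledge of the application's state format.

```python
import json
from dataclasses import dataclass, field


class SyncChannel:
    """Hypothetical stand-in for an OS synchronization service.

    It only transports opaque byte payloads; it does not understand them.
    """

    def __init__(self):
        self.delivered = []

    def send(self, app_name: str, payload: bytes) -> None:
        self.delivered.append((app_name, payload))


@dataclass
class RoutingApp:
    """Hypothetical fault resilient application that owns its state format."""

    name: str = "routing"
    routes: dict = field(default_factory=dict)

    def serialize(self) -> bytes:
        # The application alone knows how its state is encoded.
        return json.dumps(self.routes, sort_keys=True).encode("utf-8")

    def deserialize(self, payload: bytes) -> None:
        self.routes = json.loads(payload.decode("utf-8"))

    def add_route(self, prefix: str, next_hop: str, channel: SyncChannel) -> None:
        self.routes[prefix] = next_hop
        # The application is also responsible for pushing the update out.
        channel.send(self.name, self.serialize())


channel = SyncChannel()
active = RoutingApp()
standby = RoutingApp()

active.add_route("10.0.0.0/8", "192.168.1.1", channel)

# On the standby side, the same application code must decode the payload.
for app_name, payload in channel.delivered:
    standby.deserialize(payload)
```

In this sketch, every step of the replication path (encoding, sending, decoding) lives in application code, which is the burden the paragraph above describes.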
Different applications running on a system can have states which have interdependencies. For example, some operations can cause state updates for a plurality of applications. The operating system synchronization service usually does not provide any coordinated state synchronization across these multiple related applications. A respective fault resilient application synchronizes its states with the standby system independently of any other application in the system. As a result, an application needs to explicitly inform other related applications regarding the state updates.
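One possible consequence of such independent synchronization can be sketched as follows. This is an illustrative assumption rather than a description of any particular system: the names (Standby, App, create_vlan) and the simulated failure point are hypothetical. A single operation updates two related applications, each application synchronizes on its own, and a failure between the two synchronizations leaves the standby system with only part of the update.

```python
class Standby:
    """Hypothetical standby system holding one state copy per application."""

    def __init__(self):
        self.state = {}

    def receive(self, app_name, state):
        self.state[app_name] = dict(state)


class App:
    """Hypothetical application that synchronizes independently of others."""

    def __init__(self, name, standby):
        self.name = name
        self.state = {}
        self.standby = standby

    def update(self, key, value):
        self.state[key] = value

    def sync(self):
        self.standby.receive(self.name, self.state)


def create_vlan(vlan_id, port, vlan_app, port_app, fail_between=False):
    """One operation that changes the state of two related applications."""
    vlan_app.update(vlan_id, {"members": [port]})
    port_app.update(port, {"vlan": vlan_id})

    vlan_app.sync()
    if fail_between:
        # Simulate the active system failing before the second sync.
        return
    port_app.sync()


standby = Standby()
vlan_app = App("vlan", standby)
port_app = App("port", standby)

create_vlan(10, "eth1", vlan_app, port_app, fail_between=True)

# The standby now knows about the VLAN but not the port membership.
partially_synced = "vlan" in standby.state and "port" not in standby.state
```

Because nothing groups the two synchronizations into one coordinated unit, the standby in this sketch ends up with the VLAN application's update but not the port application's.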
While high availability brings many desirable features to applications, some issues remain unsolved in providing transparency and coordination to the high availability synchronization process.