1. Field of the Invention
The present invention relates to a method and system for providing cluster replicated checkpoint services. In particular, the present invention relates to a cluster replicated checkpoint service ("CRCS"), which provides services for components to maintain a checkpoint and its replicas. In so doing, the CRCS allows components to recover promptly and seamlessly from failures, and thus ensures high availability of the services they provide.
2. Discussion of the Related Art
Networked computer systems enable users to share resources and services. One computer can request and use resources or services provided by another computer. The computer requesting and using the resources or services provided by another computer is typically known as a client, and the computer providing resources or services to another computer is known as a server.
A group of independent network servers may be used to form a cluster. Servers in a cluster are organized so that they operate, and appear to clients, as if they were a single unit. A cluster and its network may be designed to improve network capacity by, among other things, enabling the servers within a cluster to shift work in order to balance the load. By enabling one server to take over for another, a cluster may be used to enhance stability and minimize downtime caused by an application or system failure.
Today, networked computer systems including clusters are used in many different aspects of our daily lives. They are used, for example, in business, government, education, entertainment, and communication. As networked computer systems and clusters become more prevalent and our reliance on them increases, it has become increasingly important to achieve the goal of always-on computer networks, or "high-availability" systems.
High-availability systems need to detect and recover from a failure in a way that is transparent to their users. For example, if a server in a high-availability system fails, the system must detect and recover from the failure with little or no impact on clients.
Various methods have been devised to achieve high availability in networked computer systems including clusters. For example, one method known as triple modular redundancy, or "TMR," is used to increase fault tolerance at the hardware level. Specifically, with TMR, three instances of the same hardware module execute concurrently. By comparing the results of the three hardware modules and using the majority result, one can detect a failure of any one of the hardware modules. However, TMR does not detect or recover from a failure of software modules. Another method for achieving high availability is software replication, in which a software module that provides a service to a client is replicated on at least two different nodes in the system. While software replication overcomes some disadvantages of TMR, it suffers from its own problems, including the need for complex software protocols to ensure that all of the replicas have the same state.
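The majority comparison described for TMR can be illustrated in software. The following is only a minimal sketch: the function name `majority_vote` and the representation of module results as plain values are assumptions made for illustration and do not describe any particular hardware voter.

```python
from collections import Counter


def majority_vote(results):
    """Compare three module results (hypothetical helper, for illustration).

    Returns the majority value and the indices of any modules whose
    result disagrees with it.
    """
    if len(results) != 3:
        raise ValueError("TMR compares exactly three module results")
    value, count = Counter(results).most_common(1)[0]
    if count < 2:
        # All three results differ, so no majority exists.
        raise RuntimeError("no majority among the three module results")
    faulty = [i for i, r in enumerate(results) if r != value]
    return value, faulty
```

With results such as `[7, 7, 9]`, the sketch returns the majority value 7 and flags the third module (index 2) as the one that disagrees.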
The use of replication of hardware or software modules to achieve high-availability raises a number of new problems including management of replicated hardware and software modules. The management of replicas has become increasingly difficult and complex, especially if replication is done at the individual software and hardware level. Further, replication places a significant burden on system resources.
When replication is used to achieve high availability, one needs to manage redundant components and have an ability to assign work from failing components to healthy ones. However, telling a primary component to restart, or a secondary component to take over, is not sufficient to ensure continuity of services. To achieve a seamless fail-over, the successor needs to pick up where the failing component left off. This means that secondary components need to know what the last stable state of the primary component was.
One way of passing information regarding the state of the primary component is to use checkpoints. A checkpoint may be a file containing information that describes the state of the primary component at a particular time. Because checkpoints play a crucial role in achieving high availability, there is a need for a system and method for providing reliable and efficient cluster replicated checkpoint services.
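The role of a checkpoint in fail-over can be sketched as follows. This is a hedged illustration only: the `Component` class, its `processed` counter, and the dictionary layout of the checkpoint are hypothetical and are not taken from the invention.

```python
import time


class Component:
    """Hypothetical service component whose state is a request counter."""

    def __init__(self):
        self.processed = 0

    def handle_request(self):
        self.processed += 1

    def take_checkpoint(self):
        # Describe the component's state at a particular time.
        return {"taken_at": time.time(), "state": {"processed": self.processed}}

    def restore(self, checkpoint):
        # Resume from the last stable state recorded by the predecessor.
        self.processed = checkpoint["state"]["processed"]


def fail_over(checkpoint):
    """Start a successor that picks up where the failed component left off."""
    successor = Component()
    successor.restore(checkpoint)
    return successor
```

In this sketch, a secondary component created by `fail_over` continues counting from the value stored in the checkpoint rather than from zero.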
The present invention provides a system and method for providing cluster replicated checkpoint services. In particular, the present invention provides a cluster replicated checkpoint service for managing a checkpoint and its replicas to make a cluster highly available.
To achieve these and other advantages and in accordance with the purposes of the present invention, as embodied and broadly described herein, the present invention describes a method for providing cluster replicated checkpoint services for replicas of a checkpoint in a cluster. The cluster includes a first node and a second node, which are connected to one another via a network. The replicas include a primary replica and a secondary replica. The method includes managing the checkpoint that contains checkpoint information, and creating the primary replica in a memory of the first node. The primary replica contains first checkpoint information. The method also includes updating the primary replica so that the first checkpoint information corresponds to the checkpoint information, creating the secondary replica that contains second checkpoint information in a memory of the second node, and updating the secondary replica so that the second checkpoint information corresponds to the checkpoint information.
In another aspect, the invention includes a method for providing cluster replicated checkpoint services for replicas of a checkpoint in a cluster. The cluster includes a first node and a second node, which are connected to one another via a network. The replicas include a primary replica and a secondary replica. The method includes creating the checkpoint, opening the checkpoint from the first node in a write mode, and creating the primary replica in a memory of the first node. It also includes updating the checkpoint, updating the primary replica, and propagating a checkpoint message that includes information regarding the checkpoint. Further, the method includes opening the checkpoint from the second node in a read mode, creating the secondary replica in a memory of the second node, and updating the secondary replica based on the checkpoint message.
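The sequence of steps in this aspect can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the claimed implementation: the `Node` and `Checkpoint` classes, the `"w"` and `"r"` mode strings, the key/value form of the checkpoint information, and the message log that lets a late reader catch up are all hypothetical.

```python
class Node:
    """Hypothetical cluster node; `replica` models the replica in its memory."""

    def __init__(self, name):
        self.name = name
        self.replica = None


def apply_message(node, message):
    """Update a node's replica based on a checkpoint message."""
    node.replica[message["key"]] = message["value"]


class Checkpoint:
    """Hypothetical checkpoint shared by the nodes of a cluster."""

    def __init__(self, name):
        self.name = name
        self.info = {}      # the checkpoint information
        self.writer = None  # node holding the primary replica
        self.readers = []   # nodes holding secondary replicas
        self.log = []       # assumption: past messages are kept for late readers

    def open(self, node, mode):
        if mode == "w":
            if self.writer is not None:
                raise RuntimeError("checkpoint already open in write mode")
            self.writer = node
            node.replica = {}  # create the primary replica
        elif mode == "r":
            node.replica = {}  # create the secondary replica
            for message in self.log:
                apply_message(node, message)
            self.readers.append(node)
        else:
            raise ValueError("mode must be 'w' or 'r'")

    def update(self, node, key, value):
        if node is not self.writer:
            raise PermissionError("only the writer may update the checkpoint")
        self.info[key] = value     # update the checkpoint
        node.replica[key] = value  # update the primary replica
        message = {"checkpoint": self.name, "key": key, "value": value}
        self.log.append(message)
        for reader in self.readers:  # propagate the checkpoint message
            apply_message(reader, message)
        return message
```

A first node would call `open(first, "w")` and then `update(...)`, while a second node would call `open(second, "r")`; after each update, the secondary replica in the second node's memory matches the primary replica.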
In yet another aspect, the invention includes a computer program product configured to provide cluster replicated checkpoint services for replicas of a checkpoint in a cluster. The cluster includes a first node and a second node, which are connected to one another via a network. The replicas include a primary replica and a secondary replica. The computer program product includes computer readable program codes configured to: (1) manage the checkpoint that contains checkpoint information; (2) create the primary replica with first checkpoint information in a memory of the first node; (3) update the primary replica so that the first checkpoint information corresponds to the checkpoint information; (4) create the secondary replica with second checkpoint information in a memory of the second node; and (5) update the secondary replica so that the second checkpoint information corresponds to the checkpoint information. The computer program product also includes a computer readable medium in which the computer readable program codes are embodied.
In a further aspect, the invention includes a computer program product configured to provide cluster replicated checkpoint services for replicas of a checkpoint in a cluster. The cluster includes a first node and a second node, which are connected to one another via a network. The replicas include a primary replica and a secondary replica. The computer program product includes computer readable program codes configured to: (1) create the checkpoint; (2) open the checkpoint from the first node in a write mode; (3) create the primary replica in a memory of the first node; (4) update the checkpoint; (5) update the primary replica; and (6) propagate a checkpoint message that includes information regarding the checkpoint. The computer program product further includes computer readable program codes configured to: (1) open the checkpoint from the second node in a read mode; (2) create the secondary replica in a memory of the second node; and (3) update the secondary replica based on the checkpoint message. It also includes a computer readable medium in which the computer readable program codes are embodied.
In yet a further aspect, the invention includes a system for providing cluster replicated checkpoint services for replicas of a checkpoint in a cluster. The cluster includes a first node and a second node, which are connected to one another via a network. The replicas include a primary replica and a secondary replica. The system includes means for: (1) managing the checkpoint with checkpoint information; (2) creating the primary replica with first checkpoint information in a memory of the first node; (3) updating the primary replica so that the first checkpoint information corresponds to the checkpoint information; (4) creating the secondary replica with second checkpoint information in a memory of the second node; and (5) updating the secondary replica so that the second checkpoint information corresponds to the checkpoint information.
In another aspect, the invention includes a system for providing cluster replicated checkpoint services for replicas of a checkpoint in a cluster. The cluster includes a first node and a second node, which are connected to one another via a network. The replicas include a primary replica and a secondary replica. The system includes means for: (1) creating the checkpoint; (2) opening the checkpoint from the first node in a write mode; (3) creating the primary replica in a memory of the first node; (4) updating the checkpoint; (5) updating the primary replica; (6) propagating a checkpoint message with information regarding the checkpoint; (7) opening the checkpoint from the second node in a read mode; (8) creating the secondary replica in a memory of the second node; and (9) updating the secondary replica based on the checkpoint message.
Finally, in another aspect, the invention includes a system for managing a checkpoint. The system includes a first node that runs a primary component, holds in its memory a primary replica having first checkpoint information, has a first checkpoint service, and is connected to a network. The system also includes a second node that runs a secondary component, holds a secondary replica in its memory, has a second checkpoint service, and is connected to the network. The first checkpoint service and the second checkpoint service are capable of accessing the checkpoint. The first checkpoint service works with the primary component to update the checkpoint, issue a checkpoint message containing information regarding the checkpoint, asynchronously propagate the checkpoint message, and update the primary replica. The second checkpoint service is capable of updating the secondary replica based on the checkpoint message.
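Asynchronous propagation of a checkpoint message between two checkpoint services can be sketched with a queue and a worker thread. This is an illustrative assumption only: the class names, the in-process queue standing in for the network, and the `wait_until_applied` helper are hypothetical and are not drawn from the invention.

```python
import queue
import threading


class SecondaryCheckpointService:
    """Hypothetical second checkpoint service; applies messages in the background."""

    def __init__(self):
        self.replica = {}           # secondary replica in the second node's memory
        self.inbox = queue.Queue()  # stands in for the network link
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        while True:
            message = self.inbox.get()
            # Update the secondary replica based on the checkpoint message.
            self.replica[message["key"]] = message["value"]
            self.inbox.task_done()

    def wait_until_applied(self):
        # Block until every message received so far has been applied.
        self.inbox.join()


class PrimaryCheckpointService:
    """Hypothetical first checkpoint service used by the primary component."""

    def __init__(self, secondary):
        self.replica = {}  # primary replica in the first node's memory
        self.secondary = secondary

    def update(self, key, value):
        self.replica[key] = value
        # Hand off the message and return without waiting for the secondary.
        self.secondary.inbox.put({"key": key, "value": value})
```

Because `update` returns as soon as the message is queued, the primary component is not delayed by the secondary node; the trade-off in this sketch is that the secondary replica may briefly lag behind the primary replica.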
Additional features and advantages of the invention are set forth in the description that follows, and in part are apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention are realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.