The present invention relates to high availability (HA) computer networks or clusters and, more specifically, to a protocol for use therewith that provides enhanced processing.
The term high availability (HA) computer system (or network or cluster) refers to a group of computers that are interconnected in some capacity such that if one or more fail, the remaining computers take over the processing of the failed computer(s).
Currently, failover or redundancy operations are carried out by application software, such as NCR LifeKeeper or Microsoft Wolfpack, that is based on an open standard. While the open standard is beneficial in that it allows application software to run on a broad base of computers, current practices are disadvantageous in that a significant amount of supporting code has to be written to permit HA application software to run on any particular computer system. In other words, every implementation under the current open standard requires a significant amount of supporting code to couple the HA application software to the particular platform on which it is to be used.
Another disadvantageous aspect of current HA computer arrangements relates to the use of low level system drivers. Most current clustering implementations utilize industry standard communication interfaces, such as sockets, to communicate. The communication between clustered servers, therefore, relies on lower level system drivers and protocols such as NetBIOS, TCP/IP, Ethernet and others. This reliance on such an extensive list of lower level drivers significantly decreases the direct control that the HA application software has over the communication paths on which it relies to detect server problems. Additionally, these lower level drivers are used not only by the HA application software, but also by other application programs. Thus, it is conceivable that the drivers of all the common/networked communication devices could be busy with other application programs when the HA application software needs to attempt a critical communication.
Yet another disadvantageous aspect of current HA computer arrangements relates to processing of the “heartbeat” (HB) signal. The HB signal is a signal that is propagated between computers in a cluster for the purpose of transmitting status information and confirming that each machine is running properly. The HB signal may be propagated over any of the common links of a cluster arrangement, and potential links include Ethernet, modem, serial port, parallel port and shared disk links. It is known in the prior art to send a signal over a first link and to designate a second link to be used when the first link fails. A problem with this approach, however, is that the pre-selected failover or secondary link may not be the best link at the time of an actual failure. For example, the secondary link may be in use by another program while an adjacent path is available.
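The limitation of a pre-selected secondary link can be illustrated with a minimal sketch. The `Link` class, its `up` and `busy` flags, and the function `pick_heartbeat_link` are hypothetical names introduced here for illustration only; they are not taken from any prior-art product or from the invention described herein.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical model of one cluster link (illustrative assumption)."""
    name: str
    up: bool = True      # link is operational
    busy: bool = False   # link is currently occupied by another program


def pick_heartbeat_link(primary: Link, secondary: Link) -> Link:
    """Prior-art style selection: use the primary link unless it has failed,
    otherwise fall back to the one pre-designated secondary link.
    The current load on the secondary link is never consulted."""
    return primary if primary.up else secondary


ethernet = Link("ethernet", up=False)   # primary link has failed
modem = Link("modem", busy=True)        # pre-selected secondary, in use elsewhere
serial = Link("serial")                 # idle and available, but never considered

chosen = pick_heartbeat_link(primary=ethernet, secondary=modem)
print(chosen.name, "busy" if chosen.busy else "idle")
```

In this sketch the heartbeat is directed to the busy modem link even though the serial link is idle, because the fallback choice was fixed before the failure occurred.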
It is also known to simultaneously send the heartbeat signal over a plurality of system links (for example, the Ethernet, parallel and serial links). While this procedure increases the probability of an HB signal reaching its intended destination, it undesirably consumes processing resources because the same signal must be processed separately for each of the multiple links.
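The cost of sending over every link can likewise be sketched under stated assumptions. The function `broadcast_heartbeat`, the helper `make_sender`, and the payload format are hypothetical and serve only to show that the send work is repeated once per link regardless of how many copies are actually needed.

```python
from typing import Callable, Dict, List

calls: List[str] = []  # records every send attempt, to make the repeated work visible


def make_sender(name: str, ok: bool) -> Callable[[bytes], bool]:
    """Build a stand-in for a link driver (illustrative assumption).
    The returned function reports whether the link accepted the payload."""
    def send(payload: bytes) -> bool:
        calls.append(name)
        return ok
    return send


def broadcast_heartbeat(payload: bytes,
                        senders: Dict[str, Callable[[bytes], bool]]) -> List[str]:
    """Send the same heartbeat over every link and return the names of the
    links that accepted it. Each link costs one full send operation."""
    delivered = []
    for name, send in senders.items():
        if send(payload):
            delivered.append(name)
    return delivered


senders = {
    "ethernet": make_sender("ethernet", ok=False),  # simulated failed link
    "parallel": make_sender("parallel", ok=True),
    "serial": make_sender("serial", ok=True),
}
delivered = broadcast_heartbeat(b"HB node-a alive", senders)
print(delivered, len(calls))
```

Here one delivered copy would have sufficed, yet three send operations are performed for every heartbeat interval, which is the repetitive processing referred to above.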
A need thus exists for a lower level HA protocol for use in clustered computer arrangements and the like that is efficient, controllable, reliable and secure.
Accordingly, it is an object of the present invention to provide an improved protocol for use in a high availability (HA) computer system.
It is another object of the present invention to provide a computer for use in a HA computer system that implements such a protocol.
It is another object of the present invention to provide a HA protocol that affords an efficient interface between application software and underlying hardware.
It is also an object of the present invention to provide HA protocol logic that implements features such as node discovery, re-transmission of failed communications, and message routing.
These and related objects of the present invention are achieved by use of a high availability protocol computing apparatus and method as described herein.
The attainment of the foregoing and related advantages and features of the invention should be more readily apparent to those skilled in the art, after review of the following more detailed description of the invention taken together with the drawings.