In a high-availability clustered client-server environment, individual server nodes can be upgraded or patched in a rolling manner, one node at a time. Generally, such a rolling upgrade regime can be performed without a total outage; however, at least some database services (e.g., database connections) must be temporarily disconnected while their corresponding node is subjected to upgrade and patching operations.
Given a high-availability cluster and a regime where a client has at least some legacy fail-over capabilities, a client that was initially connected to a server and then becomes disconnected attempts to select a new server in the cluster whenever it detects loss, or certain impairments, of the connection.
Unfortunately, legacy techniques fail to consider a sufficiently full range of criteria when selecting a new server in the cluster. In some cases, failing over to an ill-selected node (e.g., a node that is itself about to be taken down for patching) can cascade into yet another fail-over situation, and a further ill-selected node can cascade into still another fail-over situation, and so on. Each fail-over operation takes time, during which the client can experience a brown-out or an outage, and in some cases brown-outs can affect mission-critical facilities.
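The cascading behavior described above can be illustrated with a toy simulation. The following is a minimal sketch under stated assumptions, not the disclosed technique or any vendor's fail-over logic: nodes are assumed to be patched one at a time in list order, and the names `naive_pick`, `patch_aware_pick`, and `count_failovers` are hypothetical helpers introduced only for this illustration.

```python
from typing import Callable, List, Set

# A selection policy takes (all nodes, the node being lost, already-patched nodes)
# and returns the node to fail over to.
Policy = Callable[[List[str], str, Set[str]], str]

def naive_pick(nodes: List[str], current: str, patched: Set[str]) -> str:
    """Assumed legacy-style choice: the next node in the list, ignoring patch status."""
    i = nodes.index(current)
    return nodes[(i + 1) % len(nodes)]

def patch_aware_pick(nodes: List[str], current: str, patched: Set[str]) -> str:
    """Prefer a node that has already been patched; otherwise fall back to the naive choice."""
    for n in nodes:
        if n != current and n in patched:
            return n
    return naive_pick(nodes, current, patched)

def count_failovers(nodes: List[str], start: str, pick: Policy) -> int:
    """Simulate a rolling patch in list order and count how often the client must fail over."""
    patched: Set[str] = set()
    current = start
    failovers = 0
    for node in nodes:
        if node == current:
            # The client's node is going down for patching, so it must move.
            current = pick(nodes, current, patched)
            failovers += 1
        patched.add(node)
    return failovers

if __name__ == "__main__":
    cluster = ["n1", "n2", "n3", "n4"]
    print("naive:", count_failovers(cluster, "n1", naive_pick))
    print("patch-aware:", count_failovers(cluster, "n1", patch_aware_pick))
```

In this toy model the naive policy repeatedly moves the client onto the node that is next in line for patching, so the client fails over once per node in the cluster, whereas a policy that considers patch status settles after at most two moves.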
What is needed are techniques that account for dynamically-changing cluster conditions during performance of rolling patch installations, and respond by intelligently selecting fail-over nodes during the rolling patch installation in order to minimize down-time or brown-out time.
Moreover, none of the aforementioned technologies have the capabilities to perform the herein-disclosed techniques for intelligently selecting fail-over nodes during a rolling patch installation. Therefore, there is a need for an improved approach.