Where a service must be provided without interruption, or with a minimum number of interruptions, server software upgrades are especially delicate, since they may make the service unavailable.
To ensure a certain degree of fault tolerance in such cases, the commonly used solution is to replicate the servers. When an upgrade is performed, the server or servers involved are made temporarily unreachable, typically by removing them from the databases through which these servers are located (for example, by means of a Domain Name System (DNS) mechanism), and the load is transferred to the servers that remain active.
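By way of illustration only, the replicated-server approach described above can be sketched as follows. The names ServerRegistry and upgrade_one_at_a_time are hypothetical and do not come from any particular product; the registry is an in-memory stand-in for the lookup database (for example a DNS zone), and the round-robin choice in resolve is an assumption made to keep the sketch short.

```python
class ServerRegistry:
    """Hypothetical in-memory stand-in for the database used to locate servers."""

    def __init__(self, servers):
        self._active = list(servers)
        self._cursor = 0

    def active(self):
        # Snapshot of the servers that clients can currently locate.
        return tuple(self._active)

    def remove(self, server):
        # Make the server unreachable for new lookups.
        self._active.remove(server)

    def add(self, server):
        # Make the server reachable again.
        if server not in self._active:
            self._active.append(server)

    def resolve(self):
        # Assumed policy: hand out the active servers in round-robin order.
        if not self._active:
            raise RuntimeError("no reachable servers")
        server = self._active[self._cursor % len(self._active)]
        self._cursor += 1
        return server


def upgrade_one_at_a_time(registry, servers, upgrade):
    """Upgrade each server while it is withdrawn from the registry."""
    for server in servers:
        registry.remove(server)      # load shifts to the remaining servers
        try:
            upgrade(server)          # caller-supplied upgrade procedure
        finally:
            registry.add(server)     # restore reachability even on failure
```

While a server is withdrawn, resolve returns only the remaining servers, which is how the load is transferred. The sketch deliberately ignores transactions in progress and server lists cached by clients.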
The main disadvantages of this solution are as follows:
It is not always easy to determine when a server can be deactivated: it is necessary to ensure that there are no transactions in progress, and that any local copies of the server lists held by the units acting as clients contain no references to the server in question;
A directly related problem is that administrative operations are always fairly slow, so that in particularly urgent cases (e.g. when potential security problems must be dealt with) the only possibility is to interrupt service temporarily;
In addition, if the state maintained by the process lasts longer than a transaction and (as is frequently the case) the end of its valid lifetime cannot be readily identified, the state may be lost, causing a temporary malfunction; and
Finally, if (as occurs in most cases) traffic trends are difficult to predict and a traffic peak takes place during the upgrade, the servers that remain active may have difficulty handling their workload. The only way to prevent this problem is to deploy at least one machine in addition to those that would be sufficient under normal circumstances, which entails an increase in system redundancy that would not otherwise be strictly necessary.
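Two of the conditions listed above lend themselves to a short sketch: the test for whether a server can be deactivated, and the additional machine needed to absorb a traffic peak during an upgrade. All names below (ServerStatus, safe_to_deactivate, servers_to_deploy) are hypothetical, and the sketch assumes that the number of transactions in progress and the number of clients still holding a cached reference can actually be observed, which, as noted above, is often not the case in practice. State that outlives a transaction is not modelled.

```python
import math
from dataclasses import dataclass


@dataclass
class ServerStatus:
    # Both fields are assumed to be observable; in practice they may not be.
    in_flight_transactions: int
    clients_with_cached_reference: int


def safe_to_deactivate(status: ServerStatus) -> bool:
    """True only when no transaction is in progress and no client-side
    copy of the server list still refers to this server."""
    return (
        status.in_flight_transactions == 0
        and status.clients_with_cached_reference == 0
    )


def servers_to_deploy(peak_load: float, capacity_per_server: float,
                      servers_in_upgrade: int = 1) -> int:
    """Servers needed to carry the peak load while some are being upgraded."""
    # Servers sufficient under normal circumstances.
    needed = math.ceil(peak_load / capacity_per_server)
    # Extra machines that stand in for those withdrawn during the upgrade.
    return needed + servers_in_upgrade
```

For example, with a peak load of 900 requests per second and 300 requests per second per server, three servers suffice under normal circumstances, and servers_to_deploy(900, 300) returns 4, the fourth machine being the extra redundancy referred to above.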