Distributed computer systems typically include a number of separate hardware and software nodes connected by a network. Each node runs software to process various operations as required by the users of the system. Usually, a number of nodes of a distributed computer system execute user requests in parallel. Such an architecture gives distributed computer systems several advantages over standalone computer systems. One of the advantages is continuity of operations, or resilience. If one of the nodes fails, the user requests are handled by the remaining nodes of the distributed computer system. Another advantage is scalability. The number of nodes of a distributed computer system can be easily increased or decreased as required by the operational load of the system in different periods.
The resilience and scalability of distributed computer systems make them very popular for providing various enterprise services. Distributed computer systems are also used for running mission-critical business applications. In recent years, enterprise services, and online computer services in general, have become an area of high competition. Accordingly, the requirements for the operability of computer systems are very strict, especially with respect to continuity of operations.
Like any other computer system, a distributed computer system leaves operational mode while software upgrades are installed or applied. During its lifecycle, a distributed computer system is regularly upgraded for multiple reasons, e.g., discovered bugs, inefficient processing, statutory changes, etc. The downtime caused by the installation of software harms user satisfaction, especially for mission-critical enterprise applications. On the other hand, prolonging the periods between software upgrades could raise issues with the functionality of a mission-critical system.
The increasing complexity of computer systems requires shorter periods between upgrades. At the same time, competition and growing user demands require minimizing downtime periods. However, there is still no robust and universal solution that allows installation of software upgrades on distributed systems with no downtime.