This invention relates generally to the field of computer systems. More particularly, a system and method are provided for facilitating a rolling upgrade of distributed software, with automatic completion of the upgrade.
Historically, upgrades of distributed software (software executed simultaneously on multiple computer nodes) have been performed in an all-or-none manner. In other words, either no nodes are upgraded, or else all are taken out of operation, upgraded and then brought back into operation. This is typically due to the inability of the software to function with multiple different versions in operation at one time. Thus, the nodes either all run the old version of the software or all run the newer, upgraded version.
This all-or-none approach can be unacceptable in many distributed systems. For example, when some distributed software (e.g., web services, application services, a database) is run only on a specified set of nodes, the software becomes unavailable when all of those nodes are down. Some organizations or enterprises cannot tolerate such unavailability. Moreover, the more nodes there are to be upgraded, the longer the software is unavailable.
Even if the software is available when fewer than all nodes are operational, it is unlikely to be operational with multiple versions simultaneously in execution on different nodes. Thus, there will still be an extended period of decreased availability, as one node at a time is brought back into operation with the newer version.