Modern enterprise-grade and consumer-grade applications (distributed applications) may be hosted on several servers (for example, virtual machines (VMs) and/or containers), such as web servers, user interface (UI) servers, business logic servers, database servers, file servers, and so on. For example, some of the servers could be hosting databases, some could be hosting web servers, while others could be hosting various components of the business logic. Generally, these are distinct servers dedicated to performing specific tasks and may not be amenable to having software upgrades performed in a single operation. Such servers may need to be upgraded in stages, i.e., on a server-by-server basis.
In the above scenario, upgrading software applications to a new version can involve two kinds of change: pushing binaries, and changing the configuration on the systems and/or migrating or otherwise altering data so that the data can be used by the upgraded software applications. Generally, pushing the binaries can be safe because the upgrade can be undone, the old binaries can be recovered, and the data may not be corrupted. However, the new binaries may sometimes have a software bug that corrupts the data stored by the software application in various databases, files, and stores. Further, if the upgraded software application has a bug, changing the configuration and/or migrating the data may lead to data corruption, which can be risky during operation and may not be recoverable. Furthermore, in such a scenario, an upgrade failure in any one of the servers may result in downgrading other servers that have already been successfully upgraded, so that the software application can be brought back online while the failed upgrade is debugged and resolved. This may result in bringing a new server to the “old” state of the failed server so that the upgrade can be re-attempted, and the process may have to be repeated until the upgrade is successful.
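The staged upgrade and the downgrade on failure described above can be illustrated with a minimal sketch. The function name `staged_upgrade` and the `upgrade` and `downgrade` callables are hypothetical placeholders for whatever mechanism a given deployment uses; they are not taken from any particular product. The sketch only undoes servers that were already upgraded and does not attempt to repair the server on which the upgrade failed.

```python
from typing import Callable, List


def staged_upgrade(
    servers: List[str],
    upgrade: Callable[[str], None],
    downgrade: Callable[[str], None],
) -> List[str]:
    """Upgrade servers one at a time; on failure, downgrade those already upgraded.

    Returns the servers left on the new version (an empty list after a rollback).
    `upgrade` and `downgrade` are hypothetical, caller-supplied operations.
    """
    upgraded: List[str] = []
    for server in servers:
        try:
            upgrade(server)
        except Exception:
            # Undo in reverse order so the application can come back online
            # on the old version while the failure is investigated.
            for done in reversed(upgraded):
                downgrade(done)
            return []
        upgraded.append(server)
    return upgraded
```

Even in this simplified form, the rollback is only as reliable as the `downgrade` operation itself, which is where data or configuration changes that cannot be undone become a problem.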
Existing solutions require taking manual snapshots of the application VMs (APPVMs) and containers where the application is being upgraded. This may require the IT administrator to determine all the APPVMs and/or containers that are likely to be impacted during the upgrade. Further, depending on the virtualization solution being used, appropriate command-line interfaces (CLIs) or application program interfaces (APIs) may need to be called on the hypervisor to trigger a snapshot. Such a manual process can be prone to errors. For example, the IT administrator may fail to take a snapshot of an impacted APPVM and/or container because of missed or unknown dependencies, as there can be several hundred or several thousand APPVMs and/or containers in a virtual datacenter. Furthermore, the IT administrator may need to write complex logic to determine APPVM and/or container dependencies and, in addition, code to take the snapshots needed for the upgrade. The written code can itself have bugs that result in errors in the dependencies and snapshots. Finally, to write the code for taking all the needed snapshots, the IT administrator may need knowledge of the hypervisor and familiarity with its snapshotting mechanism.
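To give a sense of the kind of logic an IT administrator might have to write by hand, the following is a minimal sketch under stated assumptions. The dependency map `depends_on`, the functions `impacted_workloads` and `snapshot_all`, and the `take_snapshot` callable are all hypothetical; in practice `take_snapshot` would wrap whatever CLI or API the hypervisor in use exposes, and the dependency map would itself have to be discovered and kept up to date.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Set


def impacted_workloads(targets: Iterable[str], depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return the targets plus every workload reachable through the dependency map.

    `depends_on` is a hypothetical map from a workload name to the workloads
    whose data or configuration an upgrade of it may touch.
    """
    seen: Set[str] = set(targets)
    queue = deque(seen)
    while queue:
        current = queue.popleft()
        for dep in depends_on.get(current, []):
            if dep not in seen:  # the seen set also guards against cycles
                seen.add(dep)
                queue.append(dep)
    return seen


def snapshot_all(workloads: Iterable[str], take_snapshot: Callable[[str], str]) -> Dict[str, str]:
    """Snapshot each workload and return a map from workload name to snapshot identifier.

    `take_snapshot` is a hypothetical wrapper around the hypervisor's snapshot call.
    """
    return {name: take_snapshot(name) for name in sorted(workloads)}
```

A dependency missing from the map is silently skipped by such code, which is the failure mode described above.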