Networked computing resources are frequently configured so that processing is distributed across the resources, such as with a large set of nodes configured in a cluster environment or grid environment. As a result, related application deployment and life-cycle management activities have become more complex, cumbersome and error-prone. For example, configuring a large network of nodes to execute instances of a database server that all share a common database is typically a manual process.
With one prior approach, this manual deployment process generally involved several steps, as follows. First, an operation for an application (e.g., installing, upgrading, patching, uninstalling) is performed on a first node of the network, from the first node. Then, it is verified whether or not the operation was performed successfully on the first node. The foregoing steps are then performed for each other node of the network, in sequence, from the respective node. Finally, the nodes in the network are configured so that each node is aware of each other node (in the case of an initial network provisioning process), e.g., each node is informed that it is part of a cluster, each node is informed of each other peer node, each node is informed that it should treat each other node in the network equally, and the like.
Advances have been made relative to the manual approach, by enhancing the deployment process to include some automation of the process. One prior approach involves the following deployment process. First, the network of nodes is configured so that each node is aware of each other node. Then, one of several phases of the operation is performed on any node in the network, from that node, e.g., files are copied to the node. Next, automatic propagation of that phase of the operation to each of the other network nodes is triggered from the first node. This automatic propagation includes, for example, determining a list of files that need to be copied to each of the other nodes in the network, and automatically copying those files to the other nodes. The steps of performing a next phase of the operation and automatically propagating that phase to the other network nodes, for each of the other phases of the operation, are repeated. For example, subsequent phases of the operation may involve moving the copied files to appropriate directories, updating registries, maintaining an inventory of applications for each respective node, and the like.
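By way of illustration only, the phase-by-phase propagation described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular prior system: the phase names, the function `run_phased_operation`, and the `apply_phase` callback are hypothetical names introduced here, and the real work of copying files or updating registries is left to the callback.

```python
from typing import Callable, List, Tuple

# Hypothetical phase names, loosely following the examples in the text.
PHASES: List[str] = [
    "copy_files",
    "move_files",
    "update_registry",
    "update_inventory",
]


def run_phased_operation(
    nodes: List[str],
    first_node: str,
    apply_phase: Callable[[str, str], None],
) -> List[Tuple[str, str]]:
    """Run each phase on first_node, then propagate it to every other node.

    Returns the ordered log of (phase, node) pairs that were applied.
    """
    if first_node not in nodes:
        raise ValueError("first_node must be a member of the network")

    log: List[Tuple[str, str]] = []
    for phase in PHASES:
        # Perform the phase on the initiating node first.
        apply_phase(phase, first_node)
        log.append((phase, first_node))

        # Propagate the same phase to each peer, triggered from first_node.
        for node in nodes:
            if node != first_node:
                apply_phase(phase, node)
                log.append((phase, node))
    return log
```

In this sketch a phase completes on every node before the next phase begins on any node, which is the ordering the described approach implies.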
The foregoing automated approach to the deployment process eliminated previously required manual steps; however, this automated approach has room for improvement. For example, if a failure occurs anywhere in the process (i.e., at the first node or at any of the other nodes on which any phase of the operation is performed), then the completed portion of the process needs to be undone and the entire process started over once the reason for the error is diagnosed and remedied. Hence, this prior automated approach to the deployment process is not very resilient or fault tolerant.
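The all-or-nothing failure behavior described above can be illustrated with a short sketch. The function `deploy_all_or_nothing` and its `do` and `undo` callbacks are hypothetical names assumed for this example; the sketch shows only that a single failure causes every completed step to be undone, after which the whole process would have to be restarted.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, str]  # (phase, node); both values are illustrative


def deploy_all_or_nothing(
    steps: List[Step],
    do: Callable[[str, str], None],
    undo: Callable[[str, str], None],
) -> List[Step]:
    """Apply steps in order; on any failure, undo all completed steps.

    Completed steps are undone in reverse order and the original
    exception is re-raised, so the caller must start over from scratch.
    """
    completed: List[Step] = []
    try:
        for phase, node in steps:
            do(phase, node)
            completed.append((phase, node))
    except Exception:
        # A failure anywhere discards all of the work done so far.
        for phase, node in reversed(completed):
            undo(phase, node)
        raise
    return completed
```

Note that the cost of a failure grows with how late it occurs: a fault on the last node in the last phase discards the work done on every other node.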
Additionally, this prior automated approach is inflexible. For example, once the deployment process is initiated, all of the tasks associated with the process are automatically performed (or at least attempted) without any management or control afforded to the administrator, other than aborting the entire process and starting over. Furthermore, once a network of nodes has been configured as a cluster or grid, the deployment process is automatically performed on every node in the network. For example, when updating an application on the nodes in the network, it is necessary to disable all the nodes, update the application on all the nodes, and then enable all the nodes. This results in the entire system being disabled for a significant period of time while the application is updated on all the nodes and, consequently, the system is unable to service requests during that time period.
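The availability consequence of the disable-all, update-all, enable-all sequence can be made concrete with a small sketch. The function `update_whole_cluster` and its `update` callback are hypothetical and assumed only for illustration; the function records how many nodes remain enabled while each node is being updated.

```python
from typing import Callable, Dict, List


def update_whole_cluster(
    nodes: List[str],
    update: Callable[[str], None],
) -> List[int]:
    """Disable every node, update every node, then enable every node.

    Returns the number of enabled nodes observed after each update,
    which shows how much capacity is left to service requests.
    """
    enabled: Dict[str, bool] = {node: True for node in nodes}

    # Step 1: disable all nodes before any update begins.
    for node in nodes:
        enabled[node] = False

    # Step 2: update each node while the whole network is disabled.
    available_during_update: List[int] = []
    for node in nodes:
        update(node)
        available_during_update.append(sum(enabled.values()))

    # Step 3: enable all nodes only after every update has finished.
    for node in nodes:
        enabled[node] = True

    return available_during_update
```

Under these assumptions, every recorded value is zero regardless of the number of nodes, which corresponds to the period during which the system cannot service requests.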
Based on the foregoing, there is room for improvement in processes for provisioning, including deploying software on, a managed network of nodes operating as peers.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.