In a large enterprise, such as a corporation, computing resources are interconnected by a network of computer systems that are owned by the enterprise and that fulfill the enterprise's various computing needs. This network can span diverse geographical locations. Internal users (employees) as well as external clients (customers) of the enterprise need the enterprise's computing resources to be highly available and yet also highly secure. In practice, these two requirements sometimes work against each other. For example, to keep the computing resources secure, the computing resources have to be brought down often to install security patches. On the other hand, to keep the computing resources continuously available, the computing resources should be brought down only rarely. FIG. 1 illustrates these and other problems in greater detail.
An enterprise network 100 includes a client 102, which is a computer through which a user accesses shared computing resources interconnected via a network 104. These computing resources of the enterprise network 100 are provided by one or more servers, such as a server A 106, on which an on-line service is running. To make the on-line service executing on the server A 106 more secure, the enterprise often requires, as a matter of policy, that an administrator 110 patch the on-line service with security fixes for reported or discovered vulnerabilities. Such a patch is first installed on and applied to a copy of the on-line service (the updated service) running on a server B 108, which acts as a test machine. The administrator 110 verifies and validates the updated service in accordance with the computing policies of the enterprise to make sure that, upon deployment, the updated service is unlikely to cause problems. The server B 108 on which the updated service is tested is physically a different machine from the server A 106 on which the on-line service is providing services to the client 102. After the updated service has passed testing on the server B 108, the administrator 110 deploys the patch by bringing down the server A 106 for some period of time, during which the on-line service is not available to the client 102. The patch is then applied to the on-line service, after which the server A 106 is brought back up to provide services to the client 102 again.
The problem with bringing down the server A 106 to install the patch is that the service context is lost at the time the server A 106 is brought down, and the server A 106 remains inactive while the patch is installed. The service context is a state in which the client 102 has provided requests or some information to the on-line service running on the server A 106. The client 102 expects that the on-line service will service the request or perform some computation in connection with the provided information. When the on-line service is brought down, such service context is destroyed. When the server A 106 is brought back up again with the patched on-line service, it is unlikely that the server A 106 can remember what the client 102 previously provided, because many services do not persistently store the service context, or because the delay caused by the reboot of the server A 106 is too long to be acceptable to the client 102. For example, the client 102 may have sent a search query to the on-line service just before the server A 106 was brought down. When the on-line service is active again with the patch, the on-line service provides no response to the prior query, which confuses the client 102. One solution, albeit an expensive one, is to run the on-line service on a cluster-based server platform with redundancy built in by adding processing capacity to mirror the server A 106, but this raises not only the costs of procuring equipment but also the costs of operating the equipment.
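The loss of service context described above can be illustrated with a minimal sketch. It assumes a service that holds pending client requests only in memory; the names `OnlineService`, `submit`, `respond`, and `patch_and_restart` are hypothetical and chosen for illustration only, and the sketch does not represent any particular on-line service.

```python
class OnlineService:
    """Hypothetical on-line service that keeps its service context in memory only."""

    def __init__(self):
        # Pending client requests form the service context; nothing is persisted.
        self.pending = {}

    def submit(self, client_id, query):
        # The client provides a request and expects a later response.
        self.pending[client_id] = query

    def respond(self, client_id):
        # Answer the pending request, or return None if no context is known.
        query = self.pending.pop(client_id, None)
        return None if query is None else f"results for {query!r}"


def patch_and_restart(service):
    """Model bringing the server down to patch it: the old instance is
    discarded and a fresh instance starts with an empty service context."""
    return OnlineService()


service = OnlineService()
service.submit("client-102", "search query")  # query sent just before shutdown
service = patch_and_restart(service)          # server brought down and back up
answer = service.respond("client-102")        # None: the prior query was lost
```

In this sketch the restarted service has no record of the earlier query, which corresponds to the client 102 receiving no response after the server A 106 is brought back up.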
Without a resolution to the problem of satisfying the growing requirements for highly secure software while keeping that software highly available to users, users may eventually no longer trust the enterprise network 100 to provide a desired computing experience, causing demand for the enterprise network 100 to diminish in the marketplace. Thus, there is a need for a system, method, and computer-readable medium for dynamically updating software while avoiding or reducing the foregoing and other problems associated with existing systems.