Computer networks are typically deployed with large resource pools in order to provide services to end-users. A resource pool is a pool of identical information technology (IT) hardware and other resources (“resources”) comprising, for example, application servers, database servers, load balancers and processors communicatively coupled to the network.
A resource pool can be divided into a production pool and a spare pool. A production pool is the pool of resources operating in the network. A spare pool is the pool of resources that are on stand-by for swapping into the production pool upon failure of a production pool resource, or for increasing or decreasing the size of the production pool. The spare pool may include unhealthy resources that have been taken from the production pool and are awaiting repair, as well as healthy resources that are available to be configured into the production pool.
In addition to the resource pools, computer networks are also provided with a resource pool sparing-plan and a support-plan as part of the overall strategy to support end-users. A sparing-plan is a plan that specifies the number of spare resources in the spare pool. A support-plan is a contract with a support provider that sets forth the type and frequency of support to be provided for resources in the production pool and the spare pool. For example, a “2 spares/six hour call-to-repair” support-plan specifies a spare pool of two devices and requires that any needed repair and recovery of a resource pool device be completed within six hours of receiving a call for repair.
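By way of illustration only, the “2 spares/six hour call-to-repair” example can be captured as a small record. The sketch below is not part of any described system: the class name `SparingSupportPlan`, its field names, and the derived `min_repair_rate_per_hour` property are hypothetical, and the derived rate rests on the assumption that a repair completed within the call-to-repair window implies a repair rate of at least one repair per window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SparingSupportPlan:
    """Hypothetical record pairing a sparing-plan with a support-plan."""

    spares: int                  # number of resources kept in the spare pool
    call_to_repair_hours: float  # contracted time to complete a repair

    @property
    def min_repair_rate_per_hour(self) -> float:
        # Assumption: every repair finishes within the contracted window,
        # so each unhealthy resource is repaired at a rate of at least
        # 1 / call_to_repair_hours repairs per hour.
        return 1.0 / self.call_to_repair_hours


# The example plan from the text: 2 spares, six hour call-to-repair.
plan = SparingSupportPlan(spares=2, call_to_repair_hours=6.0)
```

A record of this kind gives each candidate plan a uniform shape, so that alternative plans can be compared on the same quantities.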
In implementing a strategy to support end-users, an important requirement is that end-users be provided with a choice of sparing and support-plans from which they may select the combination that best suits their needs. End-users also require that the selected plans provide a predictable probability that all the resources in the production pool are operating properly, that the support be timely, and that the cost of the plan be acceptable. Thus, with the proper choice of sparing and support-plans, end-users can anticipate with a reliable probability that all the resources in the production pool will be operating properly, at an acceptable cost.
In the prior art, support organizations responsible for developing the support plan typically relied on a combination of quantitative and subjective factors in making their decisions. For example, to establish the spare pool size and support plans, the support organization relied on “gut feel” and “educated guesses” as well as on quantitative factors such as resource failure rates, resource recovery rates and expected end-user demand.
While the prior art approach was successful to some extent in providing support plans, a problem with this approach was that, because the plans were based in part on subjective factors, the plans could not provide a reliable estimate of the probability that all resources in the production pool are operating properly.
Another problem was that, because the prior art plans were dependent on subjective factors, they were not consistent in specifying an optimum sparing and support plan for the same resources in the network.
A further problem was that, because the prior art plans were not fully automated, it was difficult and tedious to generate a choice of alternative plans from which the end-user may choose.
With the increasing complexity of networks and the increasing cost of providing resource pools and support-plans, it is becoming apparent that there is a need for a better way to choose the right mix of sparing and support-plans such that the networks are operated with a predictable level of reliability, at an acceptable cost.
Further, with the availability of a Utility Data Center (UDC) as discussed herein, where rapid switching of resources between the spare pools and production pools is possible, there is a desire for the sparing and support planning process to analytically incorporate the switching or reprovisioning rate together with spare resources, failure rates, and recovery rates. For example, it is desirable to consider analytically the novel concept of developing a sparing and support plan based on the simultaneous relationship between a sparing-plan that allows for the rapid switching of resources between the spare pool and the production pool, and a support plan under which repairs are performed on unhealthy devices in the spare pool that have been reprovisioned out of the production pool.
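To make the idea of an analytic relationship concrete, the following is a minimal sketch of one way the probability that all production resources are operating could be estimated from the number of spares, a failure rate and a recovery rate. It is an illustrative assumption, not the method of the present description: it uses a textbook birth-death (Markov) model, treats reprovisioning between the spare pool and the production pool as instantaneous, assumes that only resources in production fail, and assumes that every unhealthy resource is repaired independently at the same rate. The function name and parameters are hypothetical.

```python
from math import fsum


def prob_all_production_up(n_production, n_spares, failure_rate, repair_rate):
    """Steady-state probability that every production slot holds a healthy resource.

    Illustrative birth-death model (an assumption, see lead-in).
    State k is the number of unhealthy resources, 0 .. n_production + n_spares.
    Rates are per resource, in the same time unit (for example, per hour).
    """
    total = n_production + n_spares
    weights = [1.0]  # unnormalized steady-state weight of state 0
    for k in range(total):
        healthy = total - k
        in_service = min(n_production, healthy)  # only in-service resources fail
        birth = in_service * failure_rate        # transition k -> k + 1
        death = (k + 1) * repair_rate            # transition k + 1 -> k
        weights.append(weights[-1] * birth / death)
    # Production is fully staffed while unhealthy resources do not exceed spares.
    return fsum(weights[: n_spares + 1]) / fsum(weights)


# Hypothetical inputs: 10 production resources, one failure per 1000 hours
# per resource, and a six hour repair time (repair rate of 1/6 per hour).
p_no_spares = prob_all_production_up(10, 0, 0.001, 1.0 / 6.0)
p_two_spares = prob_all_production_up(10, 2, 0.001, 1.0 / 6.0)
```

Under these assumptions, adding spares or shortening the repair time raises the computed probability, which is the kind of trade-off a sparing and support plan would weigh against cost. A finite reprovisioning rate would require additional states and is not modeled here.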
Accordingly, in view of the shortcomings of the prior art, it is an objective of the present invention to provide a better way to develop sparing and support-plans such that the network will operate with a predictable degree of reliability that all the resources in the production pool are operating properly, at an acceptable cost. Also, in view of the desire to leverage automated planning tools in managing networks, such as those available in a Utility Data Center, it is an objective to provide an alternative approach that will utilize these tools.
These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art on reading the following detailed description of preferred embodiments in conjunction with the various Figures.