1. Technical Field
The present invention relates generally to an improved data processing system and in particular to a method and apparatus for processing data. Still more particularly, the invention relates to a method, apparatus, and computer program product for maintaining service reliability in a data center using a service level objective provisioning mechanism.
2. Description of Related Art
Modern data centers may contain hundreds if not thousands of resources, such as servers, client computers, software components, printers, routers, and other forms of hardware and software. To reduce cost and operating overhead, a data center operator will generally maintain close to the minimum number of resources needed to operate the data center to the degree desired by the operator. Thus, problems may arise when even one resource fails. For example, the data center may fail to provide service to one or more users or may provide service more slowly.
To solve this problem, a pool of spare resources is maintained. The data center operator may maintain the pool, or a third party vendor may provide access to a set of resources on a contract basis. In the latter case, the contract is often referred to as a service level agreement. A situation in which one or more resources fail, perform poorly, or are overloaded may be referred to as a breach. When a breach occurs, spare resources are activated, configured, and assigned to the data center as needed.
A problem with this approach is that while the spare resource or resources are being activated and configured, the data center suffers degraded performance or may even be down. Thus, more efficient methods for managing spare resources are desirable.
Because the data center may be very large or complex, automated systems have been designed to monitor the data center and scan for breaches. For example, monitoring agents may be installed on resources in the data center. The monitoring agents periodically collect performance data, such as resource utilization or resource failure status, and send the performance data to a data center automation system. An example of a data center automation system is Tivoli Intelligent Orchestrator®, provided by International Business Machines Corporation™. The data center automation system analyzes the performance data for each resource in the data center. The system then aggregates the data and compares the aggregated data against the performance objectives specified in the service level agreement to make recommendations regarding balancing resources in the data center.
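The monitoring flow described above may be illustrated by the following minimal sketch. The sketch is illustrative only and does not represent the interface of any particular data center automation system. The names Sample and find_breaches, the max_cpu threshold, and the use of mean CPU utilization as the aggregated measure are assumptions introduced for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """One report sent by a hypothetical monitoring agent."""
    resource: str           # name of the monitored resource
    cpu_utilization: float  # fraction of CPU in use, 0.0 to 1.0
    failed: bool            # True if the agent reported a failure


def find_breaches(samples, max_cpu):
    """Return the sorted names of resources considered to be in breach.

    A resource is in breach if any sample reports a failure, or if its
    mean CPU utilization exceeds max_cpu (an assumed objective).
    """
    # Group the samples by the resource that produced them.
    by_resource = {}
    for sample in samples:
        by_resource.setdefault(sample.resource, []).append(sample)

    breaches = []
    for name, group in sorted(by_resource.items()):
        any_failure = any(s.failed for s in group)
        overloaded = mean(s.cpu_utilization for s in group) > max_cpu
        if any_failure or overloaded:
            breaches.append(name)
    return breaches


samples = [
    Sample("web1", 0.50, False),
    Sample("web1", 0.70, False),
    Sample("web2", 0.90, False),
    Sample("web2", 0.95, False),
    Sample("db1", 0.10, True),
]
print(find_breaches(samples, max_cpu=0.8))
```

In this sketch, a resource is flagged either because an agent reported a failure or because its aggregated utilization exceeds the assumed objective, which corresponds to the failure and overload cases of a breach.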
However, prior methods for managing a data center may fail if a server or other critical resource in the data center is down. In this case, it may not be possible to use performance data to measure the reliability of a cluster in the data center. For example, suppose a data center has two servers serving an application. The first server is the main server and the second server is a backup server. When the main server is down, the backup server replaces the main server.
For purposes of this example, CPU (central processing unit) utilization is the primary measure of reliability in the data center. In this case, CPU utilization is about the same after the backup server takes over, because the backup and main servers usually have about the same capabilities. Because the measured performance data does not change, the data center automation system may not evaluate the risk associated with not having a second backup server available in case the first backup server fails.
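The two-server example above may be illustrated by the following minimal sketch, which shows that a utilization-based measure can remain unchanged across a failover while the number of available spares drops to zero. The names Server, cluster_cpu, and spare_count, and the utilization value of 0.60, are assumptions introduced for this illustration only.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    up: bool                # True if the server is running
    active: bool            # True if the server is serving the application
    cpu_utilization: float  # fraction of CPU in use, 0.0 to 1.0


def cluster_cpu(servers):
    """Mean CPU utilization over the servers currently serving the application."""
    serving = [s for s in servers if s.up and s.active]
    if not serving:
        return 0.0
    return sum(s.cpu_utilization for s in serving) / len(serving)


def spare_count(servers):
    """Number of running servers that are idle and could take over."""
    return sum(1 for s in servers if s.up and not s.active)


# Before the failure: the main server serves the application, the backup is idle.
before = [
    Server("main", up=True, active=True, cpu_utilization=0.60),
    Server("backup", up=True, active=False, cpu_utilization=0.0),
]

# After the failure: the main server is down and the backup has taken over.
after = [
    Server("main", up=False, active=False, cpu_utilization=0.0),
    Server("backup", up=True, active=True, cpu_utilization=0.60),
]

print(cluster_cpu(before), spare_count(before))
print(cluster_cpu(after), spare_count(after))
```

Under these assumptions, the utilization measure is identical before and after the failover, so a check based only on utilization would report no change, whereas the spare count shows that no backup remains.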
In addition, making automatic decisions for provisioning resources between multiple applications in a data center can be difficult when different disciplines, such as performance, availability, and fault management, are monitored and when a variety of monitoring systems are used. The complexity of the data center and of a monitoring scheme can make provisioning resources a difficult task. Accordingly, it would be advantageous to have an improved method, apparatus, and computer instructions for automatically maintaining service reliability in a data center even when detecting a risk of breach is difficult.