Computer systems are frequently used in enterprise environments to provide services to customers. Example services include providing telecommunication networks, Internet-based shopping services, storage services, and so on.
Whatever the service concerned, service providers are generally under an obligation to ensure that an agreed quality of service is provided to their customers throughout the period for which the service is to be provided.
Service level agreements (SLAs) are generally used between service providers and their customers to define an agreed quality of service for a particular service over a predetermined period of time, referred to hereinafter as the evaluation period. An SLA typically defines multiple service level objectives (SLOs), all of which must be met in order for the SLA to be considered complied with. An SLO typically defines an SLO quality objective, such as a maximum acceptable server response time, a maximum acceptable transaction processing time, a minimum acceptable network bandwidth, etc. An SLO may also have a required SLO compliance level (often shortened to simply SLO compliance), which indicates the percentage of the evaluation period for which the SLO quality objective must be met. So, an SLO may define, for example, that a maximum acceptable server response time is 10 ms (SLO quality objective) and that this must be complied with for 97% of the evaluation period (SLO compliance level).
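By way of illustration only, the relationship between an SLO quality objective and an SLO compliance level may be sketched as follows. The sketch assumes that response times are sampled at equal intervals, so that the proportion of compliant samples approximates the proportion of the evaluation period; the names `ResponseTimeSlo`, `achieved_compliance` and `slo_met` are hypothetical and do not correspond to any particular product.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ResponseTimeSlo:
    """Hypothetical SLO record; field names are illustrative only."""
    max_response_ms: float      # SLO quality objective
    required_compliance: float  # SLO compliance level, as a percentage


def achieved_compliance(samples_ms: Sequence[float], slo: ResponseTimeSlo) -> float:
    """Return the percentage of samples that meet the quality objective.

    Assumes equally spaced samples, so this percentage stands in for the
    percentage of the evaluation period during which the objective was met.
    """
    if not samples_ms:
        raise ValueError("no samples recorded for the evaluation period")
    met = sum(1 for s in samples_ms if s <= slo.max_response_ms)
    return 100.0 * met / len(samples_ms)


def slo_met(samples_ms: Sequence[float], slo: ResponseTimeSlo) -> bool:
    """True if the achieved compliance reaches the required compliance level."""
    return achieved_compliance(samples_ms, slo) >= slo.required_compliance
```

Using the figures from the example above, `ResponseTimeSlo(max_response_ms=10.0, required_compliance=97.0)` would be considered met if at least 97 out of every 100 equally spaced samples were at or below 10 ms.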
SLOs may be hierarchically arranged. For example, an SLO compliance level defining that a server must be available for at least 98% of the evaluation period may be dependent on lower-level SLO quality objectives defining the minimum amount of free memory, the maximum CPU load, etc.
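A hierarchical arrangement of this kind may, purely as an example, be sketched as follows. The sketch assumes that a server is treated as available at a given sample only when every lower-level objective is met at that sample; this rule, the `Sample` fields and the function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sample:
    """One periodic measurement of a server (hypothetical fields)."""
    free_memory_mb: float
    cpu_load_pct: float


def server_available(sample: Sample, min_free_mb: float, max_cpu_pct: float) -> bool:
    """Assumed rule: available only if both lower-level objectives are met."""
    return sample.free_memory_mb >= min_free_mb and sample.cpu_load_pct <= max_cpu_pct


def availability_pct(samples: Sequence[Sample], min_free_mb: float, max_cpu_pct: float) -> float:
    """Percentage of equally spaced samples at which the server was available."""
    if not samples:
        raise ValueError("no samples recorded for the evaluation period")
    up = sum(1 for s in samples if server_available(s, min_free_mb, max_cpu_pct))
    return 100.0 * up / len(samples)
```

The higher-level availability SLO would then be met if `availability_pct(...)` is at least 98 for the evaluation period.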
To determine, at the end of an evaluation period, whether an SLA has been complied with, it is necessary to monitor the relevant components or configuration items (CIs) of the IT infrastructure, as defined by the various SLOs, and to periodically record performance data during the evaluation period. Solutions such as Hewlett-Packard's Service Level Manager product, part of the Hewlett-Packard OpenView suite of applications, enable performance data to be collected from components of an IT infrastructure.
The recorded performance data can then be analysed to determine, at the end of the evaluation period, whether the SLA was complied with.
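The end-of-period analysis may be sketched, under stated assumptions, as a check that every SLO reached its required compliance level. The function name `evaluate_sla` and the shape of its input (a mapping from SLO name to achieved and required percentages) are hypothetical and are not taken from any particular monitoring product.

```python
from typing import Dict, List, Tuple


def evaluate_sla(results: Dict[str, Tuple[float, float]]) -> Tuple[bool, List[str]]:
    """Decide SLA compliance from per-SLO results.

    `results` maps an SLO name to (achieved compliance %, required compliance %).
    Returns whether the SLA was complied with, plus the names of any SLOs
    that fell short, sorted alphabetically.
    """
    failed = sorted(
        name
        for name, (achieved, required) in results.items()
        if achieved < required
    )
    return (not failed, failed)
```

For instance, an SLA whose response-time SLO achieved 97.5% against a required 97%, but whose availability SLO achieved 97% against a required 98%, would be reported as not complied with, with the availability SLO identified as the cause.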
Since SLAs generally impose contractual obligations on a service provider, failure to comply with an SLA can lead to contractual penalties being imposed. However, whilst meeting SLA requirements is important to service providers, the cost of providing the service is also important.
For example, whilst it is possible for service providers to over-specify components in an IT infrastructure to help ensure that an SLA is complied with, over-specifying components typically comes at a price. Furthermore, over-specifying components may lead to unnecessary redundancy. On the other hand, under-specifying components typically results in a lower initial cost but may also put SLA compliance at risk and hence increase the risk of incurring penalties.
IT infrastructures are generally complex, and the interdependencies between their components are often complex too. In addition, an SLA may be dependent on multiple hierarchical SLOs, and each SLO may in turn be dependent on multiple configuration items. It is therefore inherently difficult to determine accurately where improvements in, or modifications to, the IT infrastructure can be made.
Accordingly, one aim of the present invention is to overcome, or at least alleviate, at least some of the above-mentioned problems.