A hosted center provides computing services to multiple customers. Each customer is allocated a subset of the service provider's infrastructure resources, such as servers, to meet its needs. Customer needs change over time; in particular, peak demand for resources can exceed average demand by orders of magnitude. A simple static allocation of resources sized to satisfy peak demand therefore leaves resources underutilized. Reconfiguring infrastructure resources dynamically in response to customer needs requires prompt attention from administrative personnel and may require moving hardware, which increases both operational costs and the risk of failing to provide adequate service. The problem for the hosted center is how to respond quickly to changes in customer needs so that infrastructure resources and staff are employed efficiently and cost-effectively. Computing utilities attempt to address this problem by automating the creation and management of multiple computing services on shared, dynamically allocatable infrastructure.
Previous work on computing utilities varies in the types of services offered, the resources used, and the extent to which operation is automated. The operations subject to automation are wide-ranging and include creating services, deploying a service for a customer, modifying the set of resources used to provide the service, and incorporating new resource instances and types into the hosted center and its services.
One previous system automated the provisioning of front-end servers for web sites based on metrics such as server load and response time. It included a component that automatically discovered server and network topology. Another system provided a variety of multi-tier web sites in which pre-configured servers could be allocated automatically to different tiers based on metrics such as server load. Yet another system also allocated server resources in response to server load, but modeled both the value of allocating resources to each customer and the cost of employing those resources, with an emphasis on energy cost. More recent work includes the allocation of other resource types, such as memory and storage, and the allocation of servers for general use.
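The value-versus-cost approach described above can be illustrated with a minimal sketch. It is not the method of any particular prior system: the greedy strategy, the `Customer` fields, and the constants `SERVER_CAPACITY` and `SERVER_COST` are all assumptions chosen for illustration. Each server from a shared pool goes to the customer for whom it yields the largest net benefit (value of the load it would serve minus the cost of running it), and servers stay idle once no allocation is worth its cost.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    load: float            # current demand, in hypothetical load units
    value_per_unit: float  # assumed value of serving one unit of load


SERVER_CAPACITY = 100.0  # assumed load one server can absorb
SERVER_COST = 30.0       # assumed cost (e.g. energy) of running one server


def marginal_gain(customer, servers_held):
    """Net benefit of giving `customer` one more server."""
    unserved = max(0.0, customer.load - servers_held * SERVER_CAPACITY)
    served_by_next = min(unserved, SERVER_CAPACITY)
    return served_by_next * customer.value_per_unit - SERVER_COST


def allocate(customers, pool_size):
    """Greedily hand out up to `pool_size` servers by highest net benefit."""
    allocation = {c.name: 0 for c in customers}
    if not customers:
        return allocation
    for _ in range(pool_size):
        best = max(customers,
                   key=lambda c: marginal_gain(c, allocation[c.name]))
        if marginal_gain(best, allocation[best.name]) <= 0:
            break  # running another server would cost more than it returns
        allocation[best.name] += 1
    return allocation


customers = [
    Customer("A", load=250.0, value_per_unit=1.0),
    Customer("B", load=120.0, value_per_unit=0.5),
]
result = allocate(customers, pool_size=5)
```

With these assumed numbers, the fifth server is left unallocated, because the small amount of load still unserved is worth less than the cost of running the server.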
At the application layer, some systems provide a framework for deploying and managing distributed applications. An application is described as a collection of related, reusable components, which may represent resources or subsystems. The description includes dependency information to ensure that, for example, components are started in the correct sequence. Once deployed, applications may be monitored, and actions such as automatic failover or restart may be specified for component or resource failures. Such systems are not used for low-level resource configuration tasks, such as installing operating systems on servers, but rather for higher-level, application-specific configuration.
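As an illustration of how dependency information can determine a start sequence, the following sketch orders components with a topological sort from Python's standard `graphlib` module. The component names and the `start_order` helper are hypothetical, and the frameworks described above may represent and resolve dependencies differently.

```python
from graphlib import TopologicalSorter, CycleError


def start_order(depends_on):
    """Return component names so each appears after its dependencies.

    `depends_on` maps a component to the set of components it requires.
    Raises CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical application description: component -> its dependencies.
app = {
    "database": set(),
    "app_server": {"database"},
    "web_server": {"app_server"},
    "monitor": {"web_server", "database"},
}

order = start_order(app)
for component in order:
    print("starting", component)
```

A circular description, such as two components that each require the other, is rejected with `CycleError` rather than producing an invalid sequence.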
A growing number of industrial products, such as those from Hewlett Packard, ThinkDynamics, Sun Microsystems, and Jareva, aim to provide multi-tier applications over a physical infrastructure consisting of a variety of resources. They vary in many respects, such as the types of resources provided (e.g., servers and storage); the specific operating systems and middleware supported; the assumptions and characteristics of the network infrastructure (e.g., whether network isolation is provided via VLAN); the level of monitoring support (e.g., resource usage, failure detection, SLA, threshold-based alerting); support for resource discovery; support for modifying service resources once deployed; whether modifications can occur automatically (e.g., triggered by SLAs); and the extent to which the products can or must be customized to fit preexisting hosted center infrastructures.