Seemingly overnight, the Internet has become the vehicle of choice for a growing segment of the population to satisfy its needs for goods, services, entertainment and so forth. As a result, builders of web sites and other computer systems today face many systems planning issues. These include capacity planning for normal growth, handling expected or unexpected peak demand, and ensuring the availability and security of the site. Companies that wish to provide services on the Web have new business and service models, which are the areas in which they want to innovate and lead, but in order to do so they must confront the considerable complexity of designing, building and operating a large-scale web site. These challenges include the need to grow and scale the site while it is operational.
One response to these issues is to host an enterprise web site at a third party site, co-located with other web sites of other enterprises. Such outsourcing facilities are currently available from various companies. These facilities provide physical space, and redundant network and power facilities so that the enterprise customer or user need not provide them. The network and power facilities are shared among many enterprises or customers in an effort to attain economies of scale.
The users of these facilities, however, must still face tasks relating to managing a computing infrastructure in the course of building, operating and growing their facilities. Information technology managers of the enterprises hosted at such facilities remain responsible for selecting, installing, configuring, and maintaining their own computing equipment at the facilities. The managers must still confront difficult issues such as resource planning and handling peak capacity.
Web sites differ in internal topology. Some sites simply comprise a row of web servers that are load balanced by a load balancer. The load balancer divides the traffic among the servers to maintain a balanced processing load on each server. The load balancer may also include or may be coupled to a firewall for protecting the web servers from unauthorized traffic. Other web sites may be constructed in a multi-tier fashion, whereby a row of web servers handles Hypertext Transfer Protocol (HTTP) requests, but the bulk of the application logic is implemented in separate application servers. These application servers in turn may need to be connected back to a tier of database servers.
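As an illustration of how a load balancer may divide traffic among a row of web servers, the following is a minimal sketch in Python. It assumes a simple round-robin policy, which is only one of many possible balancing policies and is not specified above; the class name `RoundRobinBalancer`, the function `simulate`, and the server names are hypothetical and used only for illustration.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical load balancer that hands each request to the next server in turn."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        # cycle() repeats the server list endlessly, giving round-robin order.
        self._next_server = cycle(self._servers)
        # Count of requests routed to each server, to observe the balance.
        self.handled = {name: 0 for name in self._servers}

    def route(self, request):
        """Pick the server for this request and record the assignment."""
        server = next(self._next_server)
        self.handled[server] += 1
        return server


def simulate(num_requests, servers):
    """Route num_requests illustrative requests and return the per-server counts."""
    balancer = RoundRobinBalancer(servers)
    for i in range(num_requests):
        balancer.route(f"GET /page/{i}")
    return balancer.handled
```

Under this assumed policy, each server receives an equal share of requests whenever the request count is a multiple of the number of servers. A real load balancer would typically also account for server health and the actual processing load on each server.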
Given the diversity in topology of the kinds of web sites that may need to be constructed, it may appear that the only way to construct large-scale web sites is to custom build each one. Indeed, this is the conventional approach. Many organizations are separately struggling with the same issues, and custom building each web site from scratch. This involves a significant amount of duplicate work at different enterprises.
Other issues faced by designers of web sites are resource and capacity planning. A web site may receive vastly different levels of traffic on different days or at different hours within each day. At peak traffic times, the web site hardware or software may be unable to respond to requests in a reasonable time because it is overloaded. At other times, the web site hardware or software may have excess capacity and be underutilized. Finding a balance between having sufficient hardware and software to handle peak traffic and avoiding excessive costs and over-capacity remains elusive. Many web sites never find the right balance and chronically suffer from under-capacity or excess capacity.
One approach to adding capacity quickly is to maintain a single copy of the data image for the web site on a disk, from which the image may be replicated onto other disks on demand. In one scenario, it is desirable to make many copies of a single source disk onto multiple destination disks. Each copy operation proceeds from the beginning of the source disk through the full size of the image stored on it. While this addresses extensibility issues, opportunities for further improvement exist. For example, copy processes are often initiated at different times. This can result in inefficiencies, since each instance of the copy operation separately reads from the source and writes to its destination. This type of operation can cause head contention in the storage units and greatly diminish the overall performance of all the copy operations.
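The inefficiency described above can be illustrated with a small simulation, sketched here in Python under stated assumptions. The names `SourceDisk`, `copy_job`, `run_staggered`, and the block size `BLOCK` are hypothetical. The model treats the source disk as having a single head, counts a "seek" whenever a read does not continue from where the previous read ended, and interleaves the copy operations one block at a time. It is a simplified model for illustration, not a description of any particular storage unit.

```python
BLOCK = 4  # illustrative block size in bytes (an assumption)


class SourceDisk:
    """Hypothetical source disk that counts block reads and head movements."""

    def __init__(self, image):
        self.image = image
        self.reads = 0
        self.seeks = 0
        self._head = 0  # position just past the most recent read

    def read_block(self, offset):
        if offset != self._head:
            self.seeks += 1  # the head had to move to serve this read
        self.reads += 1
        block = self.image[offset:offset + BLOCK]
        self._head = offset + len(block)
        return block


def copy_job(source, dest):
    """One copy operation: reads the whole image from the start, block by block."""
    for offset in range(0, len(source.image), BLOCK):
        block = source.read_block(offset)
        dest[offset:offset + len(block)] = block
        yield  # let other copy operations run between blocks


def run_staggered(image, num_copies, stagger=1):
    """Start one copy every `stagger` ticks (stagger >= 1) and interleave all active copies."""
    source = SourceDisk(image)
    dests = [bytearray(len(image)) for _ in range(num_copies)]
    pending = [copy_job(source, d) for d in dests]
    active = []
    tick = 0
    while pending or active:
        if pending and tick % stagger == 0:
            active.append(pending.pop(0))
        for job in list(active):
            try:
                next(job)  # copy one more block for this operation
            except StopIteration:
                active.remove(job)
        tick += 1
    return source, dests
```

In this model, a single copy reads each block of the image once with no seeks, whereas several copies started at different times each read the full image independently, so the number of source reads grows with the number of copies and the head repeatedly jumps between the positions of the different copy operations.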