In recent years, software engineers have focused on building global-scale Internet applications. These applications include web services that provide users with access to data and functionality such as maps, email, news, and social networking information. Web service providers often provide application programming interfaces (APIs) so that software developers can have controlled access to methods and data from web services.
For example, a web service provider may create a mapping service and provide a Map API for software developers to access the mapping service's functionality. The Map API may contain information about the mapping service including methods to obtain: directions to a location, the travel distance between locations, the travel time between locations, and a location's elevation. If a software developer is building an application for a fast food restaurant, the developer may want to use the Map API to allow a user to request directions to the restaurant from the user's current location. The developer does not have to write the map-specific code to obtain directions, but can instead use the Map API to access the mapping web service's functionality and obtain directions.
Individual web services may handle global user traffic of varying quantities and from various sources depending on the data and functionality that each web service provides and the number of applications that access the web service. In order to support global user traffic and respond quickly to user requests, multiple instances of a particular web service may need to be running on computing devices in multiple datacenters that exist in several locations. An example datacenter is illustrated in FIG. 5.
As shown in FIG. 1, large-scale distributed systems may provide web services and allow multiple applications, users, or computing devices to access the web services. The distributed system may use a client/server architecture in which one or more central servers store services and provide data access to network clients.
FIG. 2 illustrates a block diagram of an exemplary distributed system 200 for providing data in a large-scale distributed system. The system 200 includes a plurality of user terminals 202 (e.g., 202-1 . . . 202-n), each of which includes one or more applications 204 (e.g., 204-1 . . . 204-n), such as an Internet browser. The user terminals 202 are connected to a server 206 and a plurality of computer clusters 210 (e.g., 210-1 . . . 210-m) through a network 208 such as the Internet, a local area network (LAN), a wide area network (WAN), a wireless network, or a combination of networks. The server 206 may include one or more memory devices 214 and one or more CPUs 216. A load balancing engine 212 may provide global load balancing, which routes requests to particular datacenters.
Each of the user terminals 202 may be a computer or similar device through which a user can submit requests to and receive results or services from the server 206. Examples of the user terminals 202 include, without limitation, desktop computers, notebook computers, tablets, set-top boxes, mobile devices such as mobile phones, smartphones, and personal digital assistants, or any combination of such devices.
In order to support global-scale access to web services, a conventional technique is to create more than one instance of a particular web service. Maintaining multiple instances (or copies) of the same web service in more than one datacenter allows the web service to tolerate datacenter failures. If a datacenter that contains a particular web service instance is unavailable, the web service can be accessed from other instances stored or located at alternate datacenters.
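The failover behavior described above can be sketched as follows. This is a minimal illustration only, not the mechanism of any particular system: the datacenter names, the `InstanceUnavailable` exception, and the `call_with_failover` and `fake_send` helpers are all hypothetical, and a real deployment would typically rely on the load balancing layer rather than client-side retries.

```python
from typing import Callable, Dict, List


class InstanceUnavailable(Exception):
    """Raised when a web service instance (or its datacenter) cannot be reached."""


def call_with_failover(instances: List[str], send: Callable[[str], str]) -> str:
    """Try each instance in order and return the first successful response."""
    errors: Dict[str, str] = {}
    for instance in instances:
        try:
            return send(instance)
        except InstanceUnavailable as exc:
            # Record the failure and fall through to the next instance.
            errors[instance] = str(exc)
    raise InstanceUnavailable(f"all instances failed: {errors}")


# Hypothetical datacenter names; "dc-east" is simulated as being down.
DOWN = {"dc-east"}


def fake_send(instance: str) -> str:
    """Stand-in for a network call to the instance in the named datacenter."""
    if instance in DOWN:
        raise InstanceUnavailable(f"{instance} is unreachable")
    return f"response from {instance}"


result = call_with_failover(["dc-east", "dc-west", "dc-europe"], fake_send)
print(result)
```

In this sketch the request to the unavailable datacenter fails and is transparently served by the next instance in the list, which is the property that multiple instances are meant to provide.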
Determining the number of web service instances, the locations of the instances, and the capacity required at each location requires complex planning and system administration work. These determinations must take into consideration various constraints on web service instance placement in order to ensure good quality of service for all users trying to access a particular web service. Example constraints include: the number of users of a web service in specific locations, the number of web service requests and computational resources required for handling the requests, internal dependencies of a web service on other systems, the locations and capacities of datacenters in which the web services are executing, failure rates of infrastructure in the datacenters, the network latency, bandwidth, and quality between datacenters and between datacenters and end users, and the speed and ease of deploying new versions of web services. Web service instance placement and capacity in a particular location are also subject to cost optimization, where marginally better quality of service is often traded for less expensive placements.
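To make the trade-off between quality of service and cost concrete, the following toy sketch enumerates every possible set of datacenters and picks the cheapest one that satisfies two of the constraints listed above: a latency bound for each user region and sufficient total capacity. Everything here is an assumption made for illustration: the datacenter names, capacities, costs, demand figures, and latencies are invented, the capacity check is deliberately simplified (total capacity against total demand), and exhaustive search is used only because the example is tiny.

```python
from itertools import combinations
from typing import Optional, Tuple

# All figures below are made up for illustration.
DATACENTERS = {
    # name: (capacity in requests/sec, cost per hour)
    "dc-us": (1000, 10.0),
    "dc-eu": (800, 12.0),
    "dc-asia": (600, 9.0),
}
DEMAND = {"us": 500, "eu": 300, "asia": 200}  # requests/sec per user region
LATENCY_MS = {
    ("us", "dc-us"): 20, ("us", "dc-eu"): 90, ("us", "dc-asia"): 150,
    ("eu", "dc-us"): 90, ("eu", "dc-eu"): 15, ("eu", "dc-asia"): 180,
    ("asia", "dc-us"): 150, ("asia", "dc-eu"): 180, ("asia", "dc-asia"): 25,
}


def is_feasible(placement: Tuple[str, ...], max_latency_ms: int) -> bool:
    """Check the latency bound per region and a simplified capacity constraint."""
    for region in DEMAND:
        # Each region needs at least one chosen datacenter within the bound.
        if not any(LATENCY_MS[(region, dc)] <= max_latency_ms for dc in placement):
            return False
    # Simplification: compare total capacity with total demand.
    return sum(DATACENTERS[dc][0] for dc in placement) >= sum(DEMAND.values())


def cheapest_placement(max_latency_ms: int) -> Optional[Tuple[Tuple[str, ...], float]]:
    """Return (placement, hourly cost) of the cheapest feasible set, or None."""
    best = None
    names = sorted(DATACENTERS)
    for size in range(1, len(names) + 1):
        for placement in combinations(names, size):
            if not is_feasible(placement, max_latency_ms):
                continue
            cost = sum(DATACENTERS[dc][1] for dc in placement)
            if best is None or cost < best[1]:
                best = (placement, cost)
    return best


for bound in (50, 100, 200):
    print(bound, cheapest_placement(bound))
```

With these invented numbers, tightening the latency bound from 200 ms to 50 ms raises the hourly cost from 10.0 (one datacenter) to 31.0 (all three), which is the kind of trade-off between quality of service and placement cost that the paragraph describes. A realistic planner would also have to account for the remaining constraints, such as dependencies, failure rates, and deployment speed.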
Decisions regarding the number of web service instances and web service instance placement for a particular web service are customarily made manually for each web service. This process can be time-consuming and prone to errors. Additionally, the process does not adapt dynamically when a datacenter goes down, or when internal or external dependencies change or move. Therefore, as recognized by the inventors, there should be an automated process that determines the placement of web service instances and the capacity of each web service instance to improve the quality of service for users while minimizing the cost of running the web services.