The present application relates to service level agreement aware scheduling of tasks in cloud computing.
Cloud computing has become highly popular due to its cost-effectiveness and ability to provide scalable computing. As cloud computing becomes increasingly important in database systems, many new technical challenges have arisen. Cloud computing systems are usually profit-oriented and therefore call for treatments different from standard approaches in traditional database systems and computer networks. For the service provider of a cloud computing system, a common business model is to negotiate certain contracts with customers, where the contracts are often in the form of service level agreements (SLAs). Given the SLAs, it is up to the service provider to set up the cloud computing system and to provide the agreed-upon services accordingly. Key technical components of a cloud computing system include capacity planning (e.g., hardware and software settings), dispatching (e.g., routing queries to different data centers), scheduling (e.g., setting priorities for queries from customers with different service levels), and online system tuning and monitoring (e.g., burstiness and fraud detection).
In traditional systems, commonly used performance metrics such as response time and throughput are not directly profit-aware. In cloud computing, by comparison, queries usually come from different customers, and these customers have different profit profiles. It is in the service provider's interest to optimize profit directly rather than, say, to reduce the average response time among all customers.
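The difference between the two objectives can be illustrated with a minimal sketch. The two queries, their execution times, deadlines, and gains below are hypothetical values chosen for illustration only, and the single-server model in which all queries arrive at time zero is an assumption, not a description of any particular system.

```python
from itertools import permutations

# Hypothetical queries: (name, execution_time_s, deadline_s, gain).
# All values are illustrative assumptions, not taken from any real SLA.
QUERIES = [
    ("casual", 1.0, 10.0, 1.0),   # short query, loose deadline, low gain
    ("buyer", 2.0, 2.0, 10.0),    # longer query, tight deadline, high gain
]


def evaluate(order):
    """Run queries back to back on one server; all arrive at time 0.

    Returns (average response time, total profit), where a query earns
    its gain only if it completes by its deadline.
    """
    clock = 0.0
    total_response = 0.0
    profit = 0.0
    for _name, exec_time, deadline, gain in order:
        clock += exec_time
        total_response += clock
        if clock <= deadline:
            profit += gain
    return total_response / len(order), profit


# Order that minimizes average response time (shortest query first here).
best_latency = min(permutations(QUERIES), key=lambda o: evaluate(o)[0])
# Order that maximizes total profit.
best_profit = max(permutations(QUERIES), key=lambda o: evaluate(o)[1])

print([q[0] for q in best_latency], evaluate(best_latency))
print([q[0] for q in best_profit], evaluate(best_profit))
```

In this toy instance, the order with the lower average response time serves the casual query first and misses the buyer's deadline, whereas the profit-maximizing order accepts a higher average response time in exchange for a larger total gain.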
FIG. 1 shows an exemplary cloud computing system. In this system, a cloud service provider 10 is hosting an online shopping site with a server 12 and a database server 14. Queries come from different users. When a query comes from a serious buyer 18 (e.g., a user who is ready to check out his or her shopping cart), the potential profit of answering the query can be high and the delay should be short. On the other hand, when the query is from a casual user 16 (e.g., someone who just uses the shopping site's tools to compare features of different products), the potential profit may be low and a longer delay is tolerable. Yet another query may come from an internal employee 20 of the shopping site who is collecting some data to make certain real-time decisions (e.g., whether to put certain products on sale), and in such a case a much longer delay is acceptable up to a certain threshold (after which a penalty may be incurred due to the failure to make a decision). Another observation from this example is that the workload in a cloud database system is usually a mixture of short queries (e.g., OLTP queries from buyers) and long queries (e.g., OLAP queries from internal employees). As can be seen from this example, because of such variety in customer profit profiles, a service provider should consider the perspectives of individual customers and make profit-oriented decisions in the design of the cloud computing system.
Cloud providers operate under Service Level Agreements (SLAs), which are agreements between a service provider and its customers. SLAs are used to indicate the profits the service provider may obtain if the service is delivered at certain levels and the penalty the service provider has to pay if the agreed-upon performance is not met. There exist many forms of SLAs with different metrics (e.g., delay, throughput) and measurement methods (e.g., measured per customer or per query).
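As one possible illustration of such an agreement, the following sketch models a per-query, delay-based SLA as a simple step function: a fixed gain if the query is answered within a deadline and a fixed penalty otherwise. The class name `StepSLA`, its fields, and all numeric values are hypothetical and are used only to make the profit and penalty structure concrete; real SLAs may take many other forms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepSLA:
    """Hypothetical per-query SLA measured on delay (illustrative only)."""
    deadline: float  # maximum acceptable delay, in seconds
    gain: float      # profit earned when delay <= deadline
    penalty: float   # amount paid when delay > deadline

    def profit(self, delay: float) -> float:
        """Return the provider's profit (negative if penalized)."""
        return self.gain if delay <= self.deadline else -self.penalty


def total_profit(observations):
    """Sum profit over (sla, observed_delay) pairs."""
    return sum(sla.profit(delay) for sla, delay in observations)


# Assumed profiles loosely following the buyer / casual user example.
buyer = StepSLA(deadline=0.5, gain=10.0, penalty=5.0)
casual = StepSLA(deadline=5.0, gain=1.0, penalty=0.0)

observed = [(buyer, 0.3), (buyer, 0.8), (casual, 2.0)]
print(total_profit(observed))  # 10.0 - 5.0 + 1.0 = 6.0
```

A per-customer SLA could be sketched in the same way by applying the step function to an aggregate, such as the average delay over a customer's queries, rather than to each query individually.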
Many system design issues, such as scheduling and dispatching, have been extensively studied in various areas such as computer networks, operating systems, and database systems. Traditional system designs, however, have two weak points when directly applied to cloud computing systems. First, instead of distinguishing the profit of each query, traditional systems mainly use system-level metrics, such as overall throughput or average response time among all queries, to measure system performance. Second, in traditional computer networking systems, scheduling and dispatching policies are usually simple and based only on local information. Such design principles make sense in a distributed setting where the schedulers are located in individual routers and have to make quick decisions based on locally available information. In comparison, in cloud computing systems, service providers usually have more control over the environment, and more sophisticated policies are justified as long as operational profits can be reasonably improved. Most existing scheduling policies either are not profit-aware or estimate the profit of each query independently of other queries. In addition, existing techniques in system profiling, query dispatching, and capacity planning usually do not directly use information from scheduling.