Database as a Service (DBaaS) provides significant benefits for both the customer and the service provider. DBaaS allows customers, i.e., tenants, to rent a database instance from a service provider, which relieves the customer of the burden of provisioning the hardware and of configuring, operating and maintaining the database. The service provider, on the other hand, can leverage economies of scale by automating common maintenance tasks and by consolidating tenants onto the same machine to improve utilization and thereby decrease its operational cost. Consolidation is especially important given that, even for highly optimized cloud infrastructures, energy consumption is still the dominant cost factor.
When sharing resources among tenants it is challenging to ensure that service-level objectives (SLOs) for tenants are met. Ideally, every tenant has the impression that its database instance is hosted on a dedicated machine with virtually infinite resources and 100% reliable hardware.
A strategy for assigning tenants to servers should therefore ensure that: (1) tenants have enough available resources per machine and are not impacted by other tenants, (2) the data is replicated with enough resources on all copies to cover hardware failures, and (3) a tenant's resources are seamlessly adjusted depending on the tenant's needs.
Cloud applications and platforms often have unpredictable load patterns, such as flash crowds originating from sudden, viral popularity, so that the tenants' resource requirements change with little notice. Load balancing is therefore an important feature to minimize the impact of a heavily loaded tenant on the other co-located tenants.
Furthermore, a platform deployed on a pay-per-use infrastructure (like Amazon EC2) needs to provide the potential to minimize the system's operating cost. Flexibility, i.e., the ability to scale out to deal with high load while scaling in during periods of low load, is a critical feature to minimize the operating cost. Flexible load balancing is therefore a key feature in the design of modern database management systems for cloud systems and requires a low-cost technique to migrate tenants between servers within a server cloud. There is therefore a need to provide placement and migration solutions that successfully balance running time with solution quality.
Multi-tenancy database services are known, such as Relational Cloud (C. Curino et al., “Workload-aware database monitoring and consolidation”, In Proceedings SIGMOD, 2011) or J. Schaffner et al., “Predicting in-memory database performance for automating cluster management tasks”, In Proceedings ICDE, 2011, and Yahoo's platform for small applications (F. Yang, J. Shanmugasundaram and R. Yerneni, “A Scalable Data Platform for a Large Number of Small Applications”, In Proceedings CIDR, 2009). However, both state-of-the-art systems, Relational Cloud and the Yahoo system, use static placement algorithms and do not consider the cost of tenant migrations.
Further, so-called interleaved declustering strategies are known (see for example: H.-I. Hsiao and D. DeWitt, “Chained Declustering: A New Availability Strategy for Multiprocessor Database Machines”, In Proceedings ICDE, 1990 and A. Watanabe and H. Yokota, “Adaptive Lapped Declustering: A Highly Available Data Placement Method Balancing Access Load and Space Utilization”, In Proceedings ICDE, 2005). As a disadvantage, however, all declustering strategies assume that a partition (e.g., a table) can be further split into sub-partitions and hence distributed across servers. Unfortunately, this assumption does not hold in in-memory multi-tenancy applications, where a tenant is considered an atomic unit. Furthermore, existing declustering strategies assume a fixed number of servers and replicas, which is not realistic for cloud environments.
US 2010/0077449 discloses a method for assigning tenants to application or middleware instances. Here, the assignment is also based on server resource capacity and constraints. However, this approach does not take into account that the number of servers may vary dynamically. This has the disadvantage that the number of servers may not be reduced as much as possible, so that higher costs are incurred. Further, the assignment or mapping according to this disclosure assumes that tenants are not replicated, so that each tenant exists only once, which is a disadvantage with regard to server failures or server overload situations.
A common approach to address the above-mentioned goals starts by monitoring each tenant for some period of time on dedicated servers and developing an estimate of its peak resource consumption. This approach is for example disclosed in F. Yang, J. Shanmugasundaram and R. Yerneni, “A Scalable Data Platform for a Large Number of Small Applications,” In Proceedings CIDR, 2009. Based on this estimate, a bin-packing algorithm is run to allocate tenants to servers, perhaps folding new tenants into an existing cluster. A bin-packing algorithm refers to an allocation procedure in which objects of different volumes (tenants) must be packed (allocated) into a finite number of bins of a given capacity (database storage servers) in a way that minimizes the number of bins used. Typically, the whole placement is mirrored (i.e., replicated) to ensure fault tolerance.
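The bin-packing and mirroring steps described above may be sketched as follows. This is only an illustrative sketch of a first-fit heuristic, not the procedure of any of the cited systems; the function names `first_fit` and `mirror`, the tolerance `EPS` and the sample loads are assumptions introduced for illustration.

```python
from typing import Dict, List

EPS = 1e-9  # assumed tolerance for floating-point sums


def first_fit(loads: Dict[str, float], capacity: float = 1.0) -> List[List[str]]:
    """Place each tenant on the first server that still has room for it."""
    servers: List[List[str]] = []  # tenants per server
    used: List[float] = []         # reserved load per server
    for tenant, load in loads.items():
        if load > capacity + EPS:
            raise ValueError(f"tenant {tenant} exceeds the server capacity")
        for i in range(len(servers)):
            if used[i] + load <= capacity + EPS:
                servers[i].append(tenant)
                used[i] += load
                break
        else:
            # No existing server has room: open a new one.
            servers.append([tenant])
            used.append(load)
    return servers


def mirror(placement: List[List[str]]) -> List[List[str]]:
    """Replicate the whole placement once; server i is mirrored by server i + n."""
    return placement + [list(tenants) for tenants in placement]


# Illustrative estimated peak loads, normalized to a server capacity of 1.0.
loads = {"A": 0.3, "B": 0.3, "C": 0.4, "D": 0.4, "E": 0.3}
primary = first_fit(loads)
cluster = mirror(primary)
print(primary)       # tenants per primary server
print(len(cluster))  # total number of servers including mirrors
```

First fit is a heuristic and does not always yield the minimum number of bins; it is shown here because it is simple to follow.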
FIG. 1 shows such a placement in a hypothetical example of 5 tenants with different estimated peak resource needs, using a first-fit algorithm and a mirroring technique as known from the state of the art. The total normalized load per tenant is: A (0.3), B (0.3), C (0.4), D (0.4), E (0.3). The capacity per server is normalized to 1.0. However, this approach has severe disadvantages. First, the cluster has to be substantially over-provisioned, as it has to reserve capacity for peak loads. Second, servers are more than 50% underutilized in the normal operational case, even at peak times. This is because, in the case of a read-mostly workload, the load can be distributed evenly across the copies (e.g., the total load of tenant A of 0.3 is spread across servers 1 and 3 in FIG. 1); upon the failure of a server, however, its mirror must take over the entire load. Third, reacting to changing resource requirements and/or improved estimates is problematic, as it requires a re-organization of the placement, which typically has to be done off-line because of its operational and performance impact on the cluster.
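The utilization argument can be checked with a short calculation. The sketch below assumes a placement in which servers 1 and 2 hold tenants A, B, C and D, E respectively, and servers 3 and 4 are their mirrors; this layout is an assumption consistent with the statement that tenant A is spread across servers 1 and 3, not a reproduction of FIG. 1. The helper `server_loads` is hypothetical, and a read-mostly workload that is split evenly across the live copies is assumed.

```python
from typing import Dict, FrozenSet, List

loads = {"A": 0.3, "B": 0.3, "C": 0.4, "D": 0.4, "E": 0.3}

# Assumed mirrored placement: server 3 mirrors server 1, server 4 mirrors server 2.
placement: Dict[int, List[str]] = {
    1: ["A", "B", "C"],
    2: ["D", "E"],
    3: ["A", "B", "C"],
    4: ["D", "E"],
}


def server_loads(
    placement: Dict[int, List[str]],
    loads: Dict[str, float],
    failed: FrozenSet[int] = frozenset(),
) -> Dict[int, float]:
    """Load per live server when each tenant's load is split evenly over its live copies."""
    alive = {s: tenants for s, tenants in placement.items() if s not in failed}
    copies: Dict[str, int] = {}
    for tenants in alive.values():
        for t in tenants:
            copies[t] = copies.get(t, 0) + 1
    return {s: sum(loads[t] / copies[t] for t in tenants) for s, tenants in alive.items()}


normal = server_loads(placement, loads)
after_failure = server_loads(placement, loads, failed=frozenset({1}))

for server, load in sorted(normal.items()):
    print(f"normal operation, server {server}: {load:.2f}")
print(f"server 3 after failure of server 1: {after_failure[3]:.2f}")
```

Under these assumptions, no server carries more than half of its capacity at peak during normal operation, while the mirror of a failed server has to carry the full load of 1.0, which is why the other half of each server must be held in reserve.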