The application relates to load balancing for multi-tenant databases.
Cloud computing has revolutionized the IT industry with the promise of on-demand infrastructure. The success of cloud service providers depends mainly on economies of scale, for which workload consolidation is a key factor. Cloud database services have successfully adopted this consolidation strategy, with multitenant databases as one of the key enablers. For example, one study of multitenant DBs has reported workload consolidation ratios between 6:1 and 17:1. Because of this potential, multitenant DBs have received much interest from the database research community in recent years. Extensive research has been conducted on topics such as replication and consistency, DB multitenancy options, in-memory multitenant DBs, hotspot mitigation through live migration, and multitenant SLA management.
Multitenant databases achieve cost efficiency through consolidation of multiple small tenants. However, performance isolation is an inherent problem in multitenant databases due to the resource sharing among the tenants. That is, a bursty workload from a co-located tenant, i.e. a noisy neighbor, may affect the performance of the other tenants sharing the same system resources.
When a tenant receives an increased workload, either temporarily or permanently, the neighboring tenants on the same server suffer from the increased total workload. The increase can have many causes, for example: i) the growth of a company, leading to permanent traffic growth; ii) predicted infrequent traffic changes of a tenant, e.g. bursty query traffic at a Web site dedicated to the World Cup; iii) predicted frequent traffic changes, such as the daily or weekly workload pattern of a company; iv) unpredicted traffic spikes caused by a flash crowd; or any combination of these. Whatever the cause or pattern, the impact of such overloading can be highly damaging: neighbors of a noisy tenant can immediately see violations of their performance Service Level Agreements (SLAs), and in a severe case, the server and all tenants on it may become unresponsive.
To avoid this problem, well-known workload management techniques can be used, including admission control and query scheduling. Admission control rejects certain queries based on criteria such as server capacity or per-tenant SLAs. Although admission control can prevent server overloading, it does so by rejecting a portion of the workload. Query scheduling instead delays certain queries according to a chosen scheduling policy, such as simple first-come-first-served (FCFS) or a more sophisticated SLA-based policy. Scheduling can work well for a short transient overload, but it cannot resolve a prolonged overload.
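The difference between the two techniques can be illustrated with a minimal sketch. It assumes a hypothetical discrete-time model in which a server completes at most `capacity` queries per tick; the function names and the model are illustrative assumptions, not part of any described system. Under these assumptions, admission control drops the excess queries in each tick, while FCFS scheduling carries them over as a backlog.

```python
def admission_control(arrivals, capacity):
    """Reject queries beyond the per-tick capacity (hypothetical model).

    Returns (served, rejected) totals over all ticks.
    """
    served = sum(min(a, capacity) for a in arrivals)
    rejected = sum(max(a - capacity, 0) for a in arrivals)
    return served, rejected


def fcfs_backlog(arrivals, capacity):
    """Delay excess queries in FCFS order instead of rejecting them.

    Returns the queue length at the end of each tick.
    """
    backlog, history = 0, []
    for a in arrivals:
        # Unserved queries wait; the backlog can never be negative.
        backlog = max(backlog + a - capacity, 0)
        history.append(backlog)
    return history


# A short burst drains on its own; a prolonged overload keeps growing.
transient = fcfs_backlog([5, 15, 5, 5], capacity=10)
prolonged = fcfs_backlog([15, 15, 15], capacity=10)
```

In this model the transient burst leaves a backlog that clears within one tick, whereas under the prolonged overload the backlog grows without bound, which is the limitation of scheduling noted above.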
Compared to these methods, load balancing is better suited to addressing non-transient overloading caused by workload imbalance. One commonly used method for load balancing is data migration, which moves one or more tenants from an overloaded server to another server with lower resource utilization. Unfortunately, migration involves costly data movement, especially in the shared-nothing environment of interest here. First, data movement consumes resources on the source and destination servers, along with network bandwidth, which temporarily elevates resource contention on all of them. Second, data movement takes time proportional to the data size, because the data of a tenant has to be moved completely to the new server before load balancing is achieved.
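As a rough illustration of migration-based load balancing and its time cost, the following sketch picks one tenant to move off the most loaded server. The `Tenant` fields, the load units, the single `capacity` threshold, and the selection rule (smallest data size among tenants whose removal resolves the overload) are all assumptions made for the example, not a method described in this application. The estimated transfer time is simply data size divided by an assumed bandwidth, reflecting the proportionality noted above.

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    name: str
    load: float     # hypothetical load units
    data_gb: float  # size of the tenant's data


def plan_migration(servers, capacity, bandwidth_gb_per_s):
    """Pick one tenant to move from the most to the least loaded server.

    servers maps a server name to its list of Tenant objects.
    Returns (tenant, source, destination, seconds) or None.
    """
    loads = {s: sum(t.load for t in ts) for s, ts in servers.items()}
    src = max(loads, key=loads.get)
    if loads[src] <= capacity:
        return None  # no server is overloaded
    dst = min(loads, key=loads.get)
    excess = loads[src] - capacity
    # Keep tenants that relieve the overload without overloading dst.
    candidates = [
        t for t in servers[src]
        if t.load >= excess and loads[dst] + t.load <= capacity
    ]
    if not candidates:
        return None
    # Prefer the tenant with the least data, since transfer time
    # grows in proportion to data size.
    tenant = min(candidates, key=lambda t: t.data_gb)
    return tenant.name, src, dst, tenant.data_gb / bandwidth_gb_per_s


servers = {
    "A": [Tenant("t1", 60, 100), Tenant("t2", 50, 20)],
    "B": [Tenant("t3", 30, 10)],
}
plan = plan_migration(servers, capacity=100, bandwidth_gb_per_s=0.5)
```

Here both tenants on server A would resolve the overload, but `t2` is chosen because its 20 GB of data moves in an estimated 40 seconds, compared with 200 seconds for `t1`. The sketch ignores the extra resource contention that the transfer itself causes on both servers.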