Hosted, multi-tenant computing systems (also known as “cloud computing” systems) offer several advantages over conventional enterprise-specific hardware systems. For one thing, a single cloud system can provide computing resources for a number of customers (also referred to herein as “tenants”), allowing the capital costs for hardware (and, in some cases, software) to be shared by multiple enterprises. Moreover, in some implementations, a multi-tenant hosted system can provide resources to each tenant on demand. In this way, the hosted system can provide resources as needed (including additional resources at peak times) without requiring the tenant to pay for peak resources at all times.
One problem that arises in a multi-tenant system, however, is the scheduling and prioritization of workloads. For instance, in a typical multi-tenant system, several hundred independent requests might arrive simultaneously from different tenants. These requests can range from queries about the status of a tenant's multiple environments, to provisioning new services, to modifying existing services, to creating a ticket or work request. Each tenant expects that its request will be carried out in a timely manner; however, the systems that make up these services often cannot handle requests from multiple tenants simultaneously. In addition, when maintenance activities take place during a scheduled maintenance window, these systems have no way to manage incoming requests and determine whether they can be executed or must wait until the maintenance completes.
Hence, there is a need for more robust control over the scheduling and prioritization of workloads in a multi-tenant system.