In many fields of technology it is important to be able to efficiently allocate resources to the demands on those resources. In particular, it may be desirable to allocate resources in a way which minimizes operational costs whilst maintaining a desired level of service (i.e. meeting the demands and ensuring timely processing of the demands). A number of problems arise in respect of this general proposition, principally in relation to dimensioning (how much of each resource is required in order to ensure that the demands are met) and scheduling (how the available resources should be allocated in order to meet the demands).
Examples of fields in which such problems arise are numerous, and include the management of demands on available electrical power (or other utilities such as oil, water or gas), the allocation of computing resources (e.g. processing power and time or memory resources) to tasks or storage requirements, the allocation of bandwidth in a communications pathway such as a telephone or computer network to users of that pathway, as well as the management of demands on other finite resources such as production capacity in a factory or beds in a hospital.
Fluctuations in demand are likely to reduce predictability, making planning ahead very difficult. In the computing field, this is part of the rationale for “on-demand” or “utility” computing, such as provided by “cloud computing”.
In respect of scheduling, the allocation of resources at different times may be relatively more or less desirable due to differences in the “cost” (which may be financial or technical) of providing the necessary resources at a particular time. This cost can depend on external factors, such as the price of electricity varying over a 24-hour cycle.
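By way of illustration only, the effect of a time-varying cost on scheduling can be sketched as follows. The hourly tariff, the function name `cheapest_start` and its parameters are hypothetical and are assumed purely for the purpose of the example; the sketch simply picks, among the start hours that allow a task to finish by its deadline, the one with the lowest total cost.

```python
def cheapest_start(hourly_price, duration, earliest, deadline):
    """Return (start_hour, total_cost) for the cheapest feasible start.

    hourly_price: per-hour unit prices over one day (assumed values).
    duration:     whole consecutive hours the task needs.
    earliest:     first hour at which the task may start (its arrival).
    deadline:     hour by which the task must have finished.
    """
    candidates = range(earliest, deadline - duration + 1)
    costs = {s: sum(hourly_price[s:s + duration]) for s in candidates}
    start = min(costs, key=costs.get)  # first of the cheapest start hours
    return start, costs[start]


# Hypothetical tariff: cheap at night, expensive by day, moderate in the evening.
PRICES = [5] * 6 + [12] * 12 + [8] * 6

# A 3-hour task arriving at hour 8 that must finish by hour 24.
immediate_cost = sum(PRICES[8:8 + 3])
start, deferred_cost = cheapest_start(PRICES, duration=3, earliest=8, deadline=24)
print(immediate_cost, start, deferred_cost)
```

Under these assumed prices, deferring the task to the evening is cheaper than starting it on arrival, while the deadline is still respected.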
The concept of optimal resource allocation can be most readily understood in financial terms, although it is equally applicable where the “costs” are, for example, power consumption or network efficiency. If certain conditions are met (including in terms of the pricing model), there will be, at any time, a given amount of resources for which the difference between the income generated by committing that amount and the operational expenses incurred in doing so is at its peak. This value is a “sweet spot”. It can arise where the “price” for the execution of a task is fixed, but the “cost” of the resources being used increases with the total volume of resources being allocated (this is a typical supply and demand scenario in which the price per unit increases as a finite resource becomes scarcer). Alternatively, the resource “cost” may be fixed while the “price charged” for execution decreases (in financial terms this could arise if a customer is given an incremental discount when increasing use of the service). Combinations of these effects may exist, which may result in multiple “sweet spots”.
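A minimal numerical sketch of such a “sweet spot” is given below. It assumes, purely for illustration, a fixed price per unit of resource committed and a unit cost that rises linearly with the total volume allocated; the function names, the linear cost model and the figures used are hypothetical.

```python
def net_benefit(units, price_per_unit, base_cost, cost_slope):
    """Income minus operational expense for committing `units` of resource.

    Assumed model (illustrative only): each unit earns a fixed price, while
    the cost per unit grows linearly with the total volume committed.
    """
    income = price_per_unit * units
    unit_cost = base_cost + cost_slope * units
    return income - unit_cost * units


def sweet_spot(max_units, price_per_unit, base_cost, cost_slope):
    """Return the unit count in 0..max_units with the highest net benefit."""
    return max(
        range(max_units + 1),
        key=lambda u: net_benefit(u, price_per_unit, base_cost, cost_slope),
    )


best = sweet_spot(max_units=100, price_per_unit=10.0, base_cost=2.0, cost_slope=0.1)
print(best, net_benefit(best, 10.0, 2.0, 0.1))
```

With these assumed figures the net benefit rises up to a particular volume and falls beyond it, so committing either fewer or more resources than that volume is less favourable.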
The present invention aims to provide a way of controlling the allocation of resources so as to shape the load such that, at any given time, the amount of resources needed to meet the agreed quality of service target for the accepted requests comes as close as possible to the instantaneous ideal value.
Although methods of allocating resources are known, these methods tend either to be deterministic (i.e. they calculate the optimized allocation in advance), and so take no account of real-time conditions in the system, or, if dynamic (i.e. operating at run-time), to yield no significant improvement in resource use.
Examples of known dynamic (also referred to as “run-time” or “reactive”) scheduling strategies are set out below.
“First Come-First Served”: this strategy allocates resources to a task as soon as it arrives/is demanded. As a result, there is no optimization of the allocation of resources, which leads to resource hogging by long-running, non-urgent processes and sub-optimal handling of shorter processes. This is the simplest form of run-time scheduling and so is generally treated as the benchmark for alternative optimization approaches.
“Predictable Maximum Waiting Time”: this strategy causes the priority of an unscheduled task to increase over time until it is eventually allocated the necessary resources. This ensures that a task is performed within a predictable time period from arrival, but provides no optimization of resource utilization. In particular, an overdue process may end up having sufficient priority to be executed at a time which is very sub-optimal in terms of resource utilization.
“Earliest Deadline First”: this strategy prioritizes tasks based on their deadlines and is used to ensure that critical tasks are not delayed by the scheduling. However, it is rarely successful in achieving any optimization of resource utilization. Furthermore, if most tasks are equally urgent, this strategy effectively ends up as a “First Come-First Served” strategy.
“Shortest Job First”: this strategy aims at maximizing throughput of tasks by scheduling the shortest tasks first. This is rarely an optimal scheduling strategy, both in terms of meeting deadlines and resource utilization, unless there is prior knowledge that short processes are likely to be the most urgent.
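The four known strategies described above can be compared using the following simplified sketch, which assumes a single resource and non-pre-emptive execution. The `Task` fields, the base `priority` value, the `AGING_RATE` constant and the example task set are hypothetical and are chosen only to make the differences between the strategies visible; each strategy is expressed as a key function, the ready task with the lowest key being executed next.

```python
from dataclasses import dataclass

AGING_RATE = 1  # assumed priority gained per time unit spent waiting


@dataclass(frozen=True)
class Task:
    name: str
    arrival: int       # time at which the task is demanded
    duration: int      # time for which the task occupies the resource
    deadline: int      # time by which the task should have completed
    priority: int = 0  # assumed base priority (used by the aging strategy)


def fcfs_key(task, now):   # First Come-First Served
    return task.arrival


def edf_key(task, now):    # Earliest Deadline First
    return task.deadline


def sjf_key(task, now):    # Shortest Job First
    return task.duration


def aging_key(task, now):  # Predictable Maximum Waiting Time
    # Effective priority grows with waiting time; highest priority runs first.
    return -(task.priority + AGING_RATE * (now - task.arrival))


def run(tasks, key):
    """Simulate one resource without pre-emption; return completion times."""
    pending = sorted(tasks, key=lambda t: t.arrival)
    now, finished = 0, {}
    while pending:
        ready = [t for t in pending if t.arrival <= now]
        if not ready:
            now = pending[0].arrival  # resource idles until the next arrival
            continue
        chosen = min(ready, key=lambda t: key(t, now))
        pending.remove(chosen)
        now += chosen.duration
        finished[chosen.name] = now
    return finished


def missed(tasks, finished):
    """Names of the tasks that completed after their deadline."""
    return sorted(t.name for t in tasks if finished[t.name] > t.deadline)


TASKS = [
    Task("long", arrival=0, duration=8, deadline=40),
    Task("medium", arrival=1, duration=5, deadline=40),
    Task("short", arrival=2, duration=1, deadline=30),
    Task("urgent", arrival=3, duration=3, deadline=13, priority=5),
]

for label, key in [("FCFS", fcfs_key), ("EDF", edf_key),
                   ("SJF", sjf_key), ("aging", aging_key)]:
    result = run(TASKS, key)
    print(label, result, "missed:", missed(TASKS, result))
```

In this assumed task set, the “First Come-First Served” ordering causes the urgent task to miss its deadline, whereas the other three orderings do not. None of the key functions takes any account of the cost of the resources at the time of execution, which is consistent with the limitations noted above.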
It is therefore an object of the present invention to provide a method and system for scheduling demands on a system which is able to operate dynamically during the operation of the system and which improves, and preferably optimizes, the use of the resources of the system.
It is an object of the present invention to provide a method and system for reactive scheduling of demands which ensures that deadlines are met.
It is an object of the present invention to provide a method and system for reactive scheduling of demands which demonstrates improved resource utilization compared to the benchmark of a “first come-first served” scheduling strategy.
It is preferable that the method and system of the present invention are simple and do not rely on global knowledge or historical data.