Operating systems employ resource management schemes to limit the interference between applications, implement policies that prioritize and otherwise control resource allocations, and generally manage the overall behavior of a system that is running many independent software applications.
Existing resource management schemes are largely first-come, first-served. Counter-based resource management schemes, such as those used in the Digital VAX/VMS, the BSD/UNIX, and the WINDOWS NT operating systems, attempt to maintain an absolute count of resource use by one or more processes. Counters may track, for example, kernel memory utilization, Central Processing Unit (CPU) utilization, or Input/Output (I/O) data transfers.
One of the problems with counter-based resource management schemes is determining what the limits should be and what the consequences of reaching or exceeding them are. More often than not, the limit is simply raised when it is reached. In the context of the WINDOWS operating system, setting limits on the use of certain resources is generally achieved through mechanisms such as job objects, kernel quotas, CPU affinities, and various ad hoc resource-specific limits. Resource use can also be capped along functional lines, as, for example, when the memory manager caps the use of kernel virtual address space based on how it is to be used. Another example is when the use of kernel pool by the Transmission Control Protocol/Internet Protocol (TCP/IP) is dynamically capped based on the type of packet that is currently being transmitted, e.g., a packet representing voice data might have a higher cap on the use of kernel pool than a packet carrying other data, in order to assure a quality voice transmission.
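The counter-based approach described above can be illustrated with a minimal sketch. It assumes a single counter per resource with a hard limit; the names `ResourceCounter`, `QuotaExceeded`, `charge`, and `release` are hypothetical and do not correspond to any operating system interface named in this document.

```python
class QuotaExceeded(Exception):
    """Raised when a charge would push a counter past its limit."""


class ResourceCounter:
    """Absolute count of one resource's use by a process (hypothetical)."""

    def __init__(self, limit):
        self.limit = limit  # maximum units the process may hold
        self.used = 0       # units currently charged to the process

    def charge(self, amount):
        # Reject the request outright if it would exceed the limit.
        if self.used + amount > self.limit:
            raise QuotaExceeded(f"{self.used} + {amount} > {self.limit}")
        self.used += amount

    def release(self, amount):
        # Return units to the pool; never let the counter go negative.
        self.used = max(0, self.used - amount)


# Usage: a process holding 80 of 100 units is refused a further 30.
pool = ResourceCounter(limit=100)
pool.charge(80)
try:
    pool.charge(30)
    rejected = False
except QuotaExceeded:
    rejected = True
```

The sketch also shows where the difficulty lies: nothing in the mechanism says what value `limit` should have, or what the caller should do once a charge is refused.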
In some cases, resource management schemes are based on setting relative priorities of the processes competing for the resources to aid in arbitrating resource contention, as is currently done, for example, in the scheduling of CPU resources. In addition, resource management schemes may be based on privileges, i.e., requiring processes to have privileges to carry out certain operations to effect the allocation of resources, as is currently done, for example, by requiring a process to have the privilege to lock physical pages in memory.
There are several problems with existing resource management schemes. As most resources are system-wide, managing resources on a first-come/first-served basis can lead to denial-of-service problems. This is because resources may be subject to unbounded consumption by other applications, other users, or network-visible services. Reliance on the existing mechanisms creates an unpredictable environment in which applications often cannot acquire the resources needed to run because errant, selfish, or malicious applications have already consumed them. The problem is particularly acute in large terminal services machines.
Priority-based resource management schemes only worsen the competition. Since applications cannot independently establish their priority relative to other applications, it is generally not possible to set priorities to share the resource fairly. In most cases, it is not even possible to define priorities fairly. In the case of CPU resources, this often leads to applications artificially boosting their priorities to ensure access regardless of the demands present elsewhere. The end result is that applications will compete at the inflated priority level, nullifying any fairness policy the priority scheme was aiming to accomplish.
With no limits on resource competition, it is very difficult to provide pre-defined levels of service to specific applications. An administrator or service provider generally cannot specify either a minimum or maximum amount of resources for an application. This presents problems in server consolidation scenarios, and forces the administrators and service providers to support consolidation by, for example, dynamically adjusting priorities to manage CPU utilization by specific applications.
Some systems have attempted to overcome some of the problems inherent in resource management through the use of resource guarantees. Instead of just setting limits or priorities, applications may contract for implicitly allocated resources upfront. Guarantees eliminate instantaneous competition for resources by adding a layer of indirection between requesting a resource and actually using it. By explicitly reserving resources in a first-come/first-served manner, a client obtains a contract regarding future use of the resource (e.g., guaranteed I/O latency), regardless of any other outstanding guarantees. Bandwidth is one example of a resource for which guarantees are particularly important, as in the implementation of multimedia applications. However, guarantees themselves are resources, and allocation of guarantees may fail.
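The reservation idea can be sketched in a few lines, under the assumption that a guarantee is simply a share of a fixed capacity granted in request order. The class `ReservationManager` and its methods are hypothetical and are not drawn from any system named in this document.

```python
class ReservationManager:
    """Grants guarantees against a fixed capacity, first-come/first-served."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.reserved = {}  # client name -> guaranteed amount

    def reserve(self, client, amount):
        # Admission control: grant only if enough unreserved capacity remains.
        if sum(self.reserved.values()) + amount > self.capacity:
            return False  # the guarantee itself could not be allocated
        self.reserved[client] = self.reserved.get(client, 0) + amount
        return True

    def cancel(self, client):
        # Release a client's guarantee so later requests can be admitted.
        self.reserved.pop(client, None)


# Usage: 100 units of bandwidth, requested in arrival order.
bandwidth = ReservationManager(capacity=100)
video_ok = bandwidth.reserve("video", 60)
audio_ok = bandwidth.reserve("audio", 30)
backup_ok = bandwidth.reserve("backup", 20)  # only 10 units remain
```

Here the third request is refused even though no bandwidth is in use yet, which illustrates the point that guarantees are themselves a resource whose allocation may fail.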
As personal computers move into the living room and take on many new roles, resource management becomes more important, particularly when managing conflicts in resource usage. In addition, server computers need to manage resources more effectively in order to provide a more predictable operational environment.