Multiprogramming is a computing technique based on the observation that while one job waits for an I/O request to complete, the CPU can process another job, thereby increasing throughput, that is, the number of jobs processed by the system per unit time. Virtual Memory (VM) can be combined with multiprogramming to enable even higher throughput, but the combination creates the potential for a system to thrash, a condition in which more time is spent replacing pages in physical memory and less time is available for the actual processing of the data pages. An optimal multiprogramming level (MPL) allows a system to operate at maximum throughput while avoiding both under-load and thrashing (over-load). The problem of operating a system at an optimal multiprogramming level has been addressed using three basic prior techniques: a feed-forward approach, a feed-back approach, and a static MPL approach.
In the feed-forward approach, thrashing is acknowledged to be caused by over-allocation of memory. The feed-forward approach addresses memory allocation by estimating the amount of memory to be used by a job and admitting the job only if the system has enough free memory to accommodate that estimate. A problem with the feed-forward approach is the necessity for an accurate estimate of the amount of memory a job uses. For example, the jobs of interest can be Business Intelligence (BI) queries on an Enterprise Data Warehouse. BI queries are typically very complex, and accurately estimating the amount of memory required by a query is difficult.
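The admission rule described above can be sketched as follows. This is only an illustrative sketch: the class name `FeedForwardAdmitter`, the megabyte units, and the first-in-first-out waiting queue are assumptions made for the example and are not taken from any particular system.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FeedForwardAdmitter:
    """Hypothetical admission controller based on memory estimates."""
    total_memory_mb: int
    reserved_mb: int = 0
    running: dict = field(default_factory=dict)   # job_id -> estimated MB
    waiting: deque = field(default_factory=deque)  # (job_id, estimated MB)

    def submit(self, job_id, estimated_mb):
        """Admit the job only if its estimate fits in free memory."""
        if self.reserved_mb + estimated_mb <= self.total_memory_mb:
            self.running[job_id] = estimated_mb
            self.reserved_mb += estimated_mb
            return True
        # Not enough free memory by estimate: hold the job back.
        self.waiting.append((job_id, estimated_mb))
        return False

    def finish(self, job_id):
        """Release a job's reservation, then admit waiting jobs that fit."""
        self.reserved_mb -= self.running.pop(job_id)
        admitted = []
        while self.waiting:
            next_id, next_mb = self.waiting[0]
            if self.reserved_mb + next_mb > self.total_memory_mb:
                break  # head of the queue still does not fit
            self.waiting.popleft()
            self.running[next_id] = next_mb
            self.reserved_mb += next_mb
            admitted.append(next_id)
        return admitted
```

The sketch makes the weakness visible: every decision depends on `estimated_mb`. If the estimate for a complex query is too low, the system over-allocates memory and can thrash; if it is too high, jobs wait needlessly and the system is under-loaded.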
The feed-back approach employs sampling of a selected performance metric and controlling the MPL accordingly. If the performance metric exceeds a selected target value, then the rate of admitting jobs into the system is reduced. If the performance metric is less than a selected minimum, then the rate of admitting jobs into the system is increased. Thus, the performance metric is maintained near an optimal value by controlling the admission of jobs into the system. Examples of feed-back techniques include adaptive control of conflict ratio, an analytic model using a fraction of blocked transactions as the performance metric, wait-depth limitation, and others. A difficulty with the feed-back approach is the selection of the sampling interval over which the performance metric is measured. If the sampling interval is too small, then the system may oscillate and become very unstable. If the sampling interval is too large, then the system may become very slow to react to a changing workload and thus not act sufficiently quickly to prevent overload and under-load behavior. Typical Business Intelligence workloads shift rapidly between small queries and huge queries. A performance metric and an associated sampling interval that are appropriate for one workload type may be unsuitable for a different kind of workload that occurs only seconds later on the system. Thus the feed-back loop approach is typically inappropriate for a rapidly changing BI workload.
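A minimal sketch of such a feed-back loop is given below. The function names, the step size of one, the MPL bounds, and the use of a simple mean over each sampling interval are all assumptions chosen for illustration; the techniques named above use more elaborate metrics and control laws.

```python
def adjust_mpl(current_mpl, metric, high, low, min_mpl=1, max_mpl=64):
    """Return the next MPL given one averaged sample of the metric."""
    if metric > high:
        # Metric above target: admit fewer jobs.
        return max(min_mpl, current_mpl - 1)
    if metric < low:
        # Metric below minimum: admit more jobs.
        return min(max_mpl, current_mpl + 1)
    return current_mpl


def run_controller(metric_stream, interval, mpl, high, low):
    """Average the metric over each interval and record the resulting MPL."""
    history = []
    window = []
    for value in metric_stream:
        window.append(value)
        if len(window) == interval:
            mpl = adjust_mpl(mpl, sum(window) / interval, high, low)
            history.append(mpl)
            window = []
    return history
```

For example, with the metric stream `[0.9, 0.9, 0.1, 0.1]`, thresholds `high=0.7` and `low=0.3`, and a starting MPL of 4, an interval of 2 lowers the MPL and then raises it again, while an interval of 4 averages the shift away and makes no change at all. This is the sampling-interval trade-off in miniature.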
In a static MPL approach, a selected typical workload is run multiple times through the system. Each run is performed at a different MPL setting and the corresponding throughput is measured. An optimal MPL is then chosen based on the trial-and-error experiments and on guesswork. Several problems arise with the static MPL approach. First, performing the trial-and-error experiments is expensive and inaccurate. The resulting MPL may work marginally well for the workload used in the testing, but is unlikely to work well with other workloads. Furthermore, the static nature of the approach is inappropriate for handling a dynamic shift in the workload. The static MPL approach is often used despite these inadequacies because it is relatively simple to implement.
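The trial-and-error procedure can be sketched as a sweep over candidate MPL settings. The function `choose_static_mpl` and the throughput model `toy_throughput` are hypothetical: the model is an invented curve used only to make the sketch runnable, and in practice `run_workload` would execute the real test workload and measure queries finished per unit time.

```python
def choose_static_mpl(run_workload, candidate_mpls):
    """Run the workload once per MPL and keep the best-performing setting."""
    results = {mpl: run_workload(mpl) for mpl in candidate_mpls}
    best_mpl = max(results, key=results.get)
    return best_mpl, results


def toy_throughput(mpl, knee=8):
    """Invented model: linear gain up to the knee, sharp drop beyond it."""
    if mpl <= knee:
        return float(mpl)
    return knee / (1 + (mpl - knee))


best, measured = choose_static_mpl(toy_throughput, range(1, 17))
print(best, measured[best])
```

The chosen value is only as good as the test workload: a different query mix would correspond to a different curve and a different knee, yet the static setting stays fixed.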
A common use of an enterprise data warehouse is running a continuous stream of queries. The objective is to return results in the shortest possible time. The time taken for a continuous stream of database queries to run on a system depends, among other things, on the number of concurrent streams that are used to run the queries. This number is known as the MPL (Multi Programming Level). If the MPL is too low, then the database system may be under-loaded, such that the workload would finish sooner if the number of concurrent streams were increased. Hence, database users attempt to achieve a higher throughput (as measured in queries finished per unit time) by increasing the MPL. A drawback of this strategy is that if the MPL is too high, then the database system may be overloaded and experience severe memory contention and CPU thrashing. Thrashing results in severe performance deterioration. When a user first confronts a new workload, the correct MPL at which to run the workload is unknown and the user has to determine it. At lower levels, increasing the MPL can lead to an increase in throughput. But as the MPL is increased, a danger arises of entering an overload region where even slightly higher than optimal MPLs result in a lower throughput.
The problem of managing MPL is further compounded because a typical Business Intelligence (BI) workload can fluctuate rapidly between long resource-intensive queries and short less-intensive queries. At each instant of time, the system can experience a different mix of queries and thus have a different optimal MPL setting. Furthermore, as throughput is pushed higher, increasing the MPL by even one can very often result in severe performance deterioration rather than a gradual decline in performance.