A database available from Oracle Corporation is a relational database that may be installed on a computer (commonly called “server”) available from Sun Corporation, running the operating system Solaris. Relational means that the database is composed of various tables (with columns and rows) that can be connected to one another in many ways to view various relationships of the data. A server is a mid-sized computer where all the data are actually kept (it is also sometimes called the host). A server is coupled to various clients. Clients are typically Personal Computers (or “PCs”) that pull data off the server in ways that are useful to the user (i.e., for display and interpretation).
A server typically executes a query supplied by a client, and in executing the query applies certain operators (as required by the query) to data in the database. Examples of operators include sort, hash-join and bitmap merge. Each operator requires some space in memory. Portions of memory which Oracle uses to hold data to which an operator is being applied are called work areas. For example, a sort operator uses a work area (a.k.a. the sort area) to perform an in-memory sort of a set of rows. Similarly, a hash-join operator uses a work area (a.k.a. the hash area) to build a hash table from its left (a.k.a. build) input.
Generally, larger work areas can significantly improve the performance of a particular operator at the cost of higher memory consumption. Ideally, the size of a work area is big enough such that it can accommodate the entire input data and auxiliary memory structures allocated by the operator. This is referred to as the “optimal” size of a work area. When the size of the work area is smaller than optimal, the response time increases since an extra pass is performed over all or part of the input data. This is referred to as the “one-pass” size of the work area. When the work area is less than the one-pass threshold, multiple passes over the input data are needed, causing dramatic increase of the operator response time. This is referred to as the “multi-pass” size of the work area. For example, a serial sort operation which needs to sort 10 GB of data needs a little bit more than 10 GB to run optimal and at least 40 MB to run one-pass. If this sort gets less than 40 MB, it will have to perform several passes over the input data.
The goal for a Database Administrator (DBA) is generally to have most work areas running with an optimal size, e.g. more than 90% or even 100% for On-Line Transaction Processing (OLTP) systems, while a smaller fraction of them are running with a one-pass size (e.g. less than 10%). Multi-pass execution should be avoided by the DBA. Even for a Decision Support Systems (DSS) running large sorts and hash-joins, the memory requirement for one-pass executions is relatively small.
In Oracle8i, the maximum size of work areas is externally and statically controlled by the DBA, by setting the following parameters: SORT_AREA_SIZE, HASH_AREA_SIZE, BITMAP_MERGE_AREA_SIZE and CREATE_BITMAP_AREA_SIZE. There is one such parameter per operator that needs a work area.
Setting these parameters can be difficult for a DBA since the maximum work area size is ideally selected based on the data input size and the total number of work areas active in the system. These two factors vary a lot from one work area to the other and from one point in time to another. Thus, the various “*_area_size” parameters are hard to tune under the best of circumstances. For more information on database parameters and their effect on behavior of various operators, see the book entitled “Oracle8i Concepts” available from Oracle Corporation, and on the Internet at http://oradoc.photo.net/ora81/DOC/server.815/a67781/toc.htm (which book is incorporated by reference herein in its entirety).
An MS Thesis by K. L. Beck, entitled “Memory Allocation in Adaptive External Sorting”, Oregon Graduate Institute, October 1993 proposes a combination of “adaptive algorithms” that are capable of modifying their behavior to compensate for varying amounts of memory available to them during their lifetime, and a “central agent” to partition memory among concurrent queries. Beck calls this central agent a “memory broker,” and proposes a Brokered Adaptive Sorting (BAS) algorithm. In a simple example of the increased performance that BAS can provide, Beck describes two sorts running concurrently, one sorting much more data than the other. If the large sort uses the same amount of memory for its entire duration, it must choose between leaving memory unused for the smaller sort, or crowding the smaller sort out of memory. Beck's solution is to have the large sort use all, or nearly all, of memory when it is available, and to temporarily give up memory during the time that the smaller sort is executing. Beck states that such sorts adjust their memory allocation by checking-in with the memory broker at points when they could change their memory allocation without too much cost. According to Beck, the interactions between queries and the memory broker occurs at “check points” which are points where a query can conveniently change its memory usage. Beck's thesis is incorporated by reference herein in its entirety.
An article entitled “An Adaptive Hash Join Algorithm for Multi-user Environments” by Hansjörg Zeller and Jim Gray, VLDB 1990: 186-197, introduces “a modified, adaptive hash join method that is designed to work with dynamic changes in the amount of available memory. The general idea of the algorithm is to regulate resource usage of a hash join in a way that allows it to run concurrently with other applications. The algorithm provides good performance for a broad range of problem sizes, allows to join large tables in a small main memory, and uses advanced I/O controllers with track-size I/O transfers.”
Another article entitled “Dynamic Memory Allocation for Multiple-Query Workloads” by Manish Mehta and David J. DeWitt, VLDB 1993: 354-367, “studies the problem of memory allocation and scheduling in a multiple query workload with widely varying resource requirements. Several memory allocation and scheduling schemes are presented and their performance is compared using a detailed simulation study. The results demonstrate the inadequacies of static schemes with fixed scheduling and memory allocation policies. A dynamic adaptive scheme which integrates scheduling and memory allocation is developed and is shown to perform effectively under widely varying workloads.”
The article by Mehta and DeWitt also describes three schemes for allocating memory: “Minimum (allocating the minimum memory required to process the join, namely the square root of the size of the inner relation), Maximum (allocating enough memory so that the join's temporary relation resulting from the selection fits in memory), and Available (allocating whatever memory is available subject to the constraint that the amount of memory allocated is at least the minimum required by the query (i.e. the square root of the size of the inner relation) and no more than the maximum.”
See also U.S. Pat. No. 5,799,210 that is incorporated herein by reference in its entirety. This patent discusses in detail a process of “allocating buffer memory for database sort operations” (abstract).