A database server allocates logical database space for data in a database. According to one approach, the units of database space allocation are data blocks, extents, and segments. At the finest level of granularity, a database server stores data in data blocks. One data block corresponds to a specific number of bytes of physical database space on disk. The next level of logical database space is an extent. An extent is a specific number of contiguous data blocks allocated for storing a specific type of information. The level of logical database storage above an extent is called a segment. A segment is a set of extents, each of which has been allocated for a specific data structure. For example, each database table's data is stored in the database table's own segment.
At any given moment, some of the data blocks in a segment may contain data, and some of the data blocks in the segment may be empty. Some data blocks may be partially, but not completely, full. The empty portions of data blocks within a segment comprise the free space of the segment. When a database server inserts data into a segment, the database server inserts the data into the segment's free space. The segment's free space is reduced accordingly.
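The block, extent, and segment hierarchy, and the way an insert consumes a segment's free space, can be sketched as follows. This is a minimal illustration only: the 8192-byte block size, the class names, and the first-fit placement that lets one insert span several blocks are all assumptions made for the example, not details given in the text.

```python
from dataclasses import dataclass, field
from typing import List

BLOCK_SIZE = 8192  # assumed block size in bytes; the text does not specify one


@dataclass
class DataBlock:
    used: int = 0  # bytes currently occupied in this block

    @property
    def free(self) -> int:
        return BLOCK_SIZE - self.used


@dataclass
class Extent:
    blocks: List[DataBlock]  # stands in for a run of contiguous data blocks


@dataclass
class Segment:
    extents: List[Extent] = field(default_factory=list)

    def add_extent(self, n_blocks: int) -> None:
        """Allocate a new extent of n_blocks empty data blocks to the segment."""
        self.extents.append(Extent([DataBlock() for _ in range(n_blocks)]))

    def free_space(self) -> int:
        """Total of the empty portions of all data blocks in the segment."""
        return sum(b.free for e in self.extents for b in e.blocks)

    def insert(self, nbytes: int) -> None:
        """Consume nbytes of free space, first fit (hypothetical policy)."""
        if nbytes > self.free_space():
            raise ValueError("not enough free space in segment")
        for extent in self.extents:
            for block in extent.blocks:
                take = min(block.free, nbytes)
                block.used += take
                nbytes -= take
                if nbytes == 0:
                    return
```

For instance, after `seg = Segment(); seg.add_extent(4); seg.insert(10000)`, the first block is full, the second is partially full, and `seg.free_space()` has dropped by exactly 10000 bytes.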
According to one approach, a database server consults a segment's free space list to determine which data blocks in the segment are at least partially empty. The segment's free space list is a singly linked list. Pointers to the head and tail of the segment's free space list are stored in the segment's first block, which is called the segment header block. Thus, the segment header block may be viewed as containing metadata that describes where a segment's free space is located.
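A singly linked free space list whose head and tail pointers live in the segment header block might look like the sketch below. The names `Block`, `SegmentHeader`, `append`, `pop_head`, and `walk` are hypothetical, and in-memory object references stand in for on-disk block addresses.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Block:
    block_id: int
    next_free: Optional["Block"] = None  # link to the next block on the free list


class SegmentHeader:
    """Metadata block holding the head and tail of the segment's free space list."""

    def __init__(self) -> None:
        self.head: Optional[Block] = None
        self.tail: Optional[Block] = None

    def append(self, block: Block) -> None:
        """Link a block with free space onto the tail of the list."""
        block.next_free = None
        if self.tail is None:
            self.head = self.tail = block
        else:
            self.tail.next_free = block
            self.tail = block

    def pop_head(self) -> Optional[Block]:
        """Unlink and return the first block on the list, or None if empty."""
        block = self.head
        if block is None:
            return None
        self.head = block.next_free
        if self.head is None:
            self.tail = None
        block.next_free = None
        return block

    def walk(self) -> Iterator[int]:
        """Yield block ids in list order, following the links from the head."""
        cur = self.head
        while cur is not None:
            yield cur.block_id
            cur = cur.next_free
```

Every operation starts at `head` or `tail`, so every process that wants free space must first read, and often update, the one header object.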
Multiple database server processes might attempt to insert data into the same segment simultaneously or almost simultaneously. When this occurs, each database server process contends for access to the segment's single segment header block. Because the segment header block contains the metadata described above, contention for the segment header block may be called “metadata contention.” As a result of this metadata contention, the segment header block becomes a “hot spot.” Compounding the problem, each database server process follows the same free space list links to data blocks in the segment's free space. As a result, the data blocks also may become hot spots. The contention for the segment header block and data blocks may significantly degrade throughput.
To alleviate contention at the data blocks, one approach implements multiple free space lists per segment instead of a single free space list per segment. While each of the free space lists similarly indicates all of the free space in the segment, each of the free space lists has a different head and tail. When a database server process attempts to insert data into a segment, the database server process's identifier is input into a hash function to produce a hash value. A particular free space list that corresponds to the hash value is selected from among the several free space lists. Thus, different database server processes are directed to different empty or partially empty data blocks in a balanced manner, thereby reducing contention at the data blocks. Unfortunately, because the different free space list heads and tails are contained in the same segment header block, contention at the segment header block remains.
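The hash-based selection of a free space list can be illustrated with a short sketch. The text does not name a hash function or a number of lists, so the use of CRC-32, the count of four lists, and the function name `select_free_list` are assumptions chosen only to make the example concrete.

```python
import zlib

NUM_FREE_LISTS = 4  # assumed number of free space lists per segment


def select_free_list(process_id: int, num_lists: int = NUM_FREE_LISTS) -> int:
    """Map a process identifier to the index of one free space list.

    CRC-32 is a stand-in hash; any function that spreads identifiers
    evenly over the list indexes would serve the same purpose.
    """
    digest = zlib.crc32(str(process_id).encode("ascii"))
    return digest % num_lists
```

A given process is always directed to the same list, while different processes tend to be spread across the lists. The heads and tails of all the lists would still be read from the single segment header block, which is why this scheme leaves header contention in place.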
To further complicate matters, a single database may be shared by multiple instances of a database server, such as in Oracle Corporation's Real Application Clusters (“RAC”). In such a configuration, each separate database server instance reads data from and writes data to the same disk space, but each database server instance maintains its own separate shared memory. A particular database server instance may comprise multiple database server processes, and each database server process of a particular database server instance may share the particular database server instance's shared memory. However, database server processes of one database server instance do not share the shared memory of another database server instance.
When a first database server instance needs to allocate disk space within a segment, the first database server instance first loads the segment's segment header block into a buffer cache in the first database server instance's shared memory. If a second database server instance needs to allocate disk space within the same segment, then the segment's segment header block first needs to be transferred from the buffer cache in the first database server instance's shared memory into a buffer cache in the second database server instance's shared memory. Such a transfer requires significant overhead, and may significantly degrade throughput, especially if the first and second database server instances execute on separate machines.
To avoid such transfers, one approach partitions each segment into multiple segment partitions, one for each database server instance. Each database server instance is associated with a different one of the segment partitions. Segment partitions are not shared between database server instances. Each segment partition is associated with its own separate free list group block. A free list group block contains metadata that indicates the location of the free disk space within the segment partition that is associated with the free list group block. Similar to a segment header block, a free list group block indicates a head and a tail of a linked list. Because only one database server instance accesses a particular free list group block, metadata contention between database server instances is reduced.
However, a significant drawback attends the approach just described. A first database server instance cannot allocate disk space within a second database server instance's segment partition. This is so even if the first database server instance's segment partition lacks sufficient free space and the second database server instance's segment partition contains abundant free space. The associations between database server instances and their segment partitions are static. It is difficult, if not impossible, to predetermine how large a particular database server instance's segment partition should be in relation to other segment partitions. Partition sharing is not permitted in the approach just described. As a result, significant amounts of disk space may be wasted.
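The static partitioning scheme and its drawback can be demonstrated with a small sketch. Here each instance's free list group block is reduced to a plain list of free block ids; the class name `StaticPartitionedSegment` and its methods are hypothetical.

```python
from typing import Dict, List, Optional


class StaticPartitionedSegment:
    """A segment split into per-instance partitions that are never shared."""

    def __init__(self, partitions: Dict[str, List[int]]) -> None:
        # instance name -> free block ids tracked by that instance's
        # free list group block (simplified to a Python list)
        self._free: Dict[str, List[int]] = {
            name: list(blocks) for name, blocks in partitions.items()
        }

    def allocate(self, instance: str) -> Optional[int]:
        """Allocate a block from the instance's own partition only."""
        own = self._free[instance]
        if own:
            return own.pop(0)
        # The association is static: free space held by other
        # instances' partitions cannot be borrowed.
        return None

    def total_free(self) -> int:
        """Free blocks across all partitions of the segment."""
        return sum(len(blocks) for blocks in self._free.values())
```

With `StaticPartitionedSegment({"inst1": [10], "inst2": [20, 21, 22]})`, the second call to `allocate("inst1")` fails even though `total_free()` still reports three free blocks, all stranded in the other instance's partition.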
These are some of the problems that attend past approaches to the management of disk space by multiple database server instances in a cluster configuration. Approaches that seek to reduce the waste of disk space do so at the cost of increased metadata contention. Approaches that seek to reduce metadata contention do so at the cost of wasted disk space. A disk space management technique that overcomes both metadata contention and disk space waste is needed.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
A method and apparatus for the dynamic management of disk space by multiple database server instances in a cluster configuration are disclosed. According to one embodiment of the invention, a determination is made as to whether at least one bitmap block satisfies a first set of criteria. If at least one bitmap block satisfies the first set of criteria, then, based on information that is indicated in that bitmap block, disk space is allocated for use by a first server instance. Alternatively, if no bitmap block satisfies the first set of criteria, then a determination is made as to whether at least one bitmap block satisfies a second set of criteria. If at least one bitmap block satisfies the second set of criteria, then that bitmap block, which is associated with a second server instance, is associated with the first server instance instead, and, based on information that is indicated in that bitmap block, disk space is allocated for use by the first server instance.
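The two-stage search described above can be sketched as follows. This section does not state what the first and second sets of criteria contain, so the example assumes the simplest reading: the first set is "associated with the requesting instance and has free space", and the second is "associated with another instance and has free space". The bit encoding (True means in use) and all names are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class BitmapBlock:
    owner: str         # server instance this bitmap block is associated with
    bits: List[bool]   # assumed encoding: True = data block in use

    def first_free(self) -> Optional[int]:
        for slot, used in enumerate(self.bits):
            if not used:
                return slot
        return None


Criteria = Callable[[BitmapBlock, str], bool]


def owned_with_space(bmb: BitmapBlock, instance: str) -> bool:
    """Assumed first set of criteria."""
    return bmb.owner == instance and bmb.first_free() is not None


def foreign_with_space(bmb: BitmapBlock, instance: str) -> bool:
    """Assumed second set of criteria."""
    return bmb.owner != instance and bmb.first_free() is not None


def allocate(
    bitmap_blocks: List[BitmapBlock],
    instance: str,
    first: Criteria = owned_with_space,
    second: Criteria = foreign_with_space,
) -> Optional[Tuple[int, int]]:
    """Return (bitmap block index, slot) of the allocated space, or None."""
    for criteria, reassign in ((first, False), (second, True)):
        for index, bmb in enumerate(bitmap_blocks):
            if not criteria(bmb, instance):
                continue
            slot = bmb.first_free()
            if slot is None:
                continue  # custom criteria may accept a full block
            if reassign:
                bmb.owner = instance  # re-associate with the requester
            bmb.bits[slot] = True
            return index, slot
    return None
```

Unlike a static partitioning scheme, a requester whose own bitmap blocks are full takes over a bitmap block from another instance, and later requests find that block under the first set of criteria.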