Data stored on a computer system is typically arranged into one or more data storage spaces. Collectively, these data storage spaces are often referred to as a database. Each data storage space comprises one or more data items. Data items in a particular space share common characteristics. A data item may have a value for each of these characteristics. For example, relational databases store data in a number of spaces known as tables. The data items in each table, forming the “rows” of the table, share the same “columns” of data, in that for each column of data, any item in the table may have a value.
It is helpful to partition data storage spaces for administrative purposes such as archiving, caching, enhancing performance, copying or deleting data, and free space management. For example, spaces that store data items with date characteristics are often partitioned so that each partition comprises only those items that pertain to a particular range of dates. One partition, for example, might only store data items pertaining to a particular month. Another partition might only store data items that pertain to a particular fiscal quarter. Since a partition also comprises data items that share common characteristics, a partition may also be considered a data storage space.
The partition, if any, to which a data item pertains is determined by looking up one or more of the data item's values in a partition mapping. The partition mapping maps certain sets or ranges of values to certain partitions. These sets or ranges of values correspond to one or more characteristics shared by the data items in the partitioned storage space. These characteristics are known as partitioning characteristics. For tables, these sets or ranges of values may correspond to the value of a particular column upon which the partition mapping is said to be based. This column is known as the partitioning column. For example, a partition mapping for a table might be based upon the value of a data item's date column. The mapping could define ranges of dates, such as months or years. Each range could pertain to a separate partition. To determine the partition to which a new data item pertains, one would determine within which of the defined ranges the value of the item's date column falls. Partition mappings may be based on multiple partitioning characteristics, which is often the case with subpartitions. Partition mappings may also be based on a variety of other characteristics, such as whether a data item pertains to a particular range of numbers or set of discrete values.
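By way of illustration only, the lookup of a date value in a range-based partition mapping might be sketched as follows. The partition names, the date boundaries, and the function `find_partition` are hypothetical and chosen for this example; they are not drawn from any particular database system.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical partition mapping based on a date partitioning column.
# Each partition covers the dates from the previous bound (inclusive)
# up to its own upper bound (exclusive).
LOWER_BOUND = date(2007, 1, 1)
UPPER_BOUNDS = [date(2007, 2, 1), date(2007, 3, 1), date(2007, 4, 1)]
PARTITIONS = ["p_2007_01", "p_2007_02", "p_2007_03"]


def find_partition(value):
    """Return the name of the partition whose range contains value,
    or None if no defined range contains it."""
    if value < LOWER_BOUND:
        return None
    # Index of the first upper bound strictly greater than value.
    idx = bisect_right(UPPER_BOUNDS, value)
    return PARTITIONS[idx] if idx < len(PARTITIONS) else None
```

Under these assumptions, a data item whose date column holds February 15, 2007 falls within the second range and therefore pertains to the hypothetical partition `p_2007_02`, while a date outside every defined range pertains to no partition.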
Spaces may also comprise “catch-all” partitions, which contain data items that do not pertain to any other partition found within the space. Since existing mechanisms for partitioning spaces require that each partition be created manually, a catch-all partition is helpful when it is necessary to store a data item whose values were not contemplated in the partitioning scheme. For example, if a space consisted of partitions for each week of 2007, but a data item pertaining to 2008 were received, the data item would be stored in the catch-all partition. While catch-all partitions are useful fail-safe mechanisms, they are typically no more helpful for administrative purposes than unpartitioned spaces.
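The catch-all behavior described above might be sketched as follows, again by way of illustration only. The partition names, the helper `build_weekly_ranges`, and the function `route` are hypothetical, and the weekly scheme is simplified to 52 consecutive seven-day ranges beginning January 1, 2007.

```python
from datetime import date, timedelta

CATCH_ALL = "p_catch_all"  # hypothetical name for the catch-all partition


def build_weekly_ranges(first_day, weeks):
    """Build (start, end, name) tuples for consecutive seven-day ranges.
    Each range includes its start date and excludes its end date."""
    ranges = []
    start = first_day
    for n in range(1, weeks + 1):
        end = start + timedelta(days=7)
        ranges.append((start, end, f"p_week_{n:02d}"))
        start = end
    return ranges


# Simplified assumption: 52 manually defined weekly partitions for 2007.
WEEKLY_RANGES = build_weekly_ranges(date(2007, 1, 1), 52)


def route(value):
    """Return the weekly partition containing value, or the catch-all
    partition when no defined range contains it."""
    for start, end, name in WEEKLY_RANGES:
        if start <= value < end:
            return name
    return CATCH_ALL
```

In this sketch, a data item dated January 10, 2007 is routed to the second weekly partition, whereas a data item dated March 15, 2008 matches no defined range and is stored in the catch-all partition, where it is mixed with every other item the partitioning scheme did not contemplate.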
Existing methods for partitioning data storage spaces rely on mechanisms external to the database system to create new partitions when necessary. One such mechanism is for a database administrator to manually create new partitions immediately before they are required. This mechanism is problematic in that it requires that the administrator know when new partitions will be required, knowledge that may not always be available to the administrator. The mechanism is also inefficient in that the administrator must either remember to create a new partition each time a new partition is required or deploy a utility (or script) to periodically create the partition for the administrator. Deploying and troubleshooting such a utility incurs additional time and monetary costs for the administrator.
Alternatively, an administrator may simply create all the partitions that he or she anticipates will be needed long before they are ever needed. Again, this is problematic in that it requires the administrator to know what partitions will be necessary. It is also inefficient in that it wastes resources with empty partitions that may not be used for a very long time.
A simpler and more efficient mechanism for partitioning data storage spaces, such as tables, is desirable in order to facilitate the more widespread use of partitioned spaces.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.