The present invention relates generally to data fragmentation (partitioning) in database systems, and more particularly to methods and apparatus for creating and using what will be called correlated fragmentation in relational databases.
A database is a collection of information. A relational database is a database that is perceived by its users as a collection of tables. Each table arranges items and attributes of the items in rows and columns, respectively. Each table row corresponds to an item (also referred to as a record or tuple), and each table column corresponds to an attribute of the item (referred to as a field or, more correctly, as an attribute type or field type). A key is a set of one or more columns of a record from a table wherein those column value(s) can be used to identify one or more records in a table.
A “primary key” for a table is a simple or composite attribute that uniquely identifies rows of the table. A key must be inherently unique, and not merely unique at a particular point in time. It is possible, but not typical, to have a table where the only unique identifier is the composite attribute consisting of all the attributes of the table. It is also possible, but not typical, for a table to have more than one unique identifier. In such a case, the table may be said to have multiple candidate keys, one of which could be chosen and designated as the primary key; the remaining candidates then would be said to be “alternate keys”. Collectively, primary and alternate keys may be referred to as “unique” keys. Primary and alternate keys must satisfy two time-independent properties. First, at no time may two rows of the table have the same value for the key. And second, if the key is composite, then no component of the key can be eliminated without destroying the uniqueness property
A “foreign key” is a possibly composite attribute of a table whose values are required to match those of the primary key of some table, which is typically but not necessarily distinct from the table for which the foreign key is defined. A foreign key value represents a reference to the row containing the matching primary key value, which may be called the referenced row or the target row. The table that contains the foreign key may be
A “referential integrity” rule requires a database system to maintain “referential integrity” between foreign keys and primary keys. The rule states that a database must not contain any unmatched foreign key values. An unmatched foreign key value is a non-null foreign key value for which there does not exist a matching value of the primary key in the relevant target table. Note that referential integrity requires foreign keys to match primary keys. A database system is not required to maintain integrity with respect to alternate keys.
Fragmentation is a technique used to increase database performance. Horizontal fragmentation breaks a table into horizontal pieces (groups of rows) called fragments. The rules assigning a row to a fragment are defined by a database user or administrator and are part of a “fragmentation scheme”. It is possible for a fragment of a given table to be empty if none of the rows of the table satisfy the fragmentation scheme's assignment rules for that fragment.
A “scan” is a process of reading one or more fragments of a table. A scan is accomplished by the actions of one or more scan operators. Fragments may be stored independently on separate disks or on separate nodes in a computer cluster or network architecture. Logically, all fragments may be scanned simultaneously, thereby increasing the overall rate at which the complete table can be read, subject to the limitations of the physical storage of the data. A scan operator is able to read more than one fragment, but not simultaneously. To read more than one fragment simultaneously, a database system must use multiple scan operators.
“Fragment elimination” is a process by which a database system identifies fragments from a table that cannot participate in the result of a query and removes those fragments from consideration in processing the query.