Data organization is important in relational database systems that deal with complex queries against large volumes of data. Relational database systems allow data to be stored in tables that are organized as both a set of columns and a set of rows. Standard commands are used to define the columns and rows of tables and data is subsequently entered in accordance with the defined structure. The defined table structure is logically maintained, but may not correspond to the physical organization of the data. For example, the data corresponding to a particular table may be split up among a number of physical hardware storage facilities.
Users in relational database systems require a minimum time possible for execution of complex queries against large amounts of data. Different physical types of storage, for example random access memory and hard drives, incur different length delays. In addition, writing to memory or a hard drive is often slower than reading an equivalent amount of data from memory or a hard drive. The organization of data corresponding to tables defined in a relational database system may determine the number of writes and reads that need to be performed in order to execute a common query. If the data is properly organized, in responding to queries performance can be improved by taking advantage of that organization and searching only part of the data. If the data is not organized in any way, it will often need to be searched in its entirety to satisfy a query or copied and restructured into a useful organization.
Given a particular change in the organization of data, particular types of searches or other operations performed on the data may be adversely impacted in terms of efficiency if they are performed without any adjustment. Many factors must be addressed to adjust a search that is to be performed with respect to a new organization of data. Such factors include, but are not limited to, the manner in which the data is stored, the file system that identifies the location of the data and various other information about the data, and the desired outcome of the search. Failure to consider and address any one of those factors can result in an inefficient search.
In general, in one aspect, the invention features a method for selecting rows from first and second tables each having rows containing values in columns. In at least the first table, the rows are divided into partitions at least one of which is populated by one or more rows. The method includes (a) defining a subset of the populated partitions of the first table that excludes at least one populated partition of the first table, (b) creating a file context, which stores at least location data for a row and a first value associated with the row, for each populated partition in the subset of the populated partitions of the first table, (c) determining the lowest first value stored by the file contexts for the first table, (d) identifying rows with a particular first value by at least reading the file contexts of the first table, and (e) repeating a through d until the subsets of the populated partitions of the first table have included all the populated partitions of the first table.
Implementations of the invention may include one or more of the following. Defining a subset may include calculating a total number of file contexts for both tables. The rows of the second table may be divided into partitions. The method may include (axe2x80x2) defining a subset of populated partitions of the second table, (bxe2x80x2) creating a file context, which stores at least location data for a row and a first value associated with the row, for each populated partition in the subset of the populated partitions of the second table, (cxe2x80x2) determining the lowest first value stored by the file contexts for the second table, (dxe2x80x2) identifying rows with a particular first value by at least reading the file contexts of the second table, and (f) repeating a through e and axe2x80x2 through dxe2x80x2 until the subsets of the populated partitions of the first table have included all the populated partitions of the first table, and where (e) may include repeating b through d and axe2x80x2 through dxe2x80x2 until the subsets of the populated partitions of the second table have included all the populated partitions of the second table.
Creating a file context may include changing the location data and first value to correspond to a row in a different partition. Rows may be stored in order of their corresponding first value within the partitions. The first value corresponding to a row may be the result of a hash function applied to the values in one or more columns.
Defining a subset of the populated partitions of the first table may include (i) representing a total read time for the first and second tables in terms of a variable representing the number of partitions in a subset of the partitions of the first table, (ii) determining the rate of change in total read time in terms of the number of partitions in a subset, (iii) truncating the number of partitions for which the rate of change in total read time is zero, and (iv) increasing the number of partitions to one if truncation results in a value of zero. Defining a subset of the populated partitions of the first table may further include (v) if truncation results in a value greater than zero, determining whether the read cost for a subset including an additional partition is less than the read cost for the current number of partitions and increasing the number by one if the read cost with the additional partition is lower.
A number, f1, of populated partitions in a subset of the partitions of the first table may be determined in accordance with the equation f1=(fT*R/(1+R)) where fT is a total number of file contexts for both tables and R={square root over ((r2/r1)*(p1/p2))}, where r1 and r2 represent a cost to read once through tables 1 and 2, respectively, and p1 and p2 represent the number of populated partitions of tables 1 and 2, respectively. Alternatively, R may be calculated using R={square root over ((db2/db1)*(p1/p2))}, where tables 1 and 2 require db1 and db2 data blocks of storage, respectively.
In general, in another aspect, the invention features a database system for iteratively selecting rows from a first table. The database system includes a second table. The first table includes rows and columns and is divided by rows into partitions. At least one of the partitions in the table is populated by one or more rows. The system includes one or more nodes, a plurality of CPUs, each of the one or more nodes providing access to one or more CPUs, and a plurality of processes, each of the one or more CPUs providing access to one or more virtual processes. Each process is configured to manage data, including the partitioned database table, stored in one of a plurality of data-storage facilities. A partitioned table access component is configured to select rows from at least the first table by (a) defining a subset of the populated partitions of the first table that excludes at least one populated partition of the first table, (b) creating a file context, which stores at least location data for a row and a first value associated with the row, for each populated partition in the subset of the populated partitions of the first table, (c) determining the lowest first value stored by the file contexts for the first table, (d) identifying rows with a particular first value by at least reading the file contexts of the first table, and (e) repeating (a) through (d) until the subsets of the populated partitions of the first table have included all the populated partitions of the first table.
In general, in another aspect, the invention features a computer program, stored in a tangible medium, for selecting rows from a first table. The first table has rows and columns and is divided by row into partitions. At least one of the partitions is populated by rows. The program includes executable instructions that cause a computer to (a) define a subset of the populated partitions of the first table that excludes at least one populated partition of the first table, (b) create a file context, which stores at least location data for a row and a first value associated with the row, for each populated partition in the subset of the populated partitions of the first table, (c) determine the lowest first value stored by the file contexts for the first table, (d) identify rows with a particular first value by at least reading the file contexts of the first table, and (e) repeating a through d until the subsets of the populated partitions of the first table have included all the populated partitions of the first table.
In general, in another aspect, the invention features a method for allocating a total number fT of file contexts between a number of file contexts f1 for a first table T1 having p1 populated partitions and a number of file contexts f2 for a second table T2 having p2 populated partitions. The method includes (a) calculating f1 to minimize the total read cost, (b) if f1 is not a whole number, truncating f1 to a whole number, (c) if f1 is zero, adding one to f1, (d) calculating f2 using the following equation: f2=fTxe2x88x92f1, (e) if f1 was truncated, determining whether the total read cost for f1 partitions of T1 and f2 partitions of T2 is greater than the total read cost for f1+1 partitions of T1 and f2xe2x88x921 partitions of T2, and if it is, adding 1 to f1 and subtracting 1 from f2 (f) if f1 greater than p1, setting f1 to p1 and setting f2 to the minimum of p2 and fTxe2x88x92p1, and (g) if f2 greater than p2, setting f2 to p2 and setting f1 to the minimum of p1 and fTxe2x88x92p2.
Implementations of the invention may include one or more of the following. The calculation of f1 may be done in accordance with the following equation f1=(fT*R/(1+R)), where R={square root over ((r2/r1)*(p1/p2))}, where r1 and r2 represent a cost to read once through tables T1 and T2, respectively. Alternatively, R may be calculated using R={square root over ((db2/db1)*(p1/p2))}, where tables 1 and 2 require db1 and db2 data blocks of storage, respectively.
Other features and advantages will become apparent from the description and claims that follow.