Databases typically store data in a number of data structures such as database tables. Retrieving data from such database tables involves querying the table to find the correct row and returning the correct data. Simply traversing the table rows until the correct one is identified, however, can be a rather time-consuming and processor-intensive prospect. Assuming the sought row is equally likely to be at any position, the average number of rows that would have to be searched to find the correct row of the table equals (n+1)/2, wherein n is the number of rows. Thus, for a table having, say, 100 rows, an average of 50.5 rows must be examined before the correct row is reached. In some cases many more than the average number of rows must be examined. For instance, in the extreme case, the maximum number of rows that must be examined in order to find the correct row is equal to the actual number of rows, n. Querying a table by stepping through each of the rows in order, therefore, results in unacceptable latency for tables that are frequently queried or for tables that are exceptionally large.
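By way of illustration only, the (n+1)/2 figure can be checked with a minimal Python sketch. The function names linear_scan and average_rows_examined are hypothetical and are introduced here solely for this example; the sketch assumes each row holds a distinct key and that every row is equally likely to be the one sought.

```python
def linear_scan(rows, key):
    """Step through rows in order; return (position or None, rows examined)."""
    for examined, row in enumerate(rows, start=1):
        if row == key:
            return examined - 1, examined
    # Key not present: every row was examined.
    return None, len(rows)


def average_rows_examined(n):
    """Average rows examined when each of the n rows is sought exactly once."""
    rows = list(range(n))  # assumption: one distinct key per row
    total = sum(linear_scan(rows, key)[1] for key in rows)
    return total / n


if __name__ == "__main__":
    n = 100
    print(average_rows_examined(n))        # average case, (n + 1) / 2
    print(linear_scan(list(range(n)), n - 1)[1])  # worst case, n rows
```

For the 100-row table discussed above, the sketch reproduces the average of 50.5 rows and the worst case of 100 rows.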
To combat this latency issue, large and/or frequently accessed tables can be indexed. Indexing is a way of copying certain information contained in the table and ordering it in a way that facilitates faster searching of the table. While this comes at the cost of additional data storage and additional write operations, it can greatly decrease the time required to find data in a database table. By way of example, a column of the table could be indexed by organizing the elements of that column in a known order (e.g., ascending or descending values) or in a particular data structure (e.g., a b-tree). Such indexing can greatly increase the speed with which the correct row is identified. For instance, for an ordered index searched with a binary search, the average number of rows that must be examined is approximately log2(n)−1 and the maximum number of rows that must be examined is approximately log2(n). Thus, simply creating an index that orders the 100-row table discussed above and performing a binary search using the index decreases the average number of rows that must be examined from 50.5 to approximately log2(100)−1=5.64, with a maximum of approximately log2(100)=6.64. Indexing can, therefore, be a powerful tool to reduce latency in database queries.
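By way of illustration only, an ordered index and a binary search over it might be sketched in Python as follows. The names build_index and index_lookup, the dictionary-based table layout, and the "id" column are assumptions made for this example and do not describe any particular database system; the sketch also assumes the indexed column holds distinct values.

```python
def build_index(table, column):
    """Copy (column value, row position) pairs out of the table and sort them by value."""
    return sorted((row[column], pos) for pos, row in enumerate(table))


def index_lookup(index, value):
    """Binary-search the ordered index; return (row position or None, entries examined)."""
    lo, hi = 0, len(index) - 1
    examined = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        examined += 1
        key, pos = index[mid]
        if key == value:
            return pos, examined
        if key < value:
            lo = mid + 1   # sought value lies in the upper half
        else:
            hi = mid - 1   # sought value lies in the lower half
    return None, examined


if __name__ == "__main__":
    # Hypothetical 100-row table whose "id" values are stored out of order.
    table = [{"id": (i * 37) % 100, "payload": f"row-{i}"} for i in range(100)]
    index = build_index(table, "id")
    counts = [index_lookup(index, row["id"])[1] for row in table]
    print(max(counts), sum(counts) / len(counts))
```

Because a lookup examines a whole number of index entries, the worst case for the 100-row table is 7 entries, i.e., log2(100)=6.64 rounded up. The sorted copy is the additional storage noted above, and it must be updated on every write to the indexed column.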
The creation of an index for a database table or a column of a database table, however, is sometimes no small feat. For instance, in a large database (e.g., of the type frequently used in modern commercial applications), such an index can take many hours to create. To ensure that the index is accurate, the table being indexed must typically be locked to writers during the creation of the index, which necessitates making the database or database table unavailable to some client applications or customers during that time. However, lengthy downtimes are less and less acceptable. Accordingly, it would be desirable to be able to create an index such that the time that the database table is unavailable is minimized.
In the drawings, like reference numbers generally indicate identical or similar elements. Additionally, generally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.