Databases are used to store information for an innumerable number of applications, including various commercial, industrial, technical, scientific and educational applications. Many databases are relational databases, which organize information into formally-defined tables consisting of rows (i.e., records) and columns (i.e., fields), and which are typically accessed using a standardized language such as Structured Query Language (SQL). Database Management Systems (DBMS's) are the computer programs that are used to access the information stored in the databases and to process searches, or queries, against the databases.
In general, a database query references one or more database tables in a database and includes one or more predicates. Each predicate includes an expression that references a column of a database table and a key value within the column of the database table and an operator (e.g., EQUAL, etc.). Other operators (e.g., AND, etc.) may also be applied to multiple predicates. To execute the query, many DBMS's perform query optimization, in which multiple execution plans or access plans for satisfying the database query are examined to determine the most efficient way to execute the query.
One type of optimization that may be utilized in an access plan includes the use of a database index. Just as an index in a book facilitates locating information on a specific topic quickly and without blindly paging through the book, database indexes provide similar benefits by providing a method to quickly locate data of interest in a database. In particular, the shorter entries of an index may be quickly searched for data (e.g., a key value) that satisfies the database query, and then the corresponding locations in the database table may be searched to retrieve the results. Thus, only a subset of the database table is searched. Without an index, a DBMS performs a full table scan of the database table, blindly searching through every row in the database table until the target data is located. Depending upon where the data resides in the database table, such a table scan can be a lengthy and inefficient process.
A database index is usually built over one or more columns of the database table with specific key values selected for inclusion in the index. The DBMS will then create an index where each specified key value has an entry in the index with a pointer from each key value to its corresponding record in the database table. In many cases indexes include sufficient information about which particular records in a database table likely match a particular predicate without having to retrieve and scan all of the individual records of the database table, thus saving significant time.
One area in which the use of indexes can become complicated is in connection with textual data, and in particular textual data that is case sensitive. When textural information is stored as a key value in a column of a database table it is represented by a binary code that is recognizable and usable by a computing system. There are many different ways to represent information via a binary code otherwise known as character encoding. One of the earliest examples is the Extended Binary Coded Decimal Interchange Code (EBCDIC) based on an 8-bit binary code in which the lowercase English letter “f” is represented by the binary value 10000110 (hexadecimal 86) and the uppercase English letter “F” is represented by the binary value 11000110 (hexadecimal C6). Another early example is the American Standard Code for Information Interchange (ASCII) based on a 7-bit binary code in which the lowercase English letter “f” is represented by the binary value 1100110 (hexadecimal 66) and the uppercase English letter “F” is represented the binary value 1000110 (hexadecimal 46). Other character encoding methods include Unicode and its' variations. In the event a database query is executed and the information to be analyzed may be either a lowercase or an uppercase letter (e.g. denoting a female with either an “f” or an “F”) a database query looking for records containing females may be more efficient if it can recognize either as satisfying the query. This creates a situation where both the lowercase “f” and the uppercase “F” are to be treated the same or given the same weight and are generally referred to as shared weight attributes. Where an index is built over one or more columns of a database table, and there is a desire for either or both of a lowercase and uppercase key value to answer a database query, the index is known as shared weight index. A corresponding sort sequence table is created from the shared weight index that effectively maps the lowercase values to the corresponding uppercase values, or vice versa, allowing both lowercase and uppercase information to have the same weight.
There are different ways a database query may sort through the information contained in the key values. One particular sorting method is based on the information's specific binary or hexadecimal value, referred to as a hexadecimal sort sequence. In the event a database query is executed requiring a hexadecimal sort sequence a shared weight index typically may not be used because an incorrect record could be returned. For example, in a system using EBCDIC where the hexadecimal value is C6 for an uppercase “F” and 86 for a lowercase “f” and a shared index sets them equal to each other (“f”=C6), a database query for records containing only an uppercase “F” utilizing a hexadecimal sort sequence would also return records containing a lowercase “f,” which is an incorrect return. Therefore, an additional, non-shared weight index may be required to enable query optimization.
However, as the reliance on information increases, both the volume of information stored in most databases, as well as the number of users wishing to access that information, likewise increases. Moreover, as the volume of information in a database, and the number of users wishing to access the database increases, the amount of computing resources required to manage such a database increases as well. Thus, new ways to use database indexes are needed in order to continue to provide significant improvements in query performance, since database indexes often require many system resources to build. Otherwise, database users may be hampered in their ability to maximize intelligent information retrieval.
Consequently, there is a need in the art for reuse and alternative uses of existing database indexes.