In a typical relational database system, users store, update and retrieve information by interacting with user applications (xe2x80x9cclientsxe2x80x9d). The client responds to the user""s interaction by submitting commands to a database application responsible for maintaining the database (a xe2x80x9cdatabase serverxe2x80x9d). The database server responds to the commands by performing the specified actions on the database. To be correctly processed, the commands must comply with the database language that is supported by the database server. One popular database language is known as Structured Query Language (SQL).
Various access methods may be used to retrieve data from a database. The access methods used to retrieve data may significantly affect the speed of the retrieval and the amount of resources consumed during the retrieval process. Many information retrieval applications make use of indexes when performing content-based searches on the database data. Typically, such indexes are implemented using a heap-organized table and a B-tree index. The index is typically built on the columns of the table, each index entry in the index having the form  less than column_value(s), rowid greater than  comprising the columns values which make up the index key, and a rowid which uniquely identifies the row within the table that corresponds to the index entry.
Many applications involve data sets in which individual rows are identified by a primary key. The primary key uniquely identifies each row within the table and may be formed from a single column such as a social security number for employees or may be formed from multiple columns such as an area code plus a seven digit local phone number. For such a table, a primary key index can be used. Entries within the primary key index may have the form  less than primary key value(s), rowid greater than . A primary key index provides good query performance when only indexed columns are referenced in the query, so-called index-only scans. If non-indexed columns are referenced, an index-based scan may be used. In an index-based scan, the primary key index structure is traversed to obtain a rowid which is then used to access the rows within the table structure.
To speed up primary key-based retrieval efforts as well as to avoid duplication of the primary key values, an index-organized table has been introduced which includes not only the indexed column values, but also all the remaining column values of the table in the leaves of the index structure, thus rendering maintenance of a separate table unnecessary. Each row, therefore, includes both key and non-key columns, the non-key columns being stored along with the key columns. As a result, index-organized tables do not incur additional Input/Output (I/O) overhead to access the non-key columns as with a conventional table""s index based scan.
As with a conventional heap organized table/index pair, an index-organized table provides good query performance for data included in the column or columns forming the primary key. To aid in the retrieval efforts of queries based on non-primary key columns, secondary index structures may be provided. Entries within the secondary structure have the form  less than secondary key value, primary key value greater than . For index-only scans, data may be extracted directly from the secondary index structure. For index-based scans, the secondary index structure is first traversed to obtain the corresponding primary key value which is then used in a primary key index scan of the index-organized table to obtain values from other columns which are of interest.
As described above, secondary index-based scans require traversal of the primary key index structure to access the pertinent database row as opposed to a single additional I/O for conventional heap-organized tables. In conventional heap organized table/index pairs, an index entry includes a row identifier (rowid). Typically, the rowid is a physical address of the row, which uniquely identifies the row within the table that corresponds to the entry. As a result, once the entry in the index is identified, the rowid provides the location in the table which includes the data for the remaining columns. As a result, rowids provide the fastest possible access to a row in a given table.
Rows in index-organized tables do not have a permanent address as they are stored in the index leaf blocks in sort order and may move around in order to retain that order. As a result, secondary indexes including traditional rowids are not possible. As a substitute, a secondary index entry can include a logical rowid. The logical rowid may include the primary key value as well as a physical leaf block address. The leaf block address identifies a primary key index leaf block which most likely contains the row.
Accordingly, a particular method for using a secondary index to locate a row in an index-organized table includes traversing the secondary index to locate a secondary index entry that includes a target secondary key value. Each secondary index entry also includes a logical rowid element. In one embodiment, each logical rowid element has both a primary key value element, and a datablock address element identifying a leaf block in the index-organized table that most likely includes a row corresponding to the primary key value element.
The leaf block identified by the logical rowid value of the located secondary index entry is accessed in the index-organized table. The leaf block is scanned to locate the row in the index-organized table that includes the target secondary key value and corresponds with the located, secondary index entry.
A primary key scan is performed on the index-organized table if the target row in the index-organized table is not located in the identified block.
In addition, in at least one embodiment, statistics are gathered which represent an accuracy of the logical rowid value, by calculating an accuracy ratio for the logical rowid. The accuracy ratio represents a complement of a number of times a primary key scan is performed versus a number of times the leaf block in the index-organized table is accessed.
In processing a subsequent database query statement that references the target secondary key value, a determination is made, in response to the statistics, as to whether the database query statement will be serviced using the secondary index with or without a leaf block address, or whether the database query statement will be serviced by performing a primary key scan on the index-organized table.
In another embodiment of the present invention, an in-memory, large object locator in a database management system includes a large object identifier, and a logical rowid. The logical rowid contains a primary key and an address identifying a datablock that likely includes the large object.