The invention relates to searching binary tree indexes and in particular to estimating the number of keys in an index key range.
Binary radix trees are based on Knuth: The Art of Computer Programming, Volume 3, "Sorting and Searching", (pp. 471-480) (1973). They have been implemented in the IBM System/38. Their use in the System/38 is described in the book IBM System/38 Technical Developments, pp. 59-63 and 67-70 (1978), and in the System/38 VMC Logic Overviews and Component Descriptions Manual, Sec. 7.1: "Machine Indexes", and Sec. 1.1: "Database" (1978). Binary radix trees are related to B-trees and numerous other trees.
The vast majority of pages (blocks of data paged in and out of main storage) in a binary radix tree index lie at the lowest level of the tree. They are called leaf pages. Searching a range of keys in a binary radix tree index stored in a paging environment is very efficient provided that the pages to be searched are resident in fast access storage. If the pages are not resident, significant time is consumed in retrieving the required leaf pages. Determining the number of keys in a key range has required retrieving large numbers of leaf pages and has therefore been time consuming.
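The cost described above can be illustrated with a minimal sketch. It does not reproduce the System/38 implementation: the index is modeled, as an assumption, by a hypothetical list of sorted key lists (one per leaf page), and the function name is illustrative only. The sketch shows that an exact count touches every leaf page overlapping the range.

```python
from bisect import bisect_left, bisect_right


def count_keys_in_range(leaf_pages, low, high):
    """Return (key_count, pages_fetched) for keys k with low <= k <= high.

    leaf_pages is a hypothetical stand-in for the index: a list of sorted
    key lists, one per leaf page, in ascending key order.
    """
    key_count = 0
    pages_fetched = 0
    for page in leaf_pages:
        # Assumption: the upper levels of the tree let us skip pages that
        # lie wholly outside the range without fetching them.
        if not page or page[-1] < low:
            continue  # page lies entirely below the range
        if page[0] > high:
            break  # this page and all later pages lie above the range
        pages_fetched += 1  # in a paging environment this may be a disk read
        key_count += bisect_right(page, high) - bisect_left(page, low)
    return key_count, pages_fetched


leaf_pages = [[1, 3, 5], [7, 9, 11], [13, 15, 17], [19, 21, 23]]
result = count_keys_in_range(leaf_pages, 4, 20)
print(result)  # (8, 4): eight keys counted, all four leaf pages fetched
```

In this toy model the number of pages fetched grows with the width of the range, which is the expense the passage above attributes to exact counting.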
Two operations which benefit from finding the quantity of keys within a limited range of key values are Query and Join in a relational data base management program. Knowing the number of keys in a range allows a user to select a more efficient order in which to perform the Query and Join operations.
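As a hedged illustration of how such counts might guide ordering, the sketch below sorts candidate indexes by the number of keys in the range of interest. The smallest-range-first heuristic, the index names, and the counts are assumptions made for this example and are not taken from the text.

```python
def order_by_range_size(range_counts):
    """Return index names sorted from fewest to most keys in the range.

    range_counts maps a (hypothetical) index name to the number of keys
    that fall within the range of interest for that index.
    """
    return sorted(range_counts, key=range_counts.get)


# Illustrative counts only; a real system would obtain these per query.
range_counts = {
    "orders_by_date": 120000,
    "customers_by_region": 350,
    "items_by_sku": 4200,
}
plan = order_by_range_size(range_counts)
print(plan)  # ['customers_by_region', 'items_by_sku', 'orders_by_date']
```

Processing the most selective range first is one common heuristic for keeping intermediate results small; other orderings may be preferable depending on the operation.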
One attempt at improving the speed of Join operations is indicated in U.S. Pat. No. 4,497,039 to Katakami et al. Katakami et al. describe a Join operation for generating a new table (data space) linking tuples (rows) of a plurality of pertinent tables based on a common field (column) or plurality of common fields. For each table that is an object of the Join, a minimum extraction range is determined with respect to the Join field or plurality of Join fields, and this range identifies the tuples to be processed. While processing of unwanted data is avoided, all the pages in the range of interest must still be touched to determine the size of the range in a given index. Thus, much time is spent retrieving leaf pages.