The field of the present invention is automated sorting and selection processes. More particularly, the present invention relates to a process for generating and using a binary search tree.
A binary search is a technique for locating an item in an organized arrangement. Typically, items in the arrangement are organized according to the values of the items, for example, in a sequential list. In a binary search, the search constraint, or key, is compared to the value of one of the items in the arrangement. If a larger value is desired, then the search may continue in the portion of the arrangement having larger values, and if a smaller value is desired, then the search may continue in the portion of the arrangement having smaller values. In this way, each comparison reduces the number of potential matches, until the desired value is found.
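By way of illustration only, the binary search described above may be sketched as follows. The sketch assumes that the items are held in a list sorted in ascending order; the function name `binary_search` and the sample values are hypothetical and are chosen solely for this example.

```python
def binary_search(sorted_values, key):
    """Return the index of key in sorted_values, or -1 if key is absent."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        mid = (low + high) // 2
        candidate = sorted_values[mid]
        if candidate == key:
            return mid
        if candidate < key:
            low = mid + 1   # a larger value is desired: continue in the upper portion
        else:
            high = mid - 1  # a smaller value is desired: continue in the lower portion
    return -1


# Hypothetical sample data, sorted in ascending order.
values = [3, 8, 15, 23, 42, 57, 91]
print(binary_search(values, 23))  # prints 3
print(binary_search(values, 10))  # prints -1
```

Each pass through the loop discards the portion of the list that cannot contain the key, so the number of potential matches shrinks with every comparison.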
A binary search tree is a particularly useful arrangement for organizing items. In generating the tree, the value of each item becomes a “node” on the binary search tree. A “node” is a decision point on a tree, and is set at the value of one of the items. Nodes may have up to two branches, with one branch receiving values that are less than the node value, and the other branch receiving values that are larger than the node value. The first value added to a branch becomes a node on that branch. As additional values are added, the branch grows by adding additional nodes and branches, and thereby becomes a subtree. By convention, a node with one or two branches may be referred to as a “parent” node, and the value added to each branch may be referred to as a “child” node. A node with no branches may be referred to as a leaf, and represents the terminal node on a particular subtree.
A binary search tree is arranged by first selecting a root value. Preferably, the root value is near the middle of the range of expected values. The root value becomes the first node on the tree, and also represents the first comparison value for a search. As values are added to the tree, values that are larger than the root value become nodes on a subtree extending from one branch of the root value, while values that are smaller than the root value become nodes on a subtree extending from a second branch of the root value. In a similar manner, each node (except a leaf node) provides a decision point representing one “greater than” path and one “less than” path.
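The arrangement described above may be sketched, under stated assumptions, as follows. The names `Node`, `insert`, and `in_order` are hypothetical, the sample values are illustrative, and the sketch assumes that every value is unique.

```python
class Node:
    """One node of the tree: a value and up to two branches."""

    def __init__(self, value):
        self.value = value
        self.left = None   # branch receiving values less than this node
        self.right = None  # branch receiving values larger than this node


def insert(root, value):
    """Add value to the subtree rooted at root and return that root."""
    if root is None:
        return Node(value)  # the first value added to a branch becomes a node
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    # Equal values are ignored: this sketch assumes unique values.
    return root


def in_order(root):
    """Return the values of the tree in ascending order."""
    if root is None:
        return []
    return in_order(root.left) + [root.value] + in_order(root.right)


# The root value (50) is chosen near the middle of the expected range.
root = None
for v in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, v)

print(in_order(root))
```

In this example, 30 and 70 become the child nodes of the root, and 20, 40, 60, and 80 become leaves.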
Once the tree has been arranged, a search routine is able to efficiently locate a desired value by making a series of simple comparisons at each of the nodes. If the tree is well balanced, each comparison will eliminate about half of the remaining items from consideration. After the search routine has located the desired value, that value is no longer available and is removed from the tree. Accordingly, the tree is rearranged to reflect the removal of the node. In a similar manner, the tree is rearranged as new values are added to the set of available items.
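A minimal sketch of such a search routine is given below. It assumes a balanced tree built from a sorted list of unique values; the names `Node`, `build_balanced`, and `search` are hypothetical. The routine also reports how many nodes were visited, so that the effect of each comparison can be observed. Removal and rearrangement of the tree are not shown.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left    # subtree of smaller values
        self.right = right  # subtree of larger values


def build_balanced(sorted_values):
    """Build a balanced tree by using the middle value as each subtree's root."""
    if not sorted_values:
        return None
    mid = len(sorted_values) // 2
    return Node(
        sorted_values[mid],
        build_balanced(sorted_values[:mid]),
        build_balanced(sorted_values[mid + 1:]),
    )


def search(root, key):
    """Return (found, nodes_visited) for key in the tree rooted at root."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if key == node.value:
            return True, visited
        # Descend one level; the other subtree is eliminated from consideration.
        node = node.left if key < node.value else node.right
    return False, visited


tree = build_balanced([1, 2, 3, 4, 5, 6, 7])
print(search(tree, 4))  # the root itself
print(search(tree, 1))  # a leaf, three levels down
print(search(tree, 8))  # not present
```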
The usefulness and benefits of the binary search tree increase as the number of elements in the tree grows. Typically, in a well-balanced tree, the usual number of comparisons (“m”) is proportional to $\log_2(n)$, where “n” is the number of elements to be arranged. This relationship can be more particularly described by the equation: $m = k \cdot \log_2(n)$, where “k” is a proportionality constant. Since “k” is a constant, its value does not affect the proportional increase in search time as the number of nodes is increased. Accordingly, for purposes of explanation and simplicity, the value of “k” can be assumed to be “1”. The number of comparisons needed to locate a particular element therefore grows only logarithmically with the total number of elements. Take for example a well-balanced binary search tree having 256 elements ($\log_2 256 = 8$). The typical search would include about 8 comparisons, with each comparison representing one level in the height of the tree. If the number of elements is increased to 1024 ($\log_2 1024 = 10$), then the number of comparisons increases only to about 10. As illustrated, the number of comparisons, like the logarithm of the number of elements, increased by only 2 when the number of elements increased by a factor of 4 from 256 to 1024. Since the search time will be directly related to the number of comparisons, search time is also proportional to $\log_2(n)$. It will be understood that in the illustrative examples above, “n” was selected for simplicity to have a whole-number $\log_2$ result, and that “n” may be otherwise provided.
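The arithmetic of the foregoing example may be checked with a short sketch. The function name `estimated_comparisons` is hypothetical, and the default value of “k” is assumed to be 1, as in the explanation above.

```python
import math


def estimated_comparisons(n, k=1.0):
    """Estimate m = k * log2(n) for a well-balanced tree of n elements."""
    return k * math.log2(n)


small = estimated_comparisons(256)
large = estimated_comparisons(1024)
print(small)          # prints 8.0
print(large)          # prints 10.0
print(large - small)  # prints 2.0: four times the elements, two more comparisons
```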
The binary search tree efficiently arranges values, provided each value is unique. However, the known binary search tree is quite inefficient in accounting for duplicate values. Since each node of the tree is configured only to accommodate a “greater than” or “less than” comparison, a more complex process is used to account for duplicate values. For example, known binary search trees may provide pointers that direct the search routine away from the search tree for duplicate values.
Many common applications routinely generate duplicate values, and therefore do not generally benefit from the use of a binary search tree arrangement. If a binary search tree is used, the tree arrangement causes undesirable overhead or inefficiencies. For example, a memory manager generally implements a search routine for locating available memory blocks of at least a desired minimum size. More particularly, the memory search routine attempts to find the smallest available memory block that will accommodate a memory allocation request. In one example, each memory block is defined to be 256 bytes long. In this way, an available block of 2 KB would have a value of 8 blocks, while an available block of 25 KB would have a value of 100 blocks. If the memory manager receives an allocation request for 1.9 KB, then the memory manager would preferably find a memory block with a value of 8, and if 8 is not available, then the smallest value greater than 8.
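The block arithmetic and the “smallest block that fits” selection described above may be sketched as follows. For brevity, the sketch substitutes a sorted list and the standard `bisect` module for a tree. The names `BLOCK_SIZE`, `blocks_needed`, and `best_fit`, and the sample free-block sizes, are hypothetical, and 1 KB is assumed to be 1024 bytes.

```python
import bisect
import math

BLOCK_SIZE = 256  # bytes per block, as in the example above


def blocks_needed(request_bytes):
    """Return the number of 256-byte blocks required to hold request_bytes."""
    return math.ceil(request_bytes / BLOCK_SIZE)


def best_fit(sorted_free_sizes, request_bytes):
    """Return the smallest free size (in blocks) that fits, or None if none does."""
    needed = blocks_needed(request_bytes)
    i = bisect.bisect_left(sorted_free_sizes, needed)
    return sorted_free_sizes[i] if i < len(sorted_free_sizes) else None


# Hypothetical pool of free block sizes; note the duplicated value 8.
free_sizes = [4, 8, 8, 8, 12, 100]

print(blocks_needed(2 * 1024))        # 2 KB
print(blocks_needed(25 * 1024))       # 25 KB
print(best_fit(free_sizes, 1946))     # about 1.9 KB
print(best_fit([4, 12, 100], 1946))   # no block of value 8 is available
```

A request of about 1.9 KB rounds up to 8 blocks, so a free block of value 8 is selected when one exists, and the next larger value (12 in the sample pool) is selected otherwise.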
However, in the typical implementation, it is highly likely that several blocks of value 8 could be available. The search arrangement and routine must account for such duplication, and if a binary search tree is used, provide additional processor and memory resources to handle duplicated values. As memory is released and is returned to the available memory pool, it is also highly likely that the size of released blocks will be duplicative of sizes already in the pool. In practice, there are often several blocks of a particular size, and several sizes with duplications. In this way, the process used to update the binary search tree also must account for duplicative values. With such substantial duplication, the binary search tree, if used in a memory manager application, typically adds an undesirable level of overhead and inefficiency.