1. Field of the Invention
This invention relates generally to multi-set lookup systems and methods and more particularly to a system and method for determining a k order statistic across multiple sorted sets.
2. Description of the Related Art
Many applications require finding a median of a number of values, or the value associated with a particular rank in a union of sets, as sorting and ranking are common tasks in computer applications. For example, in image processing, finding a median allows an image to be smoothed when a pixel value is in error. Additionally, with enterprise databases it may be desirable to find a median or a k order statistic of the data stored in the database. Linear time algorithms exist for finding a k order statistic in an unsorted set. For example, quickselect, a selection variant of quicksort, uses a divide and conquer approach to find a median value in expected linear time. A random element in the set is chosen and the set is split into the elements smaller and the elements larger than the chosen element. The part containing the desired rank is kept, and that part is recursively divided in the same way. Eventually, just one value remains.
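By way of illustration only, the partition-based selection described above (commonly called quickselect) can be sketched as follows. The function name `quickselect` and the 1-based rank `k` are assumptions made for this sketch and do not appear in the original description.

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element (1-based) of an unsorted sequence.

    Illustrative sketch: expected linear time, worst case quadratic.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    items = list(values)  # work on a copy so the input is left unchanged
    while True:
        pivot = random.choice(items)  # a random element of the set
        smaller = [v for v in items if v < pivot]
        equal = [v for v in items if v == pivot]
        larger = [v for v in items if v > pivot]
        if k <= len(smaller):
            # The desired rank lies among the smaller elements.
            items = smaller
        elif k <= len(smaller) + len(equal):
            # The desired rank falls on the pivot value itself.
            return pivot
        else:
            # Discard the smaller and equal elements and adjust the rank.
            k -= len(smaller) + len(equal)
            items = larger
```

For a set of five values, the median is obtained by calling `quickselect(values, 3)`. Every element is examined at least once during the first partition, which is why the approach cannot do better than linear time.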
Linear time algorithms are the fastest possible for a search of an unsorted domain because each element has to be considered at least once. However, where the search domain consists of fully sorted subsets, linear time algorithms still consider each element. Considering each element of a fully sorted subset is inefficient, because the existing order of each subset is not exploited.
FIG. 1 illustrates a merging technique applied to two sorted subsets to determine a location of a desired value. Here, sorted set A 100 and sorted set B 102 are merged to form sorted set 104, which is the union of sorted set A and sorted set B. If the 11th smallest value is desired, then the 11th smallest value of merged, sorted set 104 is the value of 20 in cell 106. The merging technique, however, still requires a linear number of comparisons between elements of sorted sets A 100 and B 102. The advantage is that the result is a fully sorted set, allowing subsequent searches by rank to be done in constant time. The disadvantage is that maintaining a merged set of the two ordered subsets consumes memory resources.
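The merging technique described above can be sketched as follows, again for illustration only. The function name `kth_by_merging` and the sample values are hypothetical and are not the values shown in FIG. 1.

```python
from heapq import merge


def kth_by_merging(set_a, set_b, k):
    """Return the k-th smallest value (1-based) in the union of two sorted lists.

    Illustrative sketch of the prior-art approach: the two sorted inputs are
    merged into one sorted list, which takes a linear number of comparisons
    and extra memory proportional to the combined size.
    """
    merged = list(merge(set_a, set_b))  # fully sorted union, kept in memory
    if not 1 <= k <= len(merged):
        raise ValueError("k must be between 1 and the combined size")
    return merged[k - 1]  # a lookup by rank is constant time once merged


# Hypothetical sample data, not taken from FIG. 1.
sorted_a = [1, 4, 9]
sorted_b = [2, 3, 10, 12]
fifth_smallest = kth_by_merging(sorted_a, sorted_b, 5)
```

The sketch makes both costs visible: the call to `merge` touches every element of both inputs, and the list `merged` duplicates the data already held in the two sorted sets.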
As a result, there is a need to solve the problems of the prior art to provide a method and apparatus for efficiently determining a k order statistic in a union of a plurality of sorted sets without merging the ordered subsets.