Various embodiments of this disclosure relate to fast selection and, more particularly, to selecting a subset of data using a hardware or software implementation.
There are many circumstances in data manipulation where one might want to select a top subset or a bottom subset of a particular set of data. Typical approaches to this problem are based on adaptations of sorting algorithms. For example, the data can be iteratively divided at its median until the desired number of top data elements have been grouped together. Alternatively, instead of dividing the data at its median, a random value may be chosen as the dividing point, which avoids the time taken to locate the median but provides less predictable divisions. In the worst case, these methods can be performed, respectively, in order-N time and order-N² time.
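For illustration only, the random-value variant described above might be sketched in Python as follows. This is a minimal sketch of a conventional partition-based selection, not an embodiment of this disclosure; the function name `top_k` and its parameters are hypothetical, and the data elements are assumed to be mutually comparable.

```python
import random


def top_k(data, k):
    """Return the k largest elements of data, in no particular order.

    Hypothetical helper: repeatedly divides the data around a randomly
    chosen value, keeping only the part that still contains undecided
    elements.
    """
    items = list(data)
    if k <= 0:
        return []
    if k >= len(items):
        return items

    selected = []  # elements already known to belong to the top subset
    while True:
        pivot = random.choice(items)  # random dividing value, not the median
        greater = [x for x in items if x > pivot]
        equal = [x for x in items if x == pivot]
        less = [x for x in items if x < pivot]

        if k <= len(greater):
            # The whole top subset lies among the larger elements.
            items = greater
        elif k <= len(greater) + len(equal):
            # The larger elements plus some copies of the pivot complete it.
            return selected + greater + equal[: k - len(greater)]
        else:
            # Everything at or above the pivot is selected; continue below it.
            selected += greater + equal
            k -= len(greater) + len(equal)
            items = less


print(sorted(top_k([5, 1, 9, 3, 7, 9, 2], 3)))
```

Each pass discards at least the chosen value, so the loop terminates, but an unlucky sequence of dividing values can shrink the data by only one element per pass, which is the source of the order-N² worst case.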
Algorithms that run in under order-N time generally require the data to be stored in binary trees or heaps, which is often impractical. Moreover, the stated sub-order-N time does not account for the upfront cost of building the required data structures. These upfront costs can be prohibitive with large data sets.
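The split between upfront cost and per-selection cost can be illustrated with a short Python sketch using the standard `heapq` module. The function names `build_max_heap` and `pop_top_k` are hypothetical, and the sketch assumes numeric data so that values can be negated to turn the module's min-heap into a max-heap.

```python
import heapq


def build_max_heap(data):
    """Upfront step: visits every element once to build the heap."""
    heap = [-x for x in data]  # negate so the smallest stored value is the largest original
    heapq.heapify(heap)        # linear-time heap construction
    return heap


def pop_top_k(heap, k):
    """Selection step: each pop costs logarithmic time in the heap size."""
    count = max(0, min(k, len(heap)))
    return [-heapq.heappop(heap) for _ in range(count)]


heap = build_max_heap([5, 1, 9, 3, 7])
print(pop_top_k(heap, 2))
```

The selection step alone touches only a small part of the heap, but it is available only after the construction step has processed the entire data set.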
The above approaches are generally not well suited to hardware implementations, and they make no effort to accommodate the limitations of modern processor memory architectures.