The background description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
All publications herein are incorporated by reference to the same extent as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference. Where a definition or use of a term in an incorporated reference is inconsistent or contrary to the definition of that term provided herein, the definition of that term provided herein applies and the definition of that term in the reference does not apply.
As storage databases grow to ever-increasing sizes, processing and analyzing such data consumes ever more system resources, such as processor time, storage, and bandwidth. Reducing the processing time of a data analysis by even a few milliseconds can mean the difference between a superior and an inferior product, particularly when data is processed a few billion times per day across a large set of clustered nodes. Database analytics, in particular, consume a large amount of computer resources, especially when multiple analytics are performed upon databases larger than a terabyte.
U.S. Pat. No. 5,842,207 to Fujiwara teaches a sorting method used with a distributed database whose key values are divided into a plurality of sections, which are then assigned to a plurality of processors. Fujiwara's system sorts the key values of each section independently of one another. The sorting results are distributively stored in the plurality of processors, and only the information that represents the correspondence between the sections of the key values is transferred to the host processor, eliminating the need for carrying out the merge processing in the host processor. Fujiwara's system, however, needs to repartition and re-sort the distributed database after every update of the database records.
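By way of illustration only, the general idea of sorting by key-range sections can be sketched as follows. This is a minimal single-process sketch of range-partitioned sorting, not Fujiwara's implementation: the function name `range_partition_sort`, the `boundaries` parameter, and the shape of the returned section map are assumptions introduced for this example, and each list stands in for the storage of one processor.

```python
from bisect import bisect_right


def range_partition_sort(keys, boundaries):
    """Sort keys by splitting the key range into sections (hypothetical helper).

    `boundaries` is an assumed, ascending list of split points. Section i
    receives every key k with boundaries[i-1] <= k < boundaries[i].
    """
    # One bucket per section; each bucket models one processor's local store.
    sections = [[] for _ in range(len(boundaries) + 1)]
    for k in keys:
        sections[bisect_right(boundaries, k)].append(k)

    # Each section is sorted independently of the others. In a distributed
    # setting these sorts would run on separate processors.
    sorted_sections = [sorted(s) for s in sections]

    # Only a small section map (section index, record count) is reported to
    # the host. Because the sections cover disjoint, ordered key ranges,
    # reading them in index order yields a globally sorted sequence without
    # a merge step on the host.
    section_map = [(i, len(s)) for i, s in enumerate(sorted_sections)]
    return sorted_sections, section_map


if __name__ == "__main__":
    parts, info = range_partition_sort([15, 3, 27, 10, 8, 20, 1], [10, 20])
    print(parts)
    print(info)
```

Note that in this sketch any insertion that skews the distribution of keys would call for new `boundaries` and a fresh pass over the data, which mirrors the repartitioning cost noted above.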
U.S. Pat. No. 8,959,094 to Taylor teaches a database machine with specialized hardware that accelerates the sort function. Taylor's system provides for an early return of a number of results, which is useful because an entire result set is often not required. For example, for systems that only require the first L results, the system could be configured to return the first K results, where L≤K. Taylor's system, however, fails to provide ways to accurately identify which L results are the pertinent ones that are actually needed.
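As a software analogy only, an early return of the first K results, from which a consumer keeps the first L (with L ≤ K), can be sketched as follows. This is a minimal sketch and not Taylor's hardware implementation: the function names `first_k` and `take_first_l` are assumptions introduced for this example, and the standard-library `heapq.nsmallest` is swapped in for the accelerated sort.

```python
import heapq
from itertools import islice


def first_k(records, k):
    """Return the k smallest records in ascending order (hypothetical helper).

    heapq.nsmallest avoids fully sorting `records` when k is small relative
    to the input, which models an "early return" of a partial result set.
    """
    return heapq.nsmallest(k, records)


def take_first_l(early_results, l):
    """Keep only the first l of the results that were returned early."""
    return list(islice(early_results, l))


if __name__ == "__main__":
    records = [42, 7, 19, 3, 88, 15]
    early = first_k(records, 4)      # system configured to return K = 4
    needed = take_first_l(early, 2)  # consumer requires only L = 2
    print(early, needed)
```

The sketch selects results purely by sort order; it has no notion of which L results are pertinent to the consumer, which is the limitation noted above.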
Thus, there remains a need for a system and method that improves the performance of multiple analytics upon large databases.