1. Field of the Invention
The present invention relates generally to improved systems and methods for sorting data. More particularly, the present invention relates to systems and methods for sorting data objects, wherein references to the objects are sorted rather than the objects themselves to reduce the time required for copying and swapping data during the sorting process. Further, a divide-and-conquer sorting method that includes multiple pivot elements is provided.
2. Relevant Background
Perhaps one of the most fundamental tasks in the operation of computers is sorting, i.e., the process of arranging a set of similar information into a desired order. While employed in virtually all database programs, sorting routines or algorithms are also extensively used in many other areas. Common examples include compilers, interpreters, and operating system software. In many instances, the quality and performance of such software is determined by the efficiency of its sorting techniques. Since sorting methodology often plays such an important role in the operation of computers and other data processing systems, there has been a great deal of interest in seeking ways to improve existing systems and methods.
To analyze a sorting algorithm, the amount of resources (such as time and storage) necessary to execute it is examined. Most algorithms are designed to work with inputs of arbitrary length. Usually the efficiency or complexity of a sorting algorithm is stated as a function relating the input length to the number of steps (time complexity) or storage locations (space complexity). Generally, sorting algorithm analysis is an important part of a broader computational complexity theory, which provides theoretical estimates for the resources needed by any algorithm that solves a given computational problem. These estimates provide insight into reasonable directions of research for efficient algorithms.
In theoretical analysis of algorithms, it is common to estimate their complexity in an asymptotic sense, i.e., to estimate the complexity function for reasonably large lengths of input. The notation for this analysis is generally referred to as “Big O notation.” For instance, a binary search is said to run in a number of steps proportional to a logarithm, or in O(log(n)), colloquially “in logarithmic time.” Usually asymptotic estimates are used because different implementations of the same algorithm may differ to a degree in efficiency. However, the efficiencies of any two “reasonable” implementations of a given algorithm are related by a constant multiplicative factor called a hidden constant.
Exact (not asymptotic) measures of efficiency can sometimes be computed, but they usually require certain assumptions concerning the particular implementation of the algorithm, called a model of computation. A model of computation may be defined in terms of an abstract computer, e.g., a Turing machine, and/or by postulating that certain operations are executed in a unit time. For example, if the sorted set to which we apply a binary search has N elements, and we can guarantee that a single binary lookup can be done in a unit time, then at most log₂ N + 1 time units are needed to return an answer.
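By way of illustration, the binary search bound discussed above can be verified by counting lookups directly. The following Python sketch is illustrative only (the function and variable names are not part of the described systems); it searches a sorted set of N elements while tallying the number of lookups performed, which never exceeds ⌊log₂ N⌋ + 1.

```python
import math

def binary_search(sorted_list, target):
    """Return (index, steps): the position of target (or -1 if absent)
    and the number of lookups performed."""
    lo, hi = 0, len(sorted_list) - 1
    steps = 0
    while lo <= hi:
        steps += 1                      # one "unit time" lookup
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid, steps
        elif sorted_list[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps

data = list(range(1000))                # N = 1000 sorted elements
index, steps = binary_search(data, 777)

# At most floor(log2 N) + 1 lookups are needed for N elements.
assert steps <= math.floor(math.log2(len(data))) + 1
```

Here N = 1000, so the bound guarantees an answer within ⌊log₂ 1000⌋ + 1 = 10 lookups.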
Exact measures of efficiency are useful to programmers who actually implement and use algorithms, because they are more precise and thus enable them to know how much time they can expect to spend in execution. To these programmers, a hidden constant can make all the difference between success and failure for their application.
Informally, a sorting algorithm can be said to exhibit a growth rate on the order of a mathematical function if beyond a certain input size n, the function f(n) times a positive constant provides an upper bound or limit for the run-time of that algorithm. In other words, for a given input size n greater than some n₀ and a constant c, an algorithm can run no slower than c·f(n). This concept is frequently expressed using Big O notation. For example, if the run-time of a sorting algorithm grows quadratically as its input size increases, the sorting algorithm can be said to be of order O(n²).
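The quadratic growth described above can be observed concretely by counting the element comparisons performed by a simple quadratic sort. Selection sort is used here purely as a familiar illustration (it is not the method of the present invention): it performs exactly n(n-1)/2 comparisons, so doubling the input size roughly quadruples the work, the signature of O(n²) behavior.

```python
def selection_sort(items):
    """Sort a copy of items in ascending order, counting comparisons."""
    a = list(items)
    comparisons = 0
    for i in range(len(a)):
        smallest = i
        for j in range(i + 1, len(a)):
            comparisons += 1            # one element comparison
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a, comparisons

for n in (10, 20, 40):
    result, comparisons = selection_sort(range(n, 0, -1))
    # Exactly n(n-1)/2 comparisons regardless of input order: O(n^2).
    assert comparisons == n * (n - 1) // 2
    assert result == sorted(range(n, 0, -1))
```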
In addition to the number of operations required for a particular sorting algorithm, another factor that can significantly increase the time required to sort a set of data is the amount of data that must be copied (or swapped) to different memory locations during the sorting algorithm. As can be appreciated, when data that is relatively large (i.e., complex) is sorted using an algorithm that involves a large amount of copying and swapping, the time requirements can be overly burdensome. As used herein, a “complex object” is generally any grouping of data or object that requires more than a trivial amount of memory to store (e.g., requires more than the amount of memory needed to store an integer). Thus, sorting algorithms that involve extensive copying and swapping of the data to be sorted will take longer to execute when sorting complex objects than when sorting “simple objects,” such as integers. The time difference will generally depend on the actual size of the complex objects to be sorted.
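The general idea of sorting references rather than the objects themselves can be sketched as follows. This Python fragment is a minimal illustration under assumed data (the "key" and "payload" field names are hypothetical): small integer indices are sorted by each record's key, so the large payloads are never copied or swapped during the sort.

```python
# Each "complex object" is simulated by a record carrying a large payload.
records = [
    {"key": 3, "payload": "x" * 10_000},
    {"key": 1, "payload": "y" * 10_000},
    {"key": 2, "payload": "z" * 10_000},
]

# Sort small integer references (indices) by each record's key; only
# the indices move during sorting, not the 10,000-character payloads.
order = sorted(range(len(records)), key=lambda i: records[i]["key"])

# Materialize the sorted view only once, after sorting completes.
sorted_records = [records[i] for i in order]
assert [r["key"] for r in sorted_records] == [1, 2, 3]
```

Swapping two indices costs the same regardless of payload size, whereas swapping the records themselves would grow more expensive as the objects grow larger.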
Therefore, there remains a need for systems and methods that facilitate the efficient sorting of complex data objects. Preferably, such systems and methods would provide a sorting algorithm that is capable of sorting the complex objects faster than previously known systems and methods.