The present invention relates generally to database sorting methods and systems, and, more particularly, to a multi-column, multi-data type internationalized sort extension method for web applications.
Perhaps one of the most fundamental tasks in the operation of computers is sorting, which refers to the process of arranging a set of similar information into a desired order. While employed in virtually all database programs, sorting routines are also extensively used in many other areas. Common examples include compilers, interpreters, and operating system software. In many instances, the quality and performance of such software is judged by the efficiency of its sorting techniques. Since sorting methodology plays such an important role in the operation of computers and other data processing systems, there has been much interest in seeking ways to improve existing systems and methods.
Historically, techniques for sorting information are divided into three general methods: exchange, selection, and insertion. To sort by exchange, a system swaps or “exchanges” out-of-order information until all data members are ordered. One well-known example of exchange sorting is the “bubble sort,” which implements repeated comparisons and attendant exchanges of adjacent members. The efficiency of a bubble sort is dependent upon the number of possible comparisons (which increases with a greater number of elements to be sorted) and the number of exchanges required by the sort (which increases the more the list to be sorted is out of order). The end result is that the execution time approaches a multiple of the square of the number of elements, making the bubble sort unusable for large sorts.
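By way of illustration only, the bubble sort described above may be sketched as follows. The function name bubble_sort and the two counters are assumptions introduced for this example; the sketch uses the plain form of the technique with no early-exit optimization, so the comparison count depends only on the number of elements while the exchange count depends on how disordered the input is.

```python
def bubble_sort(items):
    """Return a sorted copy of items plus the comparison and exchange counts.

    Hypothetical helper for illustration; plain bubble sort without early exit.
    """
    data = list(items)
    comparisons = 0
    exchanges = 0
    # After each pass the largest remaining member has "bubbled" to position end.
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                # Adjacent members are out of order, so exchange them.
                data[i], data[i + 1] = data[i + 1], data[i]
                exchanges += 1
    return data, comparisons, exchanges


if __name__ == "__main__":
    print(bubble_sort([5, 4, 3, 2, 1]))  # reverse-ordered input
    print(bubble_sort([1, 2, 3, 4, 5]))  # already-ordered input
```

For n elements this form always performs n(n-1)/2 comparisons, which is the source of the quadratic execution time noted above.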
With a selection sort, a system continually chooses or “selects” a data member from one extreme of possible values (e.g., the lowest-value member) until all members have been selected. Because the system always selects the lowest-value member from those remaining, the set will be ordered from lowest to highest-value member when the process is completed. As is the case with a bubble sort, a selection sort is also too slow for processing a large number of items.
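A minimal sketch of the selection sort just described is given below for illustration. The name selection_sort is an assumption of this example, and the sketch selects the lowest-value member on each pass.

```python
def selection_sort(items):
    """Return a sorted copy of items (hypothetical helper for illustration)."""
    data = list(items)
    n = len(data)
    for start in range(n - 1):
        # Select the lowest-value member among those not yet placed.
        lowest = start
        for i in range(start + 1, n):
            if data[i] < data[lowest]:
                lowest = i
        # Move the selected member to the front of the unsorted remainder.
        if lowest != start:
            data[start], data[lowest] = data[lowest], data[start]
    return data


if __name__ == "__main__":
    print(selection_sort([29, 10, 14, 37, 13]))
```

The inner scan visits every remaining member on every pass regardless of the initial order, which is why the technique is slow for a large number of items.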
In a sort by insertion, the system examines a data member and places or inserts that member into a new set of members, always inserting each member in its correct position. The sort is completed once the last member has been inserted. Unlike the other two sorting techniques, the number of comparisons that occur with this technique depends on the initial order of the list. More particularly, the technique possesses “natural” behavior; that is, it does the least work when the list is already sorted and the most work when the list is in inverse order, thus making it useful for lists which are almost in order. Also, the technique does not disturb the order of equal keys. Thus, if a list already sorted on one key is insertion-sorted on a second key, members having equal values of the second key remain in order of the first key, so the list is sorted for both keys.
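The two properties noted above (dependence on initial order, and preservation of the order of equal keys) may be illustrated with the following sketch. The name insertion_sort, its key parameter, and the comparison counter are assumptions introduced for this example.

```python
def insertion_sort(records, key=lambda r: r):
    """Return a sorted copy of records and the number of key comparisons.

    Hypothetical helper for illustration. Members with equal keys keep
    their original relative order.
    """
    data = list(records)
    comparisons = 0
    for i in range(1, len(data)):
        current = data[i]
        j = i - 1
        # Shift larger members one place right until current's position is found.
        while j >= 0:
            comparisons += 1
            if key(data[j]) <= key(current):
                break  # equal keys are not passed over, preserving their order
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = current
    return data, comparisons


if __name__ == "__main__":
    print(insertion_sort([1, 2, 3, 4]))  # already sorted: fewest comparisons
    print(insertion_sort([4, 3, 2, 1]))  # inverse order: most comparisons
    # A list already ordered by name, then sorted by the second field.
    staff = [("adams", 2), ("baker", 1), ("clark", 2), ("davis", 1)]
    print(insertion_sort(staff, key=lambda r: r[1]))
```

In the last call, members sharing the same second field remain in name order, which is the behavior described for lists sorted using two keys.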
A particular concern for any sort method is its speed (i.e., how fast a particular sort completes its task). The speed with which an array of data members can be sorted is directly related to the number of comparisons and the number of exchanges that must be made. Related to the characteristic of speed is the notion of “best case” and “worst case” scenarios. For instance, a sort may have good speed given an average set of data, yet unacceptable speed given highly disordered data.
One technique for reducing the penalty incurred by exchanging full records is to employ a method that operates indirectly on a file, typically using an array of indices, with rearrangement done afterwards. In this manner, any of the above sorting methods may be adapted so that only n “exchanges” of full records are performed. One particular approach is to manipulate an index to the records, accessing the original array only for comparisons. In other words, it is more efficient to sort an index to the records than to incur the cost of moving large records around excessively.
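The indirect approach may be sketched as follows, purely as an illustration. The names sort_index and rearrange are assumptions of this example, and Python's built-in list sort stands in for whichever sorting method is being adapted; only the small integer indices are moved during sorting, and each full record is copied once afterwards.

```python
def sort_index(records, key):
    """Return record positions ordered by key, leaving records untouched.

    Hypothetical helper; the original array is read only for comparisons.
    """
    index = list(range(len(records)))
    index.sort(key=lambda i: key(records[i]))
    return index


def rearrange(records, index):
    """Build the ordered list with one move per full record."""
    return [records[i] for i in index]


if __name__ == "__main__":
    records = [("carol", "x" * 1000), ("alice", "y" * 1000), ("bob", "z" * 1000)]
    order = sort_index(records, key=lambda r: r[0])
    print(order)
    print([r[0] for r in rearrange(records, order)])
```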
Since all of the simple sorting techniques above execute in time proportional to n² (where n is the number of elements to be sorted), their usefulness for sorting files with a large number of records is limited. In other words, as the amount of data to be sorted increases, the execution time of the technique grows quadratically (doubling the amount of data roughly quadruples the time) and, at some point, the technique becomes too slow to use. Thus, there has been great interest in developing improved techniques for sorting information. One of the best known improved sorting techniques is referred to as “quicksort,” invented in 1960. Quicksort's popularity is due in large part to its ease of implementation and general applicability to a variety of situations. Based on the notion of exchange sorting, it adds the additional feature of “partitions.”
With quicksort, a value or “comparand” is selected for partitioning the array into two parts. Those elements having a value greater than or equal to the partition value are stored on one side, while those having a value less than the partition value are stored on the other side. The process is repeated for each remaining part until the array is sorted; thus, the process is essentially recursive. On the other hand, a recursive technique such as quicksort usually requires that a significant amount of stack-based memory be reserved. Moreover, this technique, which is particularly sensitive to long common substrings, exhibits nonlinear behavior.
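One possible form of the partitioning step is sketched below for illustration. The name quicksort and the choice of the last element of each part as the comparand are assumptions of this example; many other comparand selection and partitioning schemes exist.

```python
def quicksort(data, lo=0, hi=None):
    """Sort data in place between positions lo and hi (inclusive).

    Hypothetical sketch: the last element of each part is the comparand.
    """
    if hi is None:
        hi = len(data) - 1
    if lo >= hi:
        return  # a part with zero or one element is already sorted
    comparand = data[hi]
    boundary = lo
    for i in range(lo, hi):
        # Values less than the comparand go to the left side of the boundary;
        # values greater than or equal to it stay on the right side.
        if data[i] < comparand:
            data[i], data[boundary] = data[boundary], data[i]
            boundary += 1
    # Place the comparand between the two parts.
    data[boundary], data[hi] = data[hi], data[boundary]
    # Repeat the process for each remaining part.
    quicksort(data, lo, boundary - 1)
    quicksort(data, boundary + 1, hi)


if __name__ == "__main__":
    values = [3, 6, 1, 8, 2, 9, 2]
    quicksort(values)
    print(values)
```

Each recursive call occupies a stack frame. With this comparand choice, an already-sorted input produces one part of nearly full size at every level, so the recursion depth grows with the number of elements, which illustrates the stack-memory concern noted above.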
Notwithstanding the wide variety of sorting techniques available today, many existing web-based applications are dependent upon back-end systems to perform the actual sorting operations. Although such sorting operations may include sorting of multiple columns, sorting of multiple data types, and even sorting by national locale, there is currently no convenient way of performing front-end, multi-column sorting, multi-data type sorting and/or sorting based on user characteristics (i.e., by locale).
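To make concrete the kind of operation at issue, and not as a description of the method of the present invention, a multi-column sort over columns of different data types might be sketched as follows. The name multi_column_sort, the row layout, and the column names are hypothetical. The sketch relies on the stability of Python's built-in sort, applying one pass per column from the least significant to the most significant; locale-specific collation is not shown.

```python
from datetime import date


def multi_column_sort(rows, columns):
    """Return rows ordered by several columns.

    Hypothetical helper. columns is a list of (column_name, descending)
    pairs, given from most significant to least significant.
    """
    result = list(rows)
    # Sort on the least significant column first; because each pass is
    # stable, earlier passes survive as tie-breakers for later ones.
    for name, descending in reversed(columns):
        result.sort(key=lambda row: row[name], reverse=descending)
    return result


if __name__ == "__main__":
    rows = [
        {"dept": "Sales", "salary": 50, "hired": date(2001, 5, 1)},
        {"dept": "Dev", "salary": 70, "hired": date(2000, 1, 1)},
        {"dept": "Sales", "salary": 60, "hired": date(1999, 3, 1)},
        {"dept": "Dev", "salary": 70, "hired": date(1998, 7, 1)},
    ]
    ordered = multi_column_sort(
        rows, [("dept", False), ("salary", True), ("hired", False)]
    )
    for row in ordered:
        print(row["dept"], row["salary"], row["hired"])
```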