Clients may back up data files to a data recovery system. Data files may be merge sorted as they are read from the client machines, at either the client end or the server end. As data are sorted, it is critical to reduce the memory utilization of the backup programs so as to reduce failure rates caused by insufficient resources such as memory.
Merge sort is an efficient, general-purpose, comparison-based sorting algorithm. It is a divide-and-conquer algorithm. Conceptually, a merge sort works by dividing an unsorted list into N sublists, each containing one element, and repeatedly merging the sublists to produce new sorted sublists until only one sublist remains. In sorting N objects, merge sort has an average and worst-case performance of O(N log N).
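The conceptual description above can be illustrated with a minimal sketch in Python. It is only one possible rendering under stated assumptions: the function names `merge` and `merge_sort` are hypothetical, the input is assumed to fit in memory as a list, and elements are assumed to be comparable with `<=`.

```python
def merge(left, right):
    """Merge two already-sorted lists into one new sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Taking from the left on ties keeps the sort stable.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Sort by starting from N one-element sublists and merging pairwise."""
    sublists = [[x] for x in items]  # N sublists, one element each
    if not sublists:
        return []
    while len(sublists) > 1:
        next_round = []
        for k in range(0, len(sublists), 2):
            if k + 1 < len(sublists):
                next_round.append(merge(sublists[k], sublists[k + 1]))
            else:
                # Odd sublist out: carry it into the next round unchanged.
                next_round.append(sublists[k])
        sublists = next_round
    return sublists[0]
```

Each pass of the `while` loop roughly halves the number of sublists, so there are about log N passes, and each pass touches all N elements, which is consistent with the O(N log N) bound.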
One drawback of merge sort, when implemented on arrays, is its O(N) memory requirement: an auxiliary array of size N is needed to sort N elements. For example, to merge sort an array of 100 elements, each of size 1 MB, an auxiliary array of 100 MB, that is, an additional 100 MB memory allocation, is required. If an array of 100 elements has elements of size 100 MB each, an auxiliary array of 10 GB, or an additional 10 GB memory allocation, is required. A need has arisen to reduce the memory allocation for a merge sort operation in order to reduce backup failure rates due to insufficient memory resources.
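The arithmetic behind these figures can be checked with a short Python sketch. The helper name `auxiliary_bytes` is hypothetical, and decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes) are assumed so that the totals match the round numbers given in the text.

```python
MB = 10**6  # assumed decimal megabyte
GB = 10**9  # assumed decimal gigabyte


def auxiliary_bytes(num_elements, element_size_bytes):
    """Size of an auxiliary array holding a full copy of every element."""
    return num_elements * element_size_bytes


# 100 elements of 1 MB each need a 100 MB auxiliary array.
small_case = auxiliary_bytes(100, 1 * MB)

# 100 elements of 100 MB each need a 10 GB auxiliary array.
large_case = auxiliary_bytes(100, 100 * MB)
```

The auxiliary allocation grows linearly in both the element count and the element size, which is why large data files make the extra copy a likely cause of out-of-memory failures.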