Over the past several years, people have sought to solve larger and more complex problems. This has prompted the development of faster and more powerful computers. These more powerful computers include those which utilize a plurality of processors working in conjunction with one another, and are thus called multi-processor computers.
These multi-processor computers can be classed as either tightly-coupled (i.e., shared memory) or loosely-coupled (i.e., unshared memory) computers. As the names imply, shared memory computers are those in which all of the processors share the same memory, whereas in unshared memory computers each processor has its own memory which it does not share with the other processors.
The different types of multi-processor computers are each particularly well suited to certain types of applications. Unshared memory computers usually work well in applications that allow data to be separated into blocks that reside exclusively in the memories of the individual processors and do not have to be brought together during execution of the application. Transferring data between processors is usually costly in these types of computers, and thus applications written for them attempt to limit the communication required. Shared memory computers, however, work well with applications that require repeated access to all of the data by all of the processors.
The increase in speed which is possible from these multi-processor machines cannot be realized without appropriate, efficient software to take advantage of the parallel architecture. Thus, more efficient software is also a vital part of increasing the speed of problem solving. However, developing applications for a parallel processing environment is quite different from, and usually more complex than, developing applications for the more traditional, single processor computer environments. Yet to utilize a multi-processor computer to its fullest, efficient applications are a necessity.
A key aspect of many large and complex computer applications is sorting. Thus, the faster a computer environment can perform sorting, the more quickly a particular problem can be solved. Examples of applications which rely heavily on sorting are those involving the manipulation and analysis of large databases. Both hierarchical and relational databases use sorting to organize database records and indices into the databases. As these databases continue to increase in size, the speed of sorting will become an even more significant component in the overall efficiency of an application.
In a typical application, it is advantageous to first create several sorted lists from the records in the database(s) being used. In a multi-processing environment, this can be done by allowing each processor to sort some portion of the database in parallel with the other processors. Even in a single processor environment, however, it is still advantageous to create several sorted lists as a first step toward sorting the entire database. These lists then need to be merged together to form a final, sorted list.
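The two steps described above (create several sorted lists, then merge them) can be illustrated with a minimal single-processor sketch. The function names `make_sorted_runs` and `merge_runs`, the integer records, and the choice of contiguous chunks are assumptions made purely for illustration; the merge uses Python's standard `heapq.merge` and is not the scheme of any particular prior-art system.

```python
import heapq


def make_sorted_runs(records, run_count):
    """Split records into up to run_count contiguous chunks and sort each one."""
    # Ceiling division gives the chunk size; max() guards against empty input.
    size = max(1, -(-len(records) // run_count))
    return [sorted(records[i:i + size]) for i in range(0, len(records), size)]


def merge_runs(runs):
    """Merge already-sorted lists into one final sorted list."""
    # heapq.merge repeatedly takes the smallest head element among the runs.
    return list(heapq.merge(*runs))


records = [42, 7, 19, 3, 88, 61, 5, 23, 14]
runs = make_sorted_runs(records, 3)
merged = merge_runs(runs)
```

Each run is sorted independently, so in a multi-processing environment the first step could be handed to separate processors, while the merge step is where the lists must be brought together.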
Historically, applications involving large amounts of data have used large, single processor machines to process (e.g., sort and merge) the data due to storage and processing requirements. However, the use of multi-processing has resulted in a new emphasis on parallel applications for sorting and merging which take advantage of these new multi-processing architectures and at the same time provide good performance.
The applications typically take two forms depending upon the type of multi-processing computer used. For an unshared memory computer, the sort phase is easily parallelized by just sorting the data that exists on each processor locally. The merge phase, however, is more difficult because data records that will end up next to each other in the final output list are spread across all of the processors. Determining how the records are to be broken up to form this final output list is a difficult task, and causes unshared memory computers to perform poorly in merge operations.
For a shared memory computer, the data is accessible to all processors equally. The sort phase of the overall sorting process can then be parallelized by assigning roughly equal portions of the records to each of N processors regardless of storage location. Each of these portions is called a task. The end result is N sorted tasks in the shared memory which now must be merged to create a single sorted list. Since these tasks comprise portions of several sorted lists, a task actually comprises a plurality of sorted sub-lists. These sub-lists need to be merged to form a single, sorted task.
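As a rough sketch of the shared memory sort phase described above, the records can be divided into N roughly equal tasks, each task sorted concurrently, and the N sorted tasks then merged. The helper names `split_into_tasks`, `parallel_sort`, and `merge_tasks` are hypothetical, and a thread pool is used here only as a stand-in for N processors sharing one memory; in CPython, threads share memory but may not run the sorts truly in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


def split_into_tasks(records, n):
    """Divide records into n tasks whose sizes differ by at most one."""
    base, extra = divmod(len(records), n)
    tasks, start = [], 0
    for i in range(n):
        # The first `extra` tasks each take one additional record.
        end = start + base + (1 if i < extra else 0)
        tasks.append(records[start:end])
        start = end
    return tasks


def parallel_sort(records, n):
    """Sort each of the n tasks concurrently; all workers see the same memory."""
    tasks = split_into_tasks(records, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(sorted, tasks))


def merge_tasks(sorted_tasks):
    """Merge the n sorted tasks into a single sorted list."""
    return list(heapq.merge(*sorted_tasks))


data = [9, 1, 8, 2, 7, 3, 6, 4, 5, 0]
sorted_tasks = parallel_sort(data, 3)
result = merge_tasks(sorted_tasks)
```

The final merge in this sketch runs on a single worker and assumes everything fits in memory, which is the simplification that a practical shared memory scheme has to avoid.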
One scheme for using a shared memory computer to sort records is described by Iyer et al. in the article "An Efficient Multiprocessor Merge Algorithm," (P. J. Varman, B. R. Iyer, D. J. Haderle, PARABASE 90 International Conferences On Database, Parallel Architectures and their Applications, Miami Beach, Fla., March 1990, IEEE Computer Society Press, Cat. No. 90CH2728-4, pp. 276-83). However, the scheme of Iyer et al. requires an excessive amount of access to the storage device which contains the records when the volume of records is too large to fit into shared memory. Thus, Iyer et al. do not efficiently account for the situation where the database being used is too large to fit into shared memory.
Thus, what is needed is a scheme for using a shared memory computer to sort a plurality of sorted lists which can efficiently handle a situation where the database is too large to fit into the available shared memory.