Lorin, "Sorting and Sort Systems", Addison-Wesley Publishing Co., copyright 1975, at page 1, defines sorting as a process of arranging items in order. He further points out that while ordering can involve physical ordering, such as the arrangement of a deck of punched cards or of records on a magnetic tape, the output of a sorting operation does not necessarily involve an actual or physical rearrangement. In this regard, order in a file may be represented in other ways, particularly by use of an index. The rearrangement of an index or its equivalent is termed "logical ordering" or "rearrangement". Thus, where the elements to be sorted are contained in a linked list, altering the sort order consists of altering the pointers that define the succession of elements in the list.
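The distinction between physical and logical ordering can be illustrated with a minimal sketch. The records, field names, and helper functions below are hypothetical and are not taken from Lorin; the sketch assumes a non-empty list of records. It builds an index that orders the records by key, then derives successor pointers from that index, leaving the records themselves where they are.

```python
# Hypothetical records; the names and fields are illustrative only.
records = [
    {"key": 42, "name": "delta"},
    {"key": 7, "name": "alpha"},
    {"key": 19, "name": "charlie"},
    {"key": 11, "name": "bravo"},
]


def logical_order(items, key):
    """Return an index (a list of positions) that orders items without moving them."""
    return sorted(range(len(items)), key=lambda i: key(items[i]))


def link_in_order(index):
    """Turn a non-empty index into linked-list pointers.

    Returns the head position and a mapping from each position to its successor.
    """
    next_of = {i: j for i, j in zip(index, index[1:])}
    next_of[index[-1]] = None  # the last element has no successor
    return index[0], next_of


index = logical_order(records, key=lambda r: r["key"])
head, next_of = link_in_order(index)

# Walk the pointers to visit the records in key order.
visited = []
pos = head
while pos is not None:
    visited.append(records[pos]["key"])
    pos = next_of[pos]
```

Re-sorting on a different field would change only `index` and `next_of`; the `records` list keeps its original physical arrangement.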
Modern data processing machines comprise an instruction processor coupled to a hierarchically organized, least recently used (LRU) managed, staged storage system containing software and data. The fastest, most rapidly accessed memory, closest to the instruction processor, is placed at the top of the hierarchy, while progressively slower forms of memory, which hold the bulk of the information, occupy the lower positions within the hierarchy. Because memory costs increase dramatically with speed, many computer systems divide the physical memory subsystem into a number of performance levels. Some of these levels, such as DASD and tape, have traditionally been treated as I/O devices, while other levels, such as RAM and cache, have been treated directly by system hardware as main memory. The term "primary storage" or "internal storage" specifies system memory that can be randomly addressed for single read or write transfers. "Secondary" or "external storage" refers to storage that is not randomly addressable and either is too slow for direct access or must be accessed in fixed-size blocks.
A cache is a memory with an access time considerably faster than the access times of the other forms of randomly accessed memory constituting primary or internal store. Because referencing these forms of memory is managed by the system, the existence of a cache is transparent to the application software. Data is usually brought into the cache in lines that contain the referenced data, and a line resides in the cache until it is overlaid with another line. The movement of data in and out of the cache is commonly supported by a hardware implementation of a least recently used (LRU) algorithm. The cache operates on the principle that certain memory locations tend to be accessed often. When a main memory location is read, its contents are stored in the cache at the same time. Further read references to this location are automatically routed to the cache. A write access usually writes to both main memory and the cache. Since a cache may represent many noncontiguous main memory locations, content-addressable registers are used to determine when a main memory location is currently duplicated in the cache.
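The line-based, LRU-managed behavior described above can be modeled in a few lines of software. This is only a sketch under stated assumptions: the class name `LRUCacheModel`, the line size, the number of lines, and the address trace are all hypothetical, and an ordered dictionary lookup stands in for the content-addressable registers that a hardware cache would use. Only reads are modeled.

```python
from collections import OrderedDict


class LRUCacheModel:
    """Toy model of a cache that holds whole lines and evicts the least recently used."""

    def __init__(self, num_lines, line_size):
        self.num_lines = num_lines
        self.line_size = line_size
        self.lines = OrderedDict()  # line number -> True, least recently used first
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Record a read of `address`; return True on a hit, False on a miss."""
        line = address // self.line_size  # the line containing the referenced data
        if line in self.lines:
            self.hits += 1
            self.lines.move_to_end(line)  # mark as most recently used
            return True
        self.misses += 1
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)  # overlay the least recently used line
        self.lines[line] = True
        return False


cache = LRUCacheModel(num_lines=2, line_size=4)
trace = [0, 1, 2, 8, 0, 16, 8, 1]  # hypothetical sequence of read addresses
results = [cache.access(address) for address in trace]
```

In this trace, addresses 0, 1, and 2 fall in the same line, so the second and third reads hit. The later read of address 8 misses because its line was overlaid when address 16 was brought in.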
Traditionally, sort methods are classified into internal methods and external methods. An internal method is one that can be applied with acceptable performance only to those lists of data that can be completely contained within the primary storage of the processor. An external method is one that reasonably applies to files of data that are too large to fit into the primary store and must therefore rely on external bulk storage devices, such as tape or DASD, during the sorting process. In the process of external sorting, parts of a file are read into primary storage, ordered internally, and then rewritten on the external devices. This process may occur a number of times. In contrast, internal methods are used to rearrange the data developed from pass to pass. Restated, most external sorting methods make a first pass through the file to be sorted, breaking it up into blocks about the size of internal memory, and sort each of these blocks. The sorted blocks are then merged by making several passes through the file, creating successively larger sorted blocks until the whole file is sorted. For example, if there is an unordered list of n keys and an internal memory capacity of m << n words, then the initial sort pass produces about n/m sorted blocks. If a p-way merge is performed on each subsequent pass, then about log (base p) (n/m) merge passes may be required.
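The sort-then-merge structure of an external sort can be sketched as follows. This is an in-memory simulation under stated assumptions, not a definitive implementation: Python lists stand in for blocks on tape or DASD, the function names are hypothetical, and the standard library's `heapq.merge` is used for the p-way merge. The symbols n, m, and p follow the text above.

```python
import heapq


def make_runs(keys, m):
    """Initial pass: cut the input into blocks of at most m keys and sort each block."""
    return [sorted(keys[i:i + m]) for i in range(0, len(keys), m)]


def merge_pass(runs, p):
    """One merge pass: combine each group of up to p consecutive runs into one run."""
    return [list(heapq.merge(*runs[i:i + p])) for i in range(0, len(runs), p)]


def external_sort(keys, m, p):
    """Return the sorted keys and the number of merge passes that were needed."""
    runs = make_runs(keys, m)
    passes = 0
    while len(runs) > 1:
        runs = merge_pass(runs, p)
        passes += 1
    return (runs[0] if runs else []), passes


keys = [(i * 37) % 101 for i in range(100)]  # n = 100 distinct, unordered keys
initial_runs = make_runs(keys, m=5)          # n/m = 20 sorted blocks
sorted_keys, merge_passes = external_sort(keys, m=5, p=3)
```

With n = 100, m = 5, and p = 3, the initial pass yields 20 sorted blocks, and the merge passes reduce them to 7, then 3, then 1. That is 3 merge passes, matching log (base 3) of 20 rounded up.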