The field of the invention covers a method and apparatus for extension of collation functions of a sorting program to records having collating characteristics not recognized by the sorting program. More particularly, this invention provides an improved sorting program for performing SORT/MERGE or INCLUDE/OMIT functions on input strings of records possessing collating characteristics foreign to the sorting program.
In the prior art, sorting programs are understood to be software entities executable upon computing systems for the purpose of selectively collating input record strings to produce output record strings. In this regard, collating is taken in its broad sense to include the arrangement or assembly of records according to an orderly system. For example, input lists of alphanumeric records having a random order are collated by a sorting program to produce alphanumerically ordered output lists. Such output lists might include all of the input records in alphabetical order in a single output file, a single alphabetized list produced by combining a plurality of input lists, or a filtered list which includes only records meeting certain comparison conditions. An example of the latter is a file of San Diego, Calif. addresses extracted from a mailing list embracing all of the United States.
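By way of illustration only, the three kinds of output list described above can be sketched in Python. The record layout, the sample data, and the function names (sort_records, merge_records, include_records) are assumptions made for this sketch and are not drawn from any sorting program named herein.

```python
import heapq

# Hypothetical mailing-list records: (name, city, state).
list_a = [("Ortiz", "San Diego", "CA"), ("Adams", "Boston", "MA")]
list_b = [("Young", "Austin", "TX"), ("Baker", "San Diego", "CA")]


def by_name(record):
    """Control field used for collation: the name."""
    return record[0]


def sort_records(records):
    """Arrange one input list in alphabetical order of the control field."""
    return sorted(records, key=by_name)


def merge_records(*sorted_lists):
    """Combine several already-sorted lists into a single sorted list."""
    return list(heapq.merge(*sorted_lists, key=by_name))


def include_records(records, predicate):
    """Keep only the records that satisfy a comparison condition."""
    return [r for r in records if predicate(r)]


sorted_a = sort_records(list_a)
sorted_b = sort_records(list_b)
merged = merge_records(sorted_a, sorted_b)
san_diego = include_records(merged, lambda r: r[1] == "San Diego")
```

In this sketch, an omit-style filter would be the same operation with the comparison condition negated.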
In the context of the invention to be described below, the collating functions of a sorting program include those functions characteristically associated with sort program control statements such as SORT, MERGE, INCLUDE, and OMIT. Implementations of these sorting program collating functions can be understood with reference to descriptions of identically-named verbs in the IBM sorting program DFSORT, as described in DFSORT (Data Facility Sort) Program Product No. 5740-SM1 Application Programming Guide: R8.0, Order No. SC33-4035.
The use, operation, and application environment of the DFSORT program are also described in detail in U.S. Pat. No. 4,587,628 of Archer et al., assigned to the assignee of this application. The Archer et al. patent is incorporated in whole by reference into this application.
As is known, collating functions performed by sorting programs are executed against records having collating characteristics that are recognized by the sorting program. In this regard, EBCDIC and ASCII coding are among the most widely understood systems of collating characteristics. Indeed, IBM's DFSORT program performs its collating functions on data records having control fields containing EBCDIC or ASCII code characters. This provides the DFSORT program with the ability to perform collating functions on input strings of records comprising characters drawn from the English alphabet and the decimal and binary numbering systems. However, EBCDIC- or ASCII-based sorting programs do not recognize written characters drawn from non-English alphabets. Thus, without adaptation or extension of its collating facilities, an EBCDIC- or ASCII-based sorting program cannot collate an input string of data elements formed from, for example, Kanji characters. Manifestly, the ability to process input lists of records drawn from a variety of national alphabets would substantially increase the market appeal and considerably heighten the usefulness of sorting programs.
One successful adaptation of a sorting program to the collation of non-EBCDIC data elements is the OS/VS SORT/MERGE program-Kanji/Chinese, described in IBM Publication No. SH18-0016-0, "OS/VS SORT/MERGE PROGRAM-KANJI/CHINESE," RPQ Reference No. 7F0094 Program Description and Operation Manual. Hereinafter this program is referred to as SORT/MERGE-Kanji/Chinese (SMKC). The SMKC program performs input processing on non-EBCDIC record lists, invokes a sorting program such as the IBM OS/VS SORT/MERGE program or the DFSORT program to collate the processed records, and then performs output processing on the records to provide output files. The SMKC program, therefore, treats the DFSORT sorting program as an adjunct process, callable when needed. However, this program imposes a substantial overhead burden on the facilities that support its execution, results in a proliferation of resources to support collation of record sets with divergent collating characteristics, and fails to take advantage of the superior input/output processing capability of the DFSORT sorting program.
The prior art SMKC program provides for modification of input string data elements through the translation of Japanese-language collating characteristics to counterpart binary data characteristics, which enables the called OS/VS SORT/MERGE sorting program to permute or combine input strings of Japanese-language characters. This translation is accomplished by means of reference to a variety of collating sequence tables which map Japanese-language collating characteristics into binary data counterparts. However, in the prior art SMKC program, no provision is made for altering control statements to forms executable by a sorting program such as OS/VS SORT/MERGE or DFSORT.
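The table-driven translation described above can be pictured, in general terms, with the following Python sketch. It is not the SMKC implementation. The table contents, the 16-bit weight format, and the names COLLATING_TABLE, translate_key, and collate are all hypothetical, and the ordering assigned to the three characters is arbitrary and chosen only for illustration.

```python
# Hypothetical collating sequence table: character -> 16-bit binary weight.
# The ordering is arbitrary and does not reflect any real Japanese collation.
COLLATING_TABLE = {"川": 1, "田": 2, "山": 3}


def translate_key(text, table=COLLATING_TABLE):
    """Map each character of a control field to its binary counterpart."""
    return b"".join(table[ch].to_bytes(2, "big") for ch in text)


def collate(records, table=COLLATING_TABLE):
    """Input processing, binary sort, and output processing in sequence."""
    # Input processing: attach a translated binary key to each record.
    keyed = [(translate_key(r, table), r) for r in records]
    # Stand-in for the called sorting program: plain byte-wise comparison.
    keyed.sort(key=lambda pair: pair[0])
    # Output processing: strip the keys and return the original records.
    return [r for _, r in keyed]


names = ["山田", "田川", "川田"]
ordered = collate(names)
```

Under this assumed table the records are ordered by table weight rather than by their character codes, which is the effect the translation step is intended to achieve. Note that only the data is translated in this sketch; nothing is done to any control statement.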