It is often desirable to sort a set of data, for example to arrange entries in alphabetical order. Different languages may have different sorting rules and conventions. For example, ö is sorted after z in Swedish, but before z in German. In some cases, variants may also exist within a single language. For example, German typically uses a different sorting order (the “German Phonebook” order) for phonebooks and similar publications than is used in other cases.
Supporting language-specific sorting requirements may be complex, and the complexity grows with each additional language to be supported.
To address this issue, sorting a collection of records in a database may be accomplished by using a sort key. A sort key typically is a string of bytes that encapsulates the sorting order for a string. Different sort key techniques may generate different sort keys. For example, the keys generated by the International Components for Unicode (ICU) software for the word “Töch” are:
4D 43 2B 35 01 85 9D 06 01 8F 08 00 for [Dutch, German]
4D 43 2F 2B 35 01 86 87 07 01 8F 08 00 for [German (Phonebook Sort Order)]
4D 43 36 04 01 85 9D 05 01 8F 07 00 for [Slovak]
4D 5A A3 06 2B 35 01 08 01 8F 07 00 for [Swedish]
The sort key may include sets of weights, separated by a level separator, that indicate how a string should be sorted. For example, the Dutch/German key above includes primary, secondary, and tertiary weights (4D 43 2B 35, 85 9D 06, and 8F 08 00, respectively), separated by the level separator 01.
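The level structure described above can be illustrated with a minimal Python sketch. The helper name `split_levels` is hypothetical and is not part of ICU. The sketch assumes that the byte 01 appears only as a level separator and never inside a weight, and it leaves the trailing 00 in the last group, matching the grouping given in the text.

```python
def split_levels(key_hex: str, separator: int = 0x01) -> list[str]:
    """Split a space-separated hex sort key into one group per level.

    Assumption: the separator byte never occurs inside a weight.
    """
    key = bytes.fromhex(key_hex)            # whitespace between bytes is ignored
    levels = key.split(bytes([separator]))  # cut at every level separator
    return [level.hex(" ").upper() for level in levels]


# Dutch/German key for the example word, copied from the list above.
dutch_german = "4D 43 2B 35 01 85 9D 06 01 8F 08 00"
primary, secondary, tertiary = split_levels(dutch_german)

print("primary:  ", primary)
print("secondary:", secondary)
print("tertiary: ", tertiary)
```

Applied to the other keys listed above, the same split shows that the locales differ mainly in the first (primary) group, which is where the language-specific placement of ö is encoded.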
When the sort keys in a particular language for a set of data items are ordered, they in turn provide the appropriate sort order for the underlying data items. For example, a data set may include the following names and associated sort keys for the English language:
Name            Sort Key
John Smith      A0 19 A9 23
Alice Roberts   8B 9H DD 91
Alice Reynolds  8B 9H 00 C3
Robert Jones    DD 97 9A 4D
When ordered by sort key using conventional English sorting rules (0-9, A-Z), the sorted data set is:
Name            Sort Key
Alice Reynolds  8B 9H 00 C3
Alice Roberts   8B 9H DD 91
John Smith      A0 19 A9 23
Robert Jones    DD 97 9A 4D
Thus, the sort keys provide the appropriate sort order for the associated data, without having to apply additional sorting rules to the data directly.
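The ordering of the example data set can be reproduced with a short Python sketch. The names and keys are copied from the example above. The keys are compared as text tokens rather than as parsed bytes, because the illustrative keys contain a token (9H) that is not a valid hexadecimal value. The helper name `by_sort_key` is hypothetical.

```python
# Example records: (name, sort key), copied from the data set above.
records = [
    ("John Smith", "A0 19 A9 23"),
    ("Alice Roberts", "8B 9H DD 91"),
    ("Alice Reynolds", "8B 9H 00 C3"),
    ("Robert Jones", "DD 97 9A 4D"),
]


def by_sort_key(record):
    """Return the key as a list of tokens, compared position by position.

    In ASCII, digits order before uppercase letters, which matches the
    "0-9, A-Z" rule used in the example.
    """
    _, key = record
    return key.split()


# No language-specific rule is applied to the names themselves.
sorted_records = sorted(records, key=by_sort_key)

for name, key in sorted_records:
    print(f"{name:<16}{key}")
```

The names never take part in the comparison. Only the precomputed keys are compared, which is what allows a database to order records without applying language-specific rules at sort time.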