The present invention generally relates to sorting apparatuses, and more particularly to a sorting apparatus for sorting data in a predetermined sequence depending on a key of the data. For example, the sorting apparatus is suited for use in a character recognition apparatus when candidate character codes must be sorted at a high speed by taking as the key a distance from a dictionary, that is, the distance between a feature pattern of the candidate character code and a registered feature pattern in the dictionary.
In a kanji (Chinese character) optical character reader (OCR), a feature is extracted from the character image to be recognized, and a recognition result (candidate) is determined by matching the extracted feature with a standard feature dictionary. When the extracted feature is matched with each template (the feature pattern of each character in the dictionary), the candidates must be sorted with reference to a key, which in this case is the distance.
An example of sorting will be described with reference to FIGS. 1A and 1B. It is assumed that ten character codes "A" through "J" of the candidates are already sorted with reference to the key, which is the distance, as shown in FIG. 1A. In this state, when a character code "K" of a candidate is matched with the template and the distance is "80" (this will be indicated as (K, 80)), this character code "K" is inserted between the character codes "C" and "D" as shown in FIG. 1B.
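The insertion described above can be illustrated with a minimal software sketch. Only the entry (K, 80) and its position between "C" and "D" come from the description; the distances assigned to "A" through "J", the name `insert_candidate`, and the truncation of the list to `MAX_CANDIDATES` entries are assumptions made for illustration.

```python
import bisect

MAX_CANDIDATES = 10  # assumed limit on the number of candidates kept


def insert_candidate(candidates, code, distance):
    """Insert (code, distance) into a list kept sorted by ascending distance.

    Returns the position at which the new entry was inserted.
    """
    keys = [d for _, d in candidates]
    pos = bisect.bisect_right(keys, distance)  # first entry with a larger distance
    candidates.insert(pos, (code, distance))
    del candidates[MAX_CANDIDATES:]  # assumed: drop entries beyond the limit
    return pos


# Hypothetical distances, chosen so that 80 falls between "C" and "D".
candidates = [("A", 50), ("B", 60), ("C", 70), ("D", 90), ("E", 100),
              ("F", 110), ("G", 120), ("H", 130), ("I", 140), ("J", 150)]

position = insert_candidate(candidates, "K", 80)
print([code for code, _ in candidates])
```

With these assumed values, "K" lands at index 3, directly after "C", and the entry with the largest distance falls off the end of the list.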
On the other hand, there are cases where a plurality of templates are prepared for the same character code in order to improve the recognition rate, for example, to cope with multiple fonts. In such cases, measures must be taken so that the same character code is not included among the candidates more than once. An example of sorting in such cases will be described with reference to FIGS. 2A and 2B. It is assumed that ten character codes "A" through "J" of the candidates are already sorted with reference to the key, which is the distance, as shown in FIG. 2A. In this state, when a character code "G" of a candidate is matched with the template and the distance is "80" (this will be indicated as (G, 80)), the same character code "G" is already included in the character codes of the candidates. Hence, the new entry (G, 80) is inserted between the character codes "C" and "D", and the entry (G, 120), which has the larger distance, is deleted from the candidates, as shown in FIG. 2B.
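The handling of a repeated character code can be sketched in the same way. The entries (G, 80) and (G, 120) come from the description; the remaining distances and the name `register_candidate` are hypothetical. The sketch also assumes that a new entry is simply ignored when the registered entry for the same code already has the smaller or equal distance, a case the description does not spell out.

```python
def register_candidate(candidates, code, distance, limit=10):
    """Insert (code, distance) while keeping at most one entry per code.

    `candidates` is a list of (code, distance) sorted by ascending distance.
    """
    # Look for an entry that already holds the same character code.
    for i, (existing_code, existing_distance) in enumerate(candidates):
        if existing_code == code:
            if existing_distance <= distance:
                return candidates  # assumed: keep the better registered entry
            del candidates[i]      # delete the entry with the larger distance
            break

    # Find the first entry whose distance exceeds the new distance.
    pos = 0
    while pos < len(candidates) and candidates[pos][1] <= distance:
        pos += 1
    candidates.insert(pos, (code, distance))
    del candidates[limit:]  # assumed: never keep more than `limit` entries
    return candidates


# Hypothetical distances, except (G, 120), which is taken from the description.
candidates = [("A", 50), ("B", 60), ("C", 70), ("D", 90), ("E", 100),
              ("F", 110), ("G", 120), ("H", 130), ("I", 140), ("J", 150)]

register_candidate(candidates, "G", 80)
print(candidates)
```

After the call, "G" appears once, between "C" and "D", with the distance 80, and the list still holds ten entries.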
In the kanji OCR, the number of candidates output for post-processing is generally ten, and for this reason the character codes of ten candidates are shown in FIGS. 1A through 2B. Normally, a kanji code such as a kanji (or graphic) shift code is used as the character code, but the character codes are denoted by letters of the alphabet in FIGS. 1A through 2B for the sake of convenience.
Conventionally, the above described sorting is carried out by software processing using a central processing unit (CPU) and a memory. However, it is necessary to compare the distance of the character code of each new candidate with the distances of the registered candidates to determine the inserting position of the character code, and thereafter to shift the positions of the registered candidates whose distances are greater than that of the inserted character code. The shifting of the registered candidates involves repeated reading and writing of the registered character codes, and it is difficult to shift the registered candidates at a high speed. Furthermore, a similar operation is required to shift the registered candidates when eliminating the double registration of the same character code.
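The cost of the shifting step can be made concrete with a minimal sketch that models the memory as two fixed-size parallel arrays and counts the entries that must be moved. The array layout, the name `insert_with_shift`, and all distances other than the inserted value 80 are assumptions for illustration and do not reproduce any particular conventional implementation.

```python
def insert_with_shift(codes, dists, code, dist):
    """Insert into fixed-size parallel arrays sorted by ascending distance.

    Returns the number of registered entries that had to be moved, each move
    being one read and one write per array.
    """
    n = len(dists)

    # Step 1: compare against the registered distances to find the position.
    pos = 0
    while pos < n and dists[pos] <= dist:
        pos += 1
    if pos == n:
        return 0  # worse than every registered candidate, so it is discarded

    # Step 2: shift every entry from `pos` onward down by one, starting at
    # the tail; the last entry is overwritten and therefore dropped.
    moves = 0
    for i in range(n - 1, pos, -1):
        codes[i] = codes[i - 1]
        dists[i] = dists[i - 1]
        moves += 1

    # Step 3: write the new entry into the freed position.
    codes[pos] = code
    dists[pos] = dist
    return moves


codes = list("ABCDEFGHIJ")
dists = [50, 60, 70, 90, 100, 110, 120, 130, 140, 150]  # hypothetical values

moved = insert_with_shift(codes, dists, "K", 80)
print(moved, codes)
```

With these assumed values, six registered entries are moved to make room for a single insertion, and the number of moves grows as the inserting position approaches the head of the list.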
On the other hand, a sorting system has been proposed which writes the character code into the memory by taking the value of the distance as an index of the memory address. However, this proposed sorting system suffers from a problem in that a considerably large memory capacity is required, depending on the maximum value of the distance which is considered valid for a candidate. In addition, this proposed sorting system is unsuited for realization on a single semiconductor chip. Moreover, an independent process must be carried out to exclude the double registration of the same character code.
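A rough sketch of a distance-indexed memory shows where the memory requirement and the extra duplicate handling come from. The constant `MAX_VALID_DISTANCE`, the function names, the use of one list per address to hold codes with equal distances, and the assumption of non-negative integer distances are all illustrative choices, not details of the proposed system.

```python
MAX_VALID_DISTANCE = 255  # hypothetical largest distance accepted for a candidate


def make_table():
    """Allocate one slot per possible distance value."""
    return [[] for _ in range(MAX_VALID_DISTANCE + 1)]


def write_candidate(table, code, distance):
    """Write the code at the address given by its distance; no shifting needed."""
    if 0 <= distance <= MAX_VALID_DISTANCE:
        table[distance].append(code)


def read_sorted(table, limit=10):
    """Scan the addresses in ascending order to read the candidates back.

    Excluding a code that was written at several addresses needs its own
    bookkeeping (the `seen` set), separate from the write step.
    """
    result = []
    seen = set()
    for distance, codes in enumerate(table):
        for code in codes:
            if code in seen:
                continue
            seen.add(code)
            result.append((code, distance))
            if len(result) == limit:
                return result
    return result


table = make_table()
for code, distance in [("G", 120), ("C", 70), ("G", 80), ("A", 50)]:
    write_candidate(table, code, distance)

print(len(table), read_sorted(table))
```

The table needs one slot for every distance value up to the maximum, regardless of how few candidates are actually registered, and the duplicate "G" is only filtered out during the separate read pass.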
Furthermore, a pipeline merge sorter, a bitonic sorter and the like, which carry out the sorting by hardware, have been proposed mainly for application to database systems. However, such systems suffer from a problem in that the circuit construction becomes too complex for the purpose of sorting the candidates in a character recognition apparatus, for example.