There have been known OCR (optical character reader) functions that optically read hand-written or printed characters and compare the read characters with previously stored patterns so as to identify the read characters.
Character strings to be read may include touching characters, each of which touches an adjacent character, and isolated characters, each of which stands alone.
Typical OCR functions include a function that determines whether a pattern is a touching pattern candidate, which may contain characters touching one another, or an isolated character pattern candidate, which is highly likely to be a single character. This determination is made in order to increase recognition efficiency and to avoid extraction errors caused by an increase in the number of character extraction candidates.
If it is determined that the pattern is an isolated character pattern candidate, the pattern is directly subjected to one-character recognition.
On the other hand, if it is determined that the pattern is a touching pattern candidate, the pattern is subjected to segmentation (extraction and recognition).
Known methods for determining whether a pattern is a touching pattern candidate include a method of determining that a target pattern is a touching pattern candidate when its length in the character string direction is equal to or larger than a threshold, and a method of determining that a target pattern is a touching pattern candidate when W/H>K, where W is the width of the pattern, H is the height thereof, and K is a constant.
The above-mentioned methods rely on the fact that a touching pattern is generally longer in the character string direction than an isolated character pattern.
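The two known determinations described above can be sketched as follows. This is a minimal illustration only, assuming horizontal text so that the length in the character string direction equals the width W. The names `Pattern`, `is_touching_by_length`, and `is_touching_by_aspect` are hypothetical and do not come from any particular OCR implementation.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    """Bounding box of a connected pattern (hypothetical helper type)."""
    width: float   # W: extent along the character string direction (horizontal text assumed)
    height: float  # H: extent perpendicular to the character string direction


def is_touching_by_length(p: Pattern, length_threshold: float) -> bool:
    # First method: length in the string direction is equal to or larger than a threshold.
    return p.width >= length_threshold


def is_touching_by_aspect(p: Pattern, k: float) -> bool:
    # Second method: W/H > K, where K is a constant.
    # A non-positive height is treated as "not a candidate" to avoid division by zero.
    return p.height > 0 and p.width / p.height > k
```

A pattern for which neither test holds would be passed directly to one-character recognition, while a pattern flagged by the chosen test would be passed to segmentation.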
However, there are cases where a pattern is a touching pattern but cannot be distinguished from an isolated character pattern on the basis of the length in the character string direction or the width-to-height ratio (W/H). In such a case, the pattern is not determined to be a touching pattern candidate, which causes erroneous character recognition.
If the threshold is lowered to avoid such an erroneous determination, a touching pattern becomes somewhat more likely to be determined to be a touching pattern candidate. However, the number of touching pattern candidates increases accordingly. As a result, the number of character extraction candidates increases, so that an erroneous segmentation result may be selected.
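The trade-off described above can be illustrated with a small sketch. The sample values below are hypothetical and chosen only to show the effect; they are not measurements. The function `evaluate` and the tuple layout (W, H, actually touching) are likewise assumptions made for this illustration.

```python
# Hypothetical patterns: (W, H, actually_touching). Illustrative values only.
SAMPLES = [
    (30, 40, False),
    (34, 40, False),
    (38, 40, True),   # touching pattern that is narrow, with W/H below 1
    (60, 40, True),
    (36, 40, False),
]


def evaluate(samples, k):
    """Apply the W/H > K test and count the outcomes for a given constant K."""
    flagged = [(w, h, t) for (w, h, t) in samples if w / h > k]
    # Touching patterns that the test failed to flag (sent to one-character recognition).
    missed = sum(1 for (w, h, t) in samples if t and not (w / h > k))
    # Isolated patterns that were flagged and must be segmented unnecessarily.
    false_candidates = sum(1 for (_, _, t) in flagged if not t)
    return len(flagged), missed, false_candidates


print(evaluate(SAMPLES, 1.2))  # (1, 1, 0): one touching pattern is missed
print(evaluate(SAMPLES, 0.8))  # (4, 0, 2): nothing missed, but two isolated patterns are flagged
```

With the higher constant, the narrow touching pattern is missed. With the lower constant, it is caught, but isolated patterns are also sent to segmentation, which is where an erroneous segmentation result may be selected.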