This invention relates to a character reading system in which a character of low quality, portions of which are blurred or collapsed, can be read with high accuracy.
In a character reading system, one of the essential factors affecting the reading accuracy is the print quality of an input character. In the case of a printed character, a portion of the character line may be blurred, collapsed or malformed when the type surface is dirty, the printing pressure is unsatisfactory, or the ink ribbon is irregular in density. In the case of a hand-written character, the character line may be partially blurred, collapsed or malformed when the writing tool is unsatisfactory, the writing speed is variable, or the writer's manner of forming characters is poor. In particular, when characters are written with a ball-point pen, irregular rotation of the ball prevents the ink from flowing out smoothly, and the characters are frequently blurred as a result.
As is apparent from the numeral "0" and the character "C" for instance, in order to read a character, it is most essential to determine how the character lines are separated or connected. Accordingly, the accuracy of reading a character, the character lines of which are blurred or collapsed as described above, is, in general, low.
In order to read a character of low quality, a character reading system has been proposed, as disclosed in IBM J. Res. Develop. 1975, p. 354-363 by M. R. Bartz, in which a threshold value is calculated from the average density level and average line width of a character pattern, and one-dimensional local contrast along the character scanning direction is utilized to convert the input character pattern into binary data, e.g., black and white picture elements. This system is quite effective in reading a printed character with noise formed by splashes of ink and a lower case character, the character lines of which are liable to collapse. However, since the system is provided mainly in order to improve the accuracy of reading printed characters, the system is not so effective in reading hand-written characters in which the density level and the line width are considerably variable.
In order to eliminate the drawbacks accompanying this system, two-dimensional local contrast has been utilized as disclosed in Pattern Recognition, Pergamon Press 1974, Vol. 6, p. 127-135, by J. R. Ullmann. More specifically, when the difference in density level between one picture element and picture elements near this picture element in a two-dimensional plane is larger than a predetermined threshold value, the picture element is regarded as a black picture element; and when the difference is not larger than the threshold value, the picture element is regarded as a white picture element. This system is effective in reading a character, the lines of which are relatively low in density or a portion of which is relatively high in density. However, the system is disadvantageous in that when the character line is partially irregular in density, a portion of the character line is determined to consist of white picture elements, or a light stain is considered as black picture elements.
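The two-dimensional local contrast rule described above can be illustrated with a minimal sketch. The details below are assumptions made for illustration and are not taken from the cited reference: the function name `local_contrast_binarize`, the use of the mean of the surrounding picture elements within `radius` as the comparison level, and the convention that a larger density value means a darker picture element.

```python
def local_contrast_binarize(density, threshold, radius=1):
    """Binarize a 2-D grid of density levels by local contrast.

    Assumption: a picture element is black (1) when its density exceeds
    the mean density of its neighbours by more than `threshold`;
    otherwise it is white (0). Larger density means darker.
    """
    rows, cols = len(density), len(density[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0, 0
            # Visit the neighbourhood, clipped at the image border.
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    if (rr, cc) != (r, c):
                        total += density[rr][cc]
                        count += 1
            neighbour_mean = total / count if count else density[r][c]
            if density[r][c] - neighbour_mean > threshold:
                out[r][c] = 1
    return out


# A faint vertical line (density 40) on a light background (density 10).
faint_line = [[10, 10, 40, 10, 10] for _ in range(3)]
line_result = local_contrast_binarize(faint_line, threshold=10)

# A single light stain (density 30) on the same background.
stain = [[10, 10, 10], [10, 30, 10], [10, 10, 10]]
stain_result = local_contrast_binarize(stain, threshold=10)
```

Under these assumptions the faint line is recovered even though its absolute density is low, which is the advantage noted above. The isolated stain is also marked black, which is the drawback noted above.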
In another known character reading system, as disclosed in British Pat. No. 1,263,467 to Watanabe, a threshold value of high density level and a threshold value of low density level are provided, and when a picture element is connected with a picture element which is determined to be a black picture element by binary-coding with the high threshold value, and is itself determined to be a black picture element using the low threshold value, it is determined to be a black picture element. This system is advantageous in that it can readily eliminate a localized noise element spaced from the character and it is effective in reading a character the lines of which are blurred. However, the system is disadvantageous in that it cannot eliminate a noise area connected to the character, and it is not effective in reading a character, a character line of which is collapsed. In this system, the setting of the low threshold value is difficult. If the low threshold value is increased, the system is not effective in reading a character having blurred line portions. If, on the other hand, the low threshold value is decreased, the system is not effective in reading characters having collapsed line portions. Especially in a character of low quality, the grey level of collapsed line portions is, in general, higher than that of blurred line portions, and accordingly, it is impossible for the system to correctly convert both these portions into black and white picture elements.
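The two-threshold scheme described above might be sketched as follows. This is an illustrative reading and not the implementation of the cited patent: the function name `two_threshold_binarize`, the use of 8-connectivity, the propagation of blackness through chains of picture elements that pass the low threshold, and the use of `>=` in the comparisons are all assumptions.

```python
from collections import deque


def two_threshold_binarize(density, low, high):
    """Binarize with a high and a low threshold (assumes low <= high).

    Assumption: picture elements at or above `high` are black seeds.
    An element at or above `low` becomes black only if it is
    8-connected, directly or through other such elements, to a seed.
    """
    rows, cols = len(density), len(density[0])
    black = [[0] * cols for _ in range(rows)]
    queue = deque()
    # Seed with elements that pass the high threshold.
    for r in range(rows):
        for c in range(cols):
            if density[r][c] >= high:
                black[r][c] = 1
                queue.append((r, c))
    # Grow the black region through elements that pass the low threshold.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and not black[rr][cc]
                        and density[rr][cc] >= low):
                    black[rr][cc] = 1
                    queue.append((rr, cc))
    return black


# A strong stroke (90) trailing into a blurred portion (40), a gap (5),
# and an isolated noise element (40) spaced from the character.
row = [[90, 40, 40, 5, 40]]
result = two_threshold_binarize(row, low=30, high=80)
```

In this sketch the blurred portion adjoining the strong stroke is kept, while the isolated noise element is discarded because it is not connected to any seed. A noise element of the same density that touches the stroke would be kept, which corresponds to the drawback described above.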
The above-described conventional systems each have at least one drawback. What they have in common is that the fact that a character is formed of character lines is reflected either not at all or only insufficiently. As was described above, character information is essentially conveyed by the connection and separation of character lines, and the widths of the character lines are not essential in reading the character. Accordingly, a character reading system should determine blurred line portions to be black picture elements and collapsed line portions to be white picture elements, to thereby correctly detect the connection of the character lines. For this purpose, the system should utilize the fact that a character pattern, unlike an ordinary figure, is a special pattern made up of character lines.