The present invention relates to a letter pitch detection system which automatically detects letter pitch so that aligned letter images on a paper surface can be segmented individually.
In order to recognize a series of letters in an optical character recognition (OCR) system, it is necessary to separate the letters from each other. Information necessary for separating the letters includes the letter pitch. The letter pitch may be known in advance if the size or kind of printed matter being read by the OCR system is known. However, the range of documents handled by OCR systems has widened in recent years to include postal matter and documents of indefinite letter pitch, so that it is often impossible to know the letter pitch in advance. This makes it necessary to estimate the letter pitch from the aligned letter images on the paper surface.
In estimating letter pitch according to the prior art, information regarding the width of one letter image, for example the mean pitch between letters, is extracted from the paper surface. However, when the widths of the letters differ according to the font or category of the letters, as with printed English letters, separating the letters produces an error that depends on the difference between the mean letter pitch and the actual letter pitch. Therefore, when aligned letter images containing several letters touching or merging with each other are separated by the use of the mean letter pitch, the number of merged letters may be misjudged, or the letters may be separated along inaccurate dividing lines.
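The failure mode described above can be illustrated with a minimal sketch. It is not taken from any particular prior-art system: the function names `mean_pitch` and `split_merged`, the use of left-edge coordinates, and the sample numbers are all assumptions made for illustration.

```python
def mean_pitch(left_edges):
    """Mean distance between consecutive left edges of isolated letter images.

    Hypothetical helper: assumes the left edges are given in reading order.
    """
    gaps = [b - a for a, b in zip(left_edges, left_edges[1:])]
    return sum(gaps) / len(gaps)


def split_merged(start, width, pitch):
    """Estimate how many letters a merged image holds and where to cut it.

    Hypothetical helper: the count is the image width divided by the pitch,
    rounded, and the cuts are spaced evenly across the merged image.
    """
    count = max(1, round(width / pitch))
    cuts = [start + k * width / count for k in range(1, count)]
    return count, cuts


# Assumed sample line: narrow and wide letters alternate, so the pitches
# between neighbours are 10, 20, 10, 20 and the mean pitch is 15.
pitch = mean_pitch([0, 10, 30, 40, 60])

# A merged image that actually holds three narrow letters (true pitch 10,
# total width 30) is judged to hold only two letters when the mean is used.
count, cuts = split_merged(start=100, width=30, pitch=pitch)
print(pitch, count, cuts)
```

In this assumed example the mean pitch overestimates the pitch of the narrow letters, so the merged image is divided into two parts by a single line that does not coincide with either true letter boundary.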