Manufactured products are frequently provided with legends that serve to identify the product. Such identifications typically specify particulars such as the article number, the manufacturer and the design variant.
Legends applied with color stamps are often not durable enough, since they easily become illegible due to scratches, rust or foreign colored materials such as lacquer. Coined legends prove more durable and are frequently employed, for example, in the field of automobile manufacture. A distinction is made in coined characters between imaged or struck legends and raised legends.
Although a bar code can be read automatically more easily than such characters, it is seldom used, since the identifications should also be easily readable by persons. There is therefore a great need for a method that recognizes coined characters in automated production processes.
Optical character recognition (OCR) is well known in the prior art of pattern recognition in digital image processing. For example, optical reader equipment is already used in banks and in post offices; see, for example, Ullmann, J.R., Picture Analysis in Character Recognition in Digital Picture Analysis, Edited by A. Rosenfeld, Springer 1976, pages 295-343. Such readers place different requirements on the print image to be read and on the text layout. What they all have in common, however, is that they expect characters that differ noticeably from the picture background in terms of color or brightness.
This prerequisite is generally not met for characters coined in workpieces. Depending on the type of coining, the illumination, the surface material and any contamination of the workpiece, the labelling does not contrast uniformly with the background. Consequently, no binary image is available, only a digital gray tone image, and known methods for optical character recognition therefore cannot be employed.
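The contrast problem can be made concrete with a small numerical sketch. It is an illustration only: the gray values, the scan-line layout and the helper name `separable_by_global_threshold` are assumptions chosen for the example and are not taken from any of the cited methods. The sketch shows that strokes which are always 30 gray levels darker than their local background can still be inseparable by a single global threshold once the illumination varies across the image.

```python
def separable_by_global_threshold(stroke_values, background_values):
    """Return True if a single gray threshold puts every stroke pixel on one
    side and every background pixel on the other."""
    return (max(stroke_values) < min(background_values)
            or min(stroke_values) > max(background_values))


# Hypothetical scan line under uniform illumination.
background_flat = [120] * 6
strokes_flat = [b - 30 for b in background_flat]

# Hypothetical scan line whose illumination rises from left to right.
background_ramp = [60, 80, 100, 120, 140, 160]
strokes_ramp = [b - 30 for b in background_ramp]  # locally darker by 30

print(separable_by_global_threshold(strokes_flat, background_flat))
print(separable_by_global_threshold(strokes_ramp, background_ramp))
```

In the second case the brightest stroke pixel (130) is brighter than the darkest background pixel (60), so any global threshold misclassifies part of the line.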
FIG. 1 shows portions of gray tone images of punched characters that were recorded under different illumination conditions.
Two sub-tasks form the foundation of optical character recognition in gray tone images: first, segmenting, which identifies the locations in the image at which characters are present, and second, classifying or recognition. For example, let a rectangle in which a character is assumed to reside be extracted from the image. A determination must then be made as to which character is present or, where applicable, whether the segmenting was faulty.
Complete systems for character segmenting and recognition in gray tone images are disclosed in the publications of Hongo, Y., Komuro, A., "Stamped Character Apparatus Based on the Bit Matrix Method", Proc. 6th ICPR, Muenchen 1982, pages 448-450; German published application 3 203 897; and Langer, W., "Segmentierung von Schriftzeichen in Grauwertbildern", Degree Thesis, Technical University of Braunschweig Inst. fuer Elektrotechnik, 1988. All three systems employ methods that first convert the gray tone image into a black-and-white picture and then execute the segmenting and recognition.
In the reference of Hongo and Komuro, the original image is converted into a binary representation by establishing a gray tone threshold. The characters are then assumed to have a planar structure with small disrupting areas, and the latter are eliminated by evaluating their size. The method is not suitable for recognizing coined characters without an additional application of color. However, it can be used, for example, for automobile identifications and labelled keyboards. The same is true of the method disclosed by German published application 3 203 897.
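The threshold-then-filter idea described above can be sketched as follows. This is a minimal illustration and not the published implementation: the function names, the use of a 4-connected flood fill to find areas, the threshold of 128 and the `min_size` parameter are all assumptions made for the example.

```python
from collections import deque


def binarize(image, threshold):
    """Return a 0/1 image: 1 where the gray value exceeds the threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in image]


def remove_small_regions(binary, min_size):
    """Clear every 4-connected foreground region with fewer than min_size pixels."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if out[y][x] == 0 or seen[y][x]:
                continue
            # Collect one connected region by breadth-first flood fill.
            region, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and out[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Treat regions below the size limit as disrupting areas.
            if len(region) < min_size:
                for cy, cx in region:
                    out[cy][cx] = 0
    return out


# Hypothetical gray tone image: one planar blob and one isolated bright pixel.
gray = [
    [10, 10, 10, 10, 10, 10],
    [10, 200, 200, 10, 10, 10],
    [10, 200, 200, 10, 180, 10],
    [10, 200, 200, 10, 10, 10],
    [10, 10, 10, 10, 10, 10],
]
cleaned = remove_small_regions(binarize(gray, 128), min_size=3)
for row in cleaned:
    print(row)
```

The sketch only works because the blob is uniformly brighter than the background, which is exactly the condition that coined characters without added color do not satisfy.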
In the reference of Langer, the disclosed method also segments the characters in the binary image; however, it proposes the use of alternative segments in the case of an unclear classification. The binary segmented characters are pre-processed in order to produce planar patterns. A method of CGK (Computer Gesellschaft Konstanz) that recognizes planar binary characters is used for classification. The main drawback of the Langer method is that the parameters and the sub-steps for image pre-processing must be selected depending on the original image. Moreover, the classification by the CGK method requires an exact segmenting of the binary characters in a rectangle. Given the presence of disrupting areas, this cannot be guaranteed in every case.
European Patent 0 217 118 A2 discloses a method for segmenting wafer numbers from gray tone images. According to this method, the original image is first "smoothed" and is then converted into a black-and-white image with reference to a global threshold. Using a single-pass method, the components of the image are coded into simple polygon trains ("descriptors"). Maximum expanses are calculated for every polygon train and small noise elements are eliminated on the basis of the resulting values. An individual character segmenting and a classification are not disclosed. The extremely narrow coining of the wafer numbers facilitates the segmenting significantly since the characters themselves are not subject to any brightness fluctuations (no reflections). Without such specific, prior knowledge about the image, an image smoothing is generally not recommended since features important for the classification can also be lost in addition to any disruptions of the image.
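The caution about smoothing can be illustrated with a toy example. It is a sketch under stated assumptions: a 3x3 mean filter stands in for the unspecified smoothing step, and the image, the gray values and the threshold of 128 are invented for the illustration. A stroke one pixel wide is clearly above the global threshold before smoothing and entirely below it afterwards.

```python
def box_smooth(image):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [image[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def count_above(image, threshold):
    """Number of pixels whose gray value exceeds the global threshold."""
    return sum(v > threshold for row in image for v in row)


# Hypothetical image: a bright vertical stroke, one pixel wide, on a dark background.
stroke = [[200 if x == 2 else 20 for x in range(5)] for _ in range(5)]
smoothed = box_smooth(stroke)

print(count_above(stroke, 128))    # stroke pixels found before smoothing
print(count_above(smoothed, 128))  # stroke pixels found after smoothing
```

Averaging spreads the stroke's brightness over its neighbours, so a fine feature can vanish along with the noise that the smoothing was meant to suppress.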
A method for the classification of characters from "DNA sequences" is disclosed in the publication by Holder, S., Dengler, J., "Font and Size-Invariant Character Recognition with Gray Value Image Features", Proc. 9th ICPR, Rome, 1988. These sequences are composed of the letters "A", "C", "G" and "T" and are usually printed in trade publications in an extremely small type face, with poor printing quality and a variable character set. The method employs the gradient image of the digitized gray tone image for acquiring the features, since a good binarization generally cannot be achieved by establishing a gray tone threshold due to the poor quality of the original image. The gradient directions and their directional changes in the course of the contour of the letters presented are entered into a histogram. The histogram is compared to reference histograms of the four letters that were previously produced on the basis of sample letters from various character sets. The method works in size-invariant fashion and nearly independently of the character set of the original. A method for segmenting is not disclosed. The algorithm used is not suitable for punched characters, since the important information about the original changes of the gradients in the course of the contours of the characters is not reliably present for these characters. For example, an "L" could thus not be discriminated from a "T" merely with reference to the gradient histogram.
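The difficulty with "L" and "T" can be illustrated with a simplified gradient direction histogram. The sketch is not the algorithm of Holder and Dengler: it records only gradient directions (not their changes along the contour), uses central differences and eight direction bins, and operates on hand-made binary toy glyphs. All names and parameters are assumptions made for the example.

```python
import math


def glyph(rows):
    """Build a 0/1 image from text rows, '#' marking stroke pixels."""
    return [[1 if c == "#" else 0 for c in row] for row in rows]


def direction_histogram(image, bins=8):
    """Normalized histogram of gradient directions (central differences)."""
    h, w = len(image), len(image[0])
    step = 2 * math.pi / bins
    hist = [0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            if gx == 0 and gy == 0:
                continue  # flat area, no direction to record
            # Bins are centred on multiples of 45 degrees.
            hist[round(math.atan2(gy, gx) / step) % bins] += 1
    total = sum(hist)
    return [v / total for v in hist] if total else hist


def l1_distance(a, b):
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(p - q) for p, q in zip(a, b))


blank = "........."
glyph_l = glyph([blank, blank, "..#......", "..#......", "..#......",
                 "..#......", "..#####..", blank, blank])
glyph_t = glyph([blank, blank, "..#####..", "....#....", "....#....",
                 "....#....", "....#....", blank, blank])
glyph_i = glyph([blank, blank, "....#....", "....#....", "....#....",
                 "....#....", "....#....", blank, blank])

hist_l = direction_histogram(glyph_l)
hist_t = direction_histogram(glyph_t)
hist_i = direction_histogram(glyph_i)

print(l1_distance(hist_l, hist_t))  # "L" versus "T"
print(l1_distance(hist_l, hist_i))  # "L" versus "I"
```

Because "L" and "T" both consist of one vertical and one horizontal stroke of equal length, their direction histograms differ only at the junction pixels and lie much closer together than those of "L" and "I". A histogram discards where the strokes meet, which is the information needed to tell the two letters apart.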