1. Field of the Invention
The invention relates to a method and apparatus for automatically transforming an image for classification or pattern recognition and in particular to a method of automatically verifying or recognizing handwritten or machine printed text.
2. Description of Related Art
U.S. Pat. No. 4,523,331 discloses a computer algorithm and apparatus for automated image input (including recognition), storage and output (including image generation). Each image is transformed into a unique binary number and then stored as such. Means for processing handwriting and colored images are also disclosed. Image recognition and matching take place by comparing the binary value of the newly received image against all stored images in descending order of difference in binary values. Thus, the computer is able to recognize bad handwriting even when the difference between the ideal or stored samples and the new image is substantial and inconsistent. The computer also stores data about its errors as well as corrections received from the user. For this and other reasons each user has a unique number.
U.S. Pat. No. 4,654,873 discloses a pattern segmentation and recognition system in which handwritten characters are transformed electrically into 2-dimensional image patterns. If ambiguity exists in segmenting a unit pattern including a character from the image patterns, character recognition is not forced; instead, a plurality of possible unit patterns are first established. Then, the various unit patterns are segmented, and each unit pattern is identified as a partial pattern, linked patterns, etc., so that each character is recognized on the basis of an overall judgement, whereby ambiguity of segmentation is resolved.
U.S. Pat. No. 4,972,499 discloses a pattern recognition apparatus which has a contour segmentation unit for dividing an input pattern into segments, a characteristic extraction unit for extracting characteristics of the input segments, and a reference unit for storing characteristic data of reference patterns. The reference unit includes a main reference and a detailed matching reference. The main reference stores partial pattern characteristic data representing the characteristics of segments of each reference pattern. The detailed matching reference stores detailed characteristic data of each reference pattern together with a program for specifying an operation procedure thereof. A matching processor sequentially compares and collates the input pattern with the reference patterns to find the reference pattern that the input pattern matches with the highest similarity. When the input pattern is matched with several reference patterns, a detailed recognition unit performs a detailed recognition process using the detailed characteristic data of these reference patterns to finally select the correct one from among the reference patterns. The main reference additionally stores identification marks to identify specific reference segments necessary to acquire the above detailed characteristic data.
U.S. Pat. No. 4,769,716 discloses an improved method for transmitting facsimiles of scanned symbols. Prototype facsimiles of each symbol in a library are enhanced by averaging the representations of each scanned symbol with a respective previously created prototype facsimile for that symbol. The amount of white space at opposite sides of each symbol prototype is determined. The enhanced prototype facsimile for each scanned symbol is associated with positional parameters denoting the average white space at said opposite sides of each symbol.
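By way of illustration only, the prototype averaging and white-space measurement described above might be sketched in Python as follows. The running-mean update, the bitmap representation as rows of values between 0 and 1, the 0.5 ink threshold, and the names update_prototype and side_margins are all assumptions made for this sketch; they are not taken from U.S. Pat. No. 4,769,716.

```python
def update_prototype(prototype, count, scan):
    """Fold one scanned bitmap into a running-average prototype.

    prototype and scan are equally sized lists of rows with values in [0, 1];
    count is the number of scans already averaged into the prototype.
    (Hypothetical helper: the running mean is an assumed averaging rule.)
    """
    if len(prototype) != len(scan) or any(
        len(p_row) != len(s_row) for p_row, s_row in zip(prototype, scan)
    ):
        raise ValueError("prototype and scan must have the same dimensions")
    averaged = [
        [(p * count + s) / (count + 1) for p, s in zip(p_row, s_row)]
        for p_row, s_row in zip(prototype, scan)
    ]
    return averaged, count + 1


def side_margins(bitmap, threshold=0.5):
    """Count blank columns at the left and right sides of a bitmap.

    A column counts as inked if any pixel in it reaches the assumed threshold.
    """
    width = len(bitmap[0])
    inked = [any(row[c] >= threshold for row in bitmap) for c in range(width)]
    if not any(inked):
        return width, width
    return inked.index(True), inked[::-1].index(True)


prototype = [[0.0, 1.0, 0.0, 0.0],
             [0.0, 1.0, 0.0, 0.0]]
scan = [[0.0, 1.0, 1.0, 0.0],
        [0.0, 1.0, 0.0, 0.0]]
prototype, count = update_prototype(prototype, 1, scan)
left, right = side_margins(prototype)
```

In this sketch each new scan shifts the prototype by a weight of 1/(count + 1), so the stored facsimile stabilizes as more samples of the same symbol are seen.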
U.S. Pat. No. 4,718,103 discloses a system in which a handwritten pattern approximated by a series of polygonal lines consisting of segments is compared with a candidate pattern selected from dictionary patterns stored in the memory, based on the angular variation between adjacent segments of both patterns. If the difference between angular variations of adjoining segments of both patterns is outside of a certain range, it is tested whether the difference between an angular variation across three or more consecutive segments and the above reference angular variation is within the range.
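The basic comparison of angular variations between adjacent segments can be illustrated with a short Python sketch. It covers only the first test (adjacent segments within a tolerance) and omits the fallback test across three or more consecutive segments. The function names, the 20 degree tolerance, and the sample coordinates are assumptions for illustration and do not come from U.S. Pat. No. 4,718,103.

```python
import math


def wrap_degrees(angle):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def turning_angles(points):
    """Signed change of direction, in degrees, between adjacent polyline segments."""
    directions = [
        math.degrees(math.atan2(y1 - y0, x1 - x0))
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]
    return [wrap_degrees(d1 - d0) for d0, d1 in zip(directions, directions[1:])]


def matches(input_points, candidate_points, tolerance_deg=20.0):
    """True if every pair of corresponding turning angles differs by at most the tolerance."""
    a = turning_angles(input_points)
    b = turning_angles(candidate_points)
    if len(a) != len(b):
        return False
    return all(abs(wrap_degrees(x - y)) <= tolerance_deg for x, y in zip(a, b))


dictionary_l = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]   # right-angle corner
written_l = [(0.0, 0.0), (2.0, 0.0), (2.2, 2.0)]      # slightly slanted corner
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]       # no corner at all
```

Because only changes of direction are compared, the sketch is insensitive to the position and size of the written pattern, which is one reason angle-based features suit handwriting.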
U.S. Pat. No. 4,653,107 discloses a system in which coordinates of a handwritten pattern drawn on a tablet are sequentially sampled by a pattern recognition unit to prepare pattern coordinate data. Based on an area encircled by segments created by the sampled pattern coordinate data of one stroke and a line connecting a start point and an end point of the one-stroke coordinate data, the sampled pattern coordinate data of the one stroke is converted to a straight line and/or curved line segments. The converted segments are quantized and normalized. The segments of the normalized input pattern are rearranged so that the input pattern is drawn in a predetermined sequence. Differences between direction angles for the rearranged segments are calculated. Those differences are compared with differences of the direction angles of the dictionary patterns read from a memory to calculate a difference therebetween. The matching of the input pattern and the dictionary pattern is determined in accordance with the difference. If the matching fails, the first or last inputted segment of the input pattern is deleted or the sampled pattern coordinate data of the next stroke is added, to continue the recognition process.
U.S. Pat. No. 4,284,975 discloses a pattern recognition system operating on an on-line basis for handwritten characters, in particular for handwritten Chinese characters, comprising a character input unit for providing the coordinates of a plurality of points on the strokes of a written input character, a classification unit for classifying the input characters into a first group having three or fewer strokes and a second group having four or more strokes, an approximation unit for providing a plurality of feature points to each of the strokes, the number of feature points being six for each stroke in the first group of characters and three for each stroke in the second group of characters, a pattern difference calculator for providing the sum of the lengths between the feature points of the input character and those of the reference characters which are stored in the reference pattern storage, and a minimum difference detector for determining the minimum length among the pattern differences thus calculated. The input character is recognized to be the same as the reference character which provides said minimum length.
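The pattern difference calculation and minimum difference detection described above can be sketched in Python as follows. The use of Euclidean distance, the function names, and the sample feature points are assumptions for illustration; the stroke-count classification and the feature point extraction of U.S. Pat. No. 4,284,975 are not reproduced.

```python
import math


def pattern_difference(input_points, reference_points):
    """Sum of Euclidean distances between corresponding feature points."""
    if len(input_points) != len(reference_points):
        raise ValueError("patterns must have the same number of feature points")
    return sum(math.dist(p, q) for p, q in zip(input_points, reference_points))


def recognize(input_points, references):
    """Return the label of the reference pattern with the minimum difference."""
    return min(
        references,
        key=lambda label: pattern_difference(input_points, references[label]),
    )


# Hypothetical reference patterns with three feature points each.
references = {
    "A": [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)],
    "B": [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)],
}
sample = [(0.1, 0.0), (1.0, 0.9), (2.0, 0.1)]
best = recognize(sample, references)
```

A fixed number of feature points per stroke is what makes the point-by-point sum well defined; patterns with different point counts are rejected here rather than compared.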
U.S. Pat. No. 4,972,496 discloses a keyboardless entry computer system which includes a transparent input screen that generates positional information when contacted by a stylus, and a display screen mounted physically below the input screen such that a character that is displayed can be seen below the input screen. The system includes a computer that has been programmed to compile the positional information into strokes, to calculate stroke characteristics, and then compare the stroke characteristics with those stored in a database in order to recognize the symbol drawn by the stylus. Key features of the system are: (1) transparent position sensing subsystem; (2) underlying display on which to mimic drawing of sensed positions and to show characters or symbols; (3) means to convert sensed positions first into plotted points and then into recognized characters or symbols; and (4) means to "learn" to associate sensed input positions with a character or symbol.
Unpublished European Patent Application No. 92 116 605.4 of International Business Machines Corporation (the assignee of the present invention) shows a handwriting recognition system using a prototype confusability dialog. The subject matter of this patent application is directed to a procedure for interactive editing of prototypes that are close to each other in prototype space and to on-line recognition of handwriting by prototype matching.
From European Published Application 483,391, a method of automatically verifying a signature of an individual and an apparatus for carrying out this method are known. In the reference signature analysis mode one or more reference signatures of an individual are processed for storing sets of reference parameter values. This mode provides the basis for future verifications based on said reference parameter values. In the signature verification mode one present signature of an individual is processed for creating sets of parameter values to be verified. Depending on the stored sets of reference parameter values and the corresponding sets of parameter values to be verified, it is decided if the present signature is true or false with regard to the corresponding reference signature.
Automatic systems purporting to recognize cursive script writing or other types of written or printed text have so far met with only limited success. The reason for that can be traced largely to the lack of robustness exhibited by the templates and the parameters used in the modelling of handwriting. For example, reference is made to U.S. Pat. No. 4,731,857 which describes a three-step procedure for the recognition of run-on handwritten characters. The recognition algorithm is a template matching algorithm based on dynamic programming. Each template is a fully formed character presumably representative of the writer's average way of forming this character, and the elastic matching scores of the current character are computed for each template. This strategy is vulnerable to the extensive variability that can be observed both across writers and across time.
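To make the notion of elastic template matching by dynamic programming concrete, a generic dynamic time warping sketch in Python is given here. It is not the algorithm of U.S. Pat. No. 4,731,857; the one-dimensional feature sequences, the absolute-difference local cost, and the names elastic_distance and best_template are assumptions chosen for illustration.

```python
def elastic_distance(a, b):
    """Dynamic-programming (time warping) distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b element is reused
                cost[i][j - 1],      # b advances, a element is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def best_template(sample, templates):
    """Return the label of the template with the lowest elastic matching score."""
    return min(templates, key=lambda label: elastic_distance(sample, templates[label]))


# Hypothetical templates, one per character class.
templates = {
    "rising": [0, 1, 2, 3],
    "flat": [1, 1, 1, 1],
}
sample = [0, 0, 1, 2, 2, 3]   # a slower rendition of the rising shape
label = best_template(sample, templates)
```

The warping absorbs differences in writing speed, but each class is still represented by a single fixed template, which illustrates why such a scheme is sensitive to variation in character shape across writers and over time.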
Accordingly, the systems of the prior art have significant disadvantages and limitations.