Optical character recognition devices read text from a printed page and produce an output representative of the text in an appropriate format, e.g. ASCII code. Two basic techniques are available in optical character recognition: one is based on matrix matching and the other on feature extraction. Matrix matching involves comparing a bit-mapped image of a character with a set of templates. A major disadvantage of matrix matching is that it is limited to certain fonts and is highly sensitive to character spacing. This is a particular drawback in relation to proportionally spaced characters.
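By way of illustration only, matrix matching as described above may be sketched as follows. The 5x5 templates, the pixel-agreement score and the function names are assumptions made for this example and are not taken from any particular prior system.

```python
# Hypothetical 5x5 bitmaps: '#' is ink, '.' is background.
TEMPLATES = {
    "I": ["..#..", "..#..", "..#..", "..#..", "..#.."],
    "L": ["#....", "#....", "#....", "#....", "#####"],
    "T": ["#####", "..#..", "..#..", "..#..", "..#.."],
}


def match_score(image, template):
    """Count the pixel positions at which image and template agree."""
    return sum(
        a == b
        for img_row, tpl_row in zip(image, template)
        for a, b in zip(img_row, tpl_row)
    )


def classify(image, templates=TEMPLATES):
    """Return the label of the template with the highest agreement."""
    return max(templates, key=lambda label: match_score(image, templates[label]))


def shift_right(image, n=1):
    """Shift a bitmap n pixels to the right, padding with background."""
    return ["." * n + row[:-n] for row in image]


# A 'T' with one stray ink pixel is still matched to the 'T' template.
noisy_t = ["#####", "..#..", "..#..", ".##..", "..#.."]
print(classify(noisy_t))

# The same 'I' displaced by a single pixel loses much of its agreement
# with its own template, which illustrates the sensitivity to spacing.
print(match_score(TEMPLATES["I"], TEMPLATES["I"]))
print(match_score(shift_right(TEMPLATES["I"]), TEMPLATES["I"]))
```

In this sketch the comparison is made pixel by pixel at fixed positions, so a character that is displaced, or drawn in a font for which no template is stored, agrees poorly with every template.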
Feature extraction involves recognition of selected features of characters and comparison of those features with stored versions of the features on which the system has been trained. Techniques based on feature extraction have the advantage that they are better able to read many different fonts. However, known feature extraction methods are complex to implement and operate slowly.
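The general idea of feature extraction may likewise be sketched under stated assumptions. The feature chosen here (the fraction of ink falling in each quadrant of the bitmap), the nearest-neighbour comparison and all names are hypothetical; they are far simpler than the features used in practical systems and are not the method of the present invention.

```python
def quadrant_features(image):
    """Return the fraction of ink pixels falling in each quadrant."""
    height, width = len(image), len(image[0])
    counts = [0, 0, 0, 0]  # top-left, top-right, bottom-left, bottom-right
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            if pixel == "#":
                counts[2 * (2 * r // height) + (2 * c // width)] += 1
    total = sum(counts) or 1  # avoid division by zero on a blank image
    return [n / total for n in counts]


def train(samples):
    """Store one feature vector per labelled training bitmap."""
    return {label: quadrant_features(bitmap) for label, bitmap in samples.items()}


def classify_by_features(image, trained):
    """Return the label whose stored features are closest to the image's."""
    features = quadrant_features(image)

    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, trained[label]))

    return min(trained, key=squared_distance)


# Hypothetical 4x4 training bitmaps: '#' is ink, '.' is background.
TRAINED = train({
    "L": ["#...", "#...", "#...", "####"],
    "T": ["####", ".##.", ".##.", ".##."],
})

# A larger, bolder 'L' that no 4x4 template could be overlaid on directly.
BIG_L = ["##....", "##....", "##....", "##....", "######", "######"]
print(classify_by_features(BIG_L, TRAINED))
```

Because the features are proportions rather than pixel positions, the 6x6 character can be compared with features learned from 4x4 samples, which is the sense in which such techniques are less tied to a particular font or size.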
The present invention provides novel methods and apparatus which have particular application to optical character recognition based on feature extraction. Some of these methods and apparatus have more general application in the field of image processing.