The following solutions to this contemporary problem are already known:
So-called "dedicated" optical readers have been developed for particular applications, and are used in highly limited contexts. This category includes, in particular, devices for automatically reading postal addresses for automated sorting, devices for providing assistance to the blind, and digital inputting devices for banks.
Optical readers limited to a few character fonts. These machines have a wider field of application than dedicated readers and can be used in particular in an office environment or for electronic mailing. As a general rule, the documents that can be read by this type of machine must satisfy rather restrictive quality conditions, for example concerning inking and contrast.
Optical readers capable of reading a wide range of fonts. These machines are more advanced than the preceding ones in that they are capable of learning to read a new character font. This improvement in reading performance is offset by a non-negligible learning time. Further, the quality required of documents being read is similar to that in the preceding case.
Under these conditions, it is clear that genuinely satisfactory character recognition means are not currently available.
Dedicated machines suffer from a very limited field of application; in addition, they need to make direct contact with the documents they read, which constitutes yet another limitation.
The most important constraint on the other two types of machine lies in the quality of printing required: for example, it is difficult to process photocopies, which is a considerable handicap in office-type applications, in electronic mail, or in in-house communications.
With the second type of machine, the limited number of fonts that can be recognized is also a handicap, since it is impossible, outside certain special situations, to enforce the use of a standard font for general application.
As for the third type of machine, its major drawback lies in learning, which is expensive in time; it should also be observed that it is not easy to check that the results are good after learning has taken place. All of this reduces the flexibility of the system in use.
The aim of the present invention is to remedy the drawbacks of existing equipment.
One of the aims of the invention is to provide character recognition means which are "multi-font" and capable even of reading handwriting, provided that the handwritten characters are isolated (block letters) or are at least separable.
Correspondingly, another aim of the invention is to make it possible to establish invariant representations of characters, i.e. representations which are independent of the particular character font being used.
The invention also seeks to provide character recognition means which are invariant with respect to the width of character strokes, character size, and serifs, and to some extent with respect to orientation, so that italic or slanting characters can be recognized as well.
In other words, the invention seeks to provide character recognition means capable of being easily integrated into any system that requires such character recognition.
Further, the invention also aims to provide means capable of being applied equally well to character images which are "black and white", i.e. in which optical intensity is expressed as an on or an off, and to images having a gray scale. This makes it possible to escape from the requirement for high quality in the original document being processed.
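By way of illustration only, the distinction between the two kinds of image can be sketched as follows. The 5x5 pixel grids, the intensity scale (0 for white paper, 255 for full ink), the threshold values, and the names `GRAY_L`, `FAINT_L`, and `binarize` are all assumptions made for this sketch; none of them is taken from the invention itself. The sketch shows a gray-scale image of a letter "L", a fainter copy of it such as a photocopy might give, and the on/off image obtained from each by a fixed threshold.

```python
# Hypothetical gray-scale image of the letter "L".
# Assumed convention for this sketch: 0 = white paper, 255 = full ink.
GRAY_L = [
    [210,  20,   0,   0, 0],
    [220,  10,   0,   0, 0],
    [200,  30,   0,   0, 0],
    [230,  25,  10,   5, 0],
    [240, 215, 205, 190, 0],
]

# The same letter with reduced contrast (every intensity scaled to 40%),
# standing in for a poor-quality original such as a photocopy.
FAINT_L = [[pixel * 2 // 5 for pixel in row] for row in GRAY_L]


def binarize(image, threshold=128):
    """Return an on/off image: 1 where intensity >= threshold, else 0."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


def render(binary):
    """Draw an on/off image as text, '#' for on and '.' for off."""
    return "\n".join("".join("#" if p else "." for p in row) for row in binary)


if __name__ == "__main__":
    print(render(binarize(GRAY_L)))        # the "L" is preserved
    print()
    print(render(binarize(FAINT_L)))       # the "L" is lost at the fixed threshold
    print()
    print(render(binarize(FAINT_L, 40)))   # a lower threshold recovers it
```

In this sketch the fixed threshold erases the faint character entirely, whereas the gray levels still carry its shape. This is one way of seeing why means that accept gray-scale images are less dependent on the quality of the original document.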