Prior art optical character recognition (OCR) techniques fall into two major categories. The first is geometric OCR. The second is color coded OCR.
Geometric OCR attempts to recognize a character based on the character's shape or the geometric representation of a set of pixels or dots. A character as used herein is meant to include a printed or written symbol which can be recognized by an OCR device or a human reader. The character can be an alphabetical symbol or an icon. Furthermore, the terms pixel and dot are used interchangeably to describe a distinguishable point recognizable by an OCR device. In such a geometric OCR approach, color is used only to define the shape of a character. Even if characters are represented by multiple colors, the multiple colors are converted to either black or a gray scale before shape analysis. Such an approach can provide a recognition accuracy as high as 99.5%. However, higher degrees of accuracy are desired. In addition, significant data storage is required for each character shape to be recognized. This means that a geometric representation of the shape of each character of the alphabet, plus the other symbols to be recognized, has to be stored. This data storage is duplicated for each character font supported. This means that not just one representation of the geometric shape is stored for the character "a", but that representations of the geometric shape for Prestige, Elite, Gothic, Roman, etc. versions of the character "a" are stored. Furthermore, computer processing time is required to compare a scanned character to all of the stored shapes. This processing time is likewise multiplied by the need to compare against multiple fonts.
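The storage and processing burden described above can be illustrated with a minimal sketch. It is not taken from any particular prior art system: the 3x3 templates, the font names FontA and FontB, the threshold of 128, and the functions to_binary, hamming, and recognize are all hypothetical, and a simple pixel-mismatch count stands in for whatever shape analysis a real device performs. The sketch shows only that one template is stored per font and per character, and that each scanned character is compared against every stored template.

```python
# Hypothetical 3x3 binary templates ('1' = ink). One entry is stored per
# (font, character) pair, so storage grows with fonts x characters.
TEMPLATES = {
    ("FontA", "I"): ("010", "010", "010"),
    ("FontA", "L"): ("100", "100", "111"),
    ("FontB", "I"): ("111", "010", "111"),
    ("FontB", "L"): ("100", "100", "110"),
}


def to_binary(gray_rows, threshold=128):
    """Reduce gray scale rows (0 = dark ink, 255 = white) to '1'/'0' strings.

    Color plays no role beyond defining the shape.
    """
    return tuple(
        "".join("1" if px < threshold else "0" for px in row)
        for row in gray_rows
    )


def hamming(a, b):
    """Count the pixels at which two equally sized bitmaps differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(bitmap):
    """Compare the bitmap with every stored template; return the closest.

    Returns (character, font, number_of_comparisons).
    """
    best = None
    comparisons = 0
    for (font, char), template in TEMPLATES.items():
        comparisons += 1
        distance = hamming(bitmap, template)
        if best is None or distance < best[0]:
            best = (distance, char, font)
    return best[1], best[2], comparisons


scanned = to_binary([[255, 0, 255], [255, 0, 255], [255, 0, 255]])
print(recognize(scanned))
```

In this sketch, adding a third font would add one more template per character and one more comparison per character for every scanned symbol, which is the duplication the paragraph above refers to.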
The second technique, color coded OCR, attempts to recognize a character based on the character's color. In such an approach, color is used not only to indicate the shape of a character, but also to indicate the identity of the character. For example, "a" is printed red, "b" is printed blue, and "c" is printed yellow. Color coded OCR eliminates the data storage and computer processing requirements of geometric OCR by eliminating shape processing. Color coded OCR can also provide higher recognition accuracy rates than geometric OCR because it is not subject to shape processing errors. However, prior art color coded characters cause a severe visual distraction to a human reader because such prior art color coding is distinguishable to a human reader. In addition, special OCR printing apparatus and special OCR reading apparatus are used for prior art color coded characters.
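The color coded approach can be sketched in a similar, equally hypothetical way. Only the pairing of red, blue, and yellow with "a", "b", and "c" comes from the example above; the specific RGB values, the white background, the tolerance of 60, and the functions dist2 and decode_character are assumptions made for illustration. The sketch shows that identification reduces to a color lookup, with no shape templates stored or compared.

```python
# Color-to-character table following the example in the text. The exact
# RGB values are assumptions; the text names only the colors.
COLOR_CODE = {
    (255, 0, 0): "a",    # red
    (0, 0, 255): "b",    # blue
    (255, 255, 0): "c",  # yellow
}


def dist2(p, q):
    """Squared Euclidean distance between two RGB triples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def decode_character(pixels, background=(255, 255, 255), tolerance=60):
    """Identify a character from the color of its ink pixels alone.

    pixels: RGB tuples sampled from one character cell.
    Returns None when no pixel differs enough from the background.
    """
    ink = [p for p in pixels if dist2(p, background) > tolerance ** 2]
    if not ink:
        return None
    # Average the ink color, then pick the nearest coded color.
    mean = tuple(sum(channel) / len(ink) for channel in zip(*ink))
    nearest = min(COLOR_CODE, key=lambda coded: dist2(coded, mean))
    return COLOR_CODE[nearest]


cell = [(255, 255, 255), (250, 10, 5), (240, 0, 0)]
print(decode_character(cell))
```

The table holds one entry per character regardless of font, which is why this approach avoids the per-font storage and comparison of geometric OCR. The same table also makes the drawback plain: colors far enough apart to be decoded this way are equally conspicuous to a human reader.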
The two prior art OCR approaches present four major difficulties. The first two difficulties are the substantial storage and substantial computer processing required by geometric OCR shape processing. If these shape processing difficulties are avoided by using color coded OCR, then the third difficulty is the severe visual distraction of color coding to a human reader. A fourth difficulty is the special printing and reading devices used by prior art color coded OCR.
Thus, there is a need for an OCR approach which can substantially increase the accuracy rate of optical character recognition while overcoming the deficiencies of both conventional approaches.