Conventional optical character recognition (OCR) techniques fall into two major categories: geometric OCR and color coded OCR.
Geometric OCR attempts to recognize a character based on the character's shape, that is, the geometric representation of a set of pixels or dots. A character, as used herein, includes any printed or written symbol that can be recognized by an OCR device or a human reader. The character can be an alphabetical symbol or an icon. Furthermore, the terms pixel and dot are used interchangeably to describe a distinguishable point recognizable by an OCR device. In the geometric OCR approach, color is used only to define the shape of a character. Even if characters are represented by multiple colors, the multiple colors are converted to either black or a gray scale before shape analysis. Although the geometric OCR approach can provide a recognition accuracy as high as 99.5%, character recognition errors still occur due to character shape defects and character shape variations. Character shape defects may take the form of smudged or improperly formed characters. Character shape variations may take the form of character fonts other than those that the OCR device is designed to recognize.
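As a rough illustration of the geometric approach described above, the following minimal sketch collapses a multi-color image to a black-and-white bitmap and then matches the bitmap against stored shapes. The 3x3 templates, the threshold of 128, the pixel-agreement score, and all function names are assumptions made for illustration only; they are not taken from any particular OCR device.

```python
# Hypothetical 3x3 glyph templates (1 = ink, 0 = background); illustrative only.
TEMPLATES = {
    "I": ((0, 1, 0), (0, 1, 0), (0, 1, 0)),
    "L": ((1, 0, 0), (1, 0, 0), (1, 1, 1)),
}

def to_gray(rgb):
    """Collapse an (R, G, B) pixel to one gray level using ITU-R BT.601 luma weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def binarize(image, threshold=128):
    """Mark dark pixels as ink (1) and light pixels as background (0)."""
    return tuple(
        tuple(1 if to_gray(px) < threshold else 0 for px in row)
        for row in image
    )

def recognize_by_shape(image):
    """Return the template character whose bitmap agrees with the image on the most pixels."""
    bitmap = binarize(image)

    def score(ch):
        return sum(
            t == b
            for t_row, b_row in zip(TEMPLATES[ch], bitmap)
            for t, b in zip(t_row, b_row)
        )

    return max(TEMPLATES, key=score)

RED, BLUE, WHITE = (200, 0, 0), (0, 0, 200), (255, 255, 255)

# An "L" drawn in two ink colors; the color information is discarded by binarize().
two_color_L = (
    (RED, WHITE, WHITE),
    (RED, WHITE, WHITE),
    (BLUE, BLUE, BLUE),
)
result = recognize_by_shape(two_color_L)
```

In this sketch the red and blue pixels both fall below the threshold, so the two-color glyph reduces to the same bitmap as a single-color "L". Only the shape reaches the matching step, which is why shape defects and unfamiliar fonts are the dominant error sources for this approach.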
The second technique, color coded OCR, attempts to recognize a character based on the character's color. In such an approach, color is used not only to indicate the shape of a character, but also to indicate the identity of the character. For example, "a" is printed red, "b" is printed blue, and "c" is printed yellow. Color coded OCR is not subject to the shape processing errors of geometric OCR because it does not perform shape processing. However, color coded OCR may be subject to color processing errors. Such color processing errors may result from ink color shifts, improper color density, improper color intensity, or color scanner misalignment.
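The color coded approach can be sketched in a similar way. The example below assigns each scanned color to the nearest reference color and returns the associated character. The letter-to-color pairing follows the example in the text; the specific RGB values, the squared Euclidean distance in RGB space, and the function names are assumptions made for illustration only.

```python
# Reference colors for the example coding ("a" red, "b" blue, "c" yellow).
# The exact RGB triples are assumed values, not taken from the source.
REFERENCE_COLORS = {
    "a": (255, 0, 0),
    "b": (0, 0, 255),
    "c": (255, 255, 0),
}

def color_distance_sq(p, q):
    """Squared Euclidean distance between two (R, G, B) triples."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

def recognize_by_color(rgb):
    """Return the character whose reference color is nearest to the scanned color."""
    return min(
        REFERENCE_COLORS,
        key=lambda ch: color_distance_sq(rgb, REFERENCE_COLORS[ch]),
    )

slightly_faded_red = (200, 40, 30)   # small ink shift, still nearest to red
orange_shifted_red = (255, 140, 0)   # large ink shift, now nearer to yellow

faded_result = recognize_by_color(slightly_faded_red)
shifted_result = recognize_by_color(orange_shifted_red)
```

Under these assumed values, the slightly faded red is still read as "a", while the red ink that has drifted toward orange is read as "c". This is the kind of color processing error that an ink color shift can cause, regardless of how well formed the character's shape is.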
Thus, there is a need for an OCR approach that can substantially increase the accuracy rate of optical character recognition while overcoming both the shape processing errors of geometric OCR and the color processing errors of color coded OCR.