The invention relates to recognizing text in a multicolor image.
Text recognition techniques, such as optical character recognition (OCR), can identify text characters or objects in an image stored in a computer and convert the text into corresponding ASCII characters. An OCR program can differentiate between text objects and non-text objects (such as the background) in an image based on intensity differences between the text objects and the background. This differentiation can be accomplished when the text characters and the background are two distinct colors.
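By way of illustration only, intensity-based differentiation of the kind described above can be sketched as a single global threshold. The sketch below is not taken from any particular OCR program. The function names `luminance` and `binarize`, the threshold of 128, the Rec. 601 luma weights, and the assumption that text is darker than the background are all illustrative assumptions.

```python
# Illustrative sketch only: names, threshold, and weights are assumptions.

BLACK = (0, 0, 0)
WHITE = (255, 255, 255)


def luminance(rgb):
    """Approximate perceived intensity of an (R, G, B) pixel (Rec. 601 weights)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(image, threshold=128):
    """Mark a pixel True when it is darker than the threshold.

    Under the assumption of dark text on a light background,
    True pixels are treated as text and False pixels as background.
    """
    return [[luminance(px) < threshold for px in row] for row in image]


# A tiny two-color image: a vertical black stroke on a white background.
image = [
    [WHITE, BLACK, WHITE],
    [WHITE, BLACK, WHITE],
    [WHITE, WHITE, WHITE],
]

mask = binarize(image)
for row in mask:
    print("".join("#" if is_text else "." for is_text in row))
```

With only two distinct colors in the image, one threshold is enough to separate the stroke from the background.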
However, recognizing text in a multicolor image is more difficult. For example, an image may include text characters, a background, and non-text objects, such as graphical objects, each having a different color. Furthermore, different blocks of text in the image may have different combinations of colors. For example, one text block may have red text against a white background, and another text block may have yellow text against a black background.
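The difficulty with the two example text blocks can be illustrated with a short sketch. It assumes, purely for illustration, a simple recognizer that treats every pixel darker than a fixed threshold as text. The helper names `paint` and `dark_pixels`, the threshold of 128, and the Rec. 601 luma weights are assumptions and are not part of any described system.

```python
# Illustrative sketch only: names, threshold, and weights are assumptions.

RED = (255, 0, 0)
WHITE = (255, 255, 255)
YELLOW = (255, 255, 0)
BLACK = (0, 0, 0)


def luminance(rgb):
    """Approximate perceived intensity of an (R, G, B) pixel (Rec. 601 weights)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def dark_pixels(block, threshold=128):
    """Mark pixels darker than the threshold, i.e. assume dark text on light background."""
    return [[luminance(px) < threshold for px in row] for row in block]


def paint(truth, text_color, background_color):
    """Build a block of pixels from a ground-truth text mask and two colors."""
    return [[text_color if t else background_color for t in row] for row in truth]


# Ground truth shared by both blocks: True marks a text pixel.
truth = [
    [False, True, False],
    [False, True, False],
]

block_a = paint(truth, RED, WHITE)     # red text on a white background
block_b = paint(truth, YELLOW, BLACK)  # yellow text on a black background

mask_a = dark_pixels(block_a)
mask_b = dark_pixels(block_b)
inverted = [[not t for t in row] for row in truth]

print("block A recovered correctly:", mask_a == truth)
print("block B text and background swapped:", mask_b == inverted)
```

In this sketch the same fixed rule recovers the red-on-white text but labels the black background of the second block as text, because the yellow text is brighter than its background.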