1. Field of the Invention
The present invention relates to optical character recognition systems and to methods of reducing background noise in such systems.
2. Background Art
Optical character recognition (OCR) involves optically reading image data from a document and then passing the image data through a recognition system. The recognition system reads the characters of a character code line by framing and recognizing the characters within the image data. One existing recognition system is the single pass recognition system, which attempts to frame and recognize characters within the image data from the document in a single pass. The single pass recognition system will often read a character and then discard the corresponding image data to make room for the next part of the image data.
A significant problem inherent in OCR systems, including the existing single pass recognition system, arises in the presence of background noise such as borders and background pen marks. Background noise may result in incorrect framing, which in turn leads to incorrect recognition. That is, background noise may be mistakenly framed as a character in place of the actual character. Incorrect framing is possible because image data from a read zone or scan band containing the character code line is sent to the recognition system without knowledge of the position of the characters within the image data.
More specifically, in an OCR system, image data from a read zone or scan band of a fixed size is passed to the recognition system to allow the characters on the document to be determined. The scan band is established so that all characters on a single line of text on the document are contained in the image data, which takes the form of an image data stream. This stream of image data often contains background noise and borders produced by printing details or pen marks from signatures and other handwritten notations. Such noise reduces the single pass recognition system's ability to locate and frame the true code line characters contained in the image data.
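By way of illustration only, the framing problem described above may be sketched in Python as follows. The column-projection framing approach, the function name frame_columns, and the toy scan bands are assumptions made for this example; they are not taken from any particular recognition system.

```python
from typing import List, Tuple


def frame_columns(band: List[str]) -> List[Tuple[int, int]]:
    """Frame candidate characters in a scan band by column projection.

    Hypothetical helper: each row of `band` is a string in which '#'
    marks ink and '.' marks background. Returns (start, end) column
    spans, end exclusive, for each run of columns containing ink.
    """
    width = len(band[0])
    # A column is "inked" if any row has ink at that column.
    inked = [any(row[x] == "#" for row in band) for x in range(width)]

    frames: List[Tuple[int, int]] = []
    start = None
    for x, has_ink in enumerate(inked):
        if has_ink and start is None:
            start = x                      # a new frame begins
        elif not has_ink and start is not None:
            frames.append((start, x))      # the current frame ends
            start = None
    if start is not None:
        frames.append((start, width))      # frame runs to the band edge
    return frames


# Toy scan band with two characters and no background noise.
clean_band = [
    "..........",
    ".##...##..",
    ".##...##..",
    "..........",
]

# The same band with a printed border running along the top edge.
noisy_band = [
    "##########",
    ".##...##..",
    ".##...##..",
    "..........",
]

print(frame_columns(clean_band))  # two separate frames
print(frame_columns(noisy_band))  # one frame spanning the whole band
```

In the clean band each character receives its own frame, whereas the border in the noisy band causes every column to appear inked, so the two characters are merged into a single incorrect frame. Because the framer has no knowledge of where the characters lie within the band, it cannot distinguish the border from the code line.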
For the foregoing reasons, there is a need for a method of reducing background noise to yield a higher percentage of properly framed characters.