1. Field of the Invention
This invention generally relates to optical character reader (OCR) systems and more particularly to an improved video enhancement system for enabling an optical character reader to accurately read OCR print with inconsistent print contrast.
2. Description of the Prior Art
In recent years optical character recognition (OCR) systems have been employed to automatically read and electrically process hand-printed or machine-printed alphanumeric characters on documents for data processing purposes.
An OCR system generally detects a character on a document by detecting the contrast between that character and its background paper. This detection of characters is usually accomplished by optically scanning an illuminated document containing characters to be read in order to produce a matrix of picture elements or pixels representative of the optical image of the document.
Before characters can be identified, the matrix of pixels must be changed into some type of machine-readable form. Generally, these pixels are quantized so that they can be more readily processed by subsequent data processing equipment.
A first type of system quantizes the pixels to two binary levels to derive binary pixel data by establishing a threshold level against which each pixel in the matrix is compared. Those pixels equal to or greater than the threshold level are assigned a binary "1" value representative of a black data bit, while those pixels below the threshold level are assigned a binary "0" value representative of a white data bit. Some systems then apply the black and white pixel data to subsequent data processing equipment to identify the characters on a document. However, many systems first apply the matrix of black and white pixel data to an enhancement circuit in order to simplify the subsequent character recognition operation and to help achieve more accurate results from it. Such enhancement circuits usually perform line-thinning and/or line-filling operations to reduce the scanned character to a skeletal representation of the character, which may be two or three pixels wide, while still retaining the basic geometrical information of the original character. Exemplary line-thinning and/or line-filling enhancement techniques on two-dimensional black/white binary data are shown in U.S. Pat. Nos. 3,737,855; 4,003,024; and 4,162,482.
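The single-threshold quantization described above can be sketched in a few lines of Python. This is only an illustration and is not taken from any of the cited systems: it assumes that each pixel is an integer darkness value in which a larger number means darker print, and the function name `binarize` and the sample values are hypothetical.

```python
def binarize(pixels, threshold):
    """Quantize a matrix of pixels to two binary levels.

    Assumption: each pixel is a darkness value (larger = darker).
    A pixel equal to or greater than the threshold becomes 1 (black);
    a pixel below the threshold becomes 0 (white).
    """
    return [[1 if p >= threshold else 0 for p in row] for row in pixels]


if __name__ == "__main__":
    # Hypothetical 2 x 3 scan with darkness values in the range 0..9.
    scan = [
        [0, 5, 9],
        [5, 4, 6],
    ]
    print(binarize(scan, threshold=5))  # prints [[0, 1, 1], [1, 0, 1]]
```

Because a single fixed threshold is applied to every pixel, a lightly printed stroke whose darkness falls just below the threshold is lost entirely in this sketch.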
A second type of system uses two threshold levels to quantize the pixels into three digital levels to develop black, gray and white pixel data in an attempt to compensate for the different print contrast signals encountered in OCR reading. The term "print contrast signal" (PCS) relates to the contrast between the printed image and the background paper of a document. A printed image with a high PCS is very dark and is best read with a high threshold level. On the other hand, a printed image with a low PCS is very light and is best read with a low threshold level. This second type of system, which quantizes pixels to three levels, applies the resultant three-level pixel data to an enhancement circuit to achieve line-thinning and line-filling operations before converting the three-level pixel data into black and white pixel data.
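A two-threshold quantization of this kind might be sketched as follows. The text above does not state the exact comparison rule, so this sketch assumes one: a pixel at or above the high threshold is black, a pixel at or above the low threshold but below the high threshold is gray, and any other pixel is white. Pixels are again assumed to be integer darkness values (larger = darker), and the names `quantize_three_level`, `WHITE`, `GRAY` and `BLACK` are hypothetical.

```python
WHITE, GRAY, BLACK = 0, 1, 2  # hypothetical codes for the three levels


def quantize_three_level(pixels, low, high):
    """Quantize a matrix of pixels to white, gray or black.

    Assumed rule (not specified in the source text):
      p >= high        -> BLACK
      low <= p < high  -> GRAY
      p < low          -> WHITE
    """
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")

    def level(p):
        if p >= high:
            return BLACK
        if p >= low:
            return GRAY
        return WHITE

    return [[level(p) for p in row] for row in pixels]


if __name__ == "__main__":
    # Hypothetical 2 x 3 scan with darkness values in the range 0..9.
    scan = [
        [1, 4, 8],
        [3, 6, 0],
    ]
    print(quantize_three_level(scan, low=3, high=6))
    # prints [[0, 1, 2], [1, 2, 0]]
```

The gray level keeps a record of pixels that a single threshold would have forced to black or white, leaving that decision to the later enhancement stage.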
The above-described first and second types of systems can very accurately read batches of OCR imprinted documents if the PCS is consistent from document to document, from character to character and within each character. Unfortunately, a batch of documents will often contain documents with different PCS. A single document can have OCR print that has been encoded by two or more different printers, so PCS variations can be encountered between characters on the same document. PCS variations within a single character are also often encountered with very poor OCR print quality. Inconsistent PCS from document to document, from character to character, and within a character all result in poor readability using just two or even three levels of quantization before enhancement.
The background art known to applicant at the time of the filing of this application is as follows:
U.S. Pat. No. 3,737,855, Character Video Enhancement System, by A. Cutaia;
U.S. Pat. No. 4,003,024, Two-Dimensional Binary Data Enhancement System, by J. P. Riganati et al.;
U.S. Pat. No. 4,162,482, Pre-Processing and Feature Extraction System for Character Recognition, by Chanchang Su; and
U.S. Pat. No. 4,345,314, Dynamic Threshold Device, by R. C. Melamud et al.