There is a need in the Optical Character Verification (OCV) industry for an OCV apparatus and method with increased reliability of verification of each character or image scanned into a computer system.
Specifically, there is a need for an OCV method that is different from traditional Optical Character Recognition (OCR) and OCV algorithms. Traditional algorithms are based on analysis of pure bi-level images. Bi-level images contain only two colors or two levels of intensity. Typically these are visualized as black-and-white images. Everything is black (one level) or white (the other level). There exist no in-between levels. Furthermore, traditional OCR and OCV approaches rely mainly on statistical analyses of the foreground pixels of the scanned image.
An example is the case of OCR of black text on white paper. When these types of documents are scanned (or whenever these images are otherwise obtained in computer format), a gray-scale image is normally obtained. The gray-scale image has no "color," but rather contains varying intensities of gray. White is represented as a very bright gray, black is represented as a very dark gray. There are many intermediate levels of gray between these two extremes. It is well known in the art how to convert the raw gray-scale image into a bi-level form by choosing a cutoff value. All pixels which have an intensity that is greater than the cutoff are made white (the brightest possible gray-scale intensity) and all other pixels are made black (the darkest possible gray-scale intensity).
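As a minimal sketch of the cutoff conversion just described, the following Python fragment maps every pixel brighter than the cutoff to white and every other pixel to black. The function name `to_bilevel`, the list-of-rows image layout, the 0 to 255 intensity range, and the cutoff of 128 are illustrative assumptions rather than details of any particular system.

```python
def to_bilevel(gray, cutoff=128, white=255, black=0):
    """Convert a gray-scale image (list of rows of 0-255 ints) to bi-level.

    Pixels strictly brighter than the cutoff become white; all others
    become black, following the rule stated in the text.
    """
    return [[white if px > cutoff else black for px in row] for row in gray]


# Hypothetical 2x3 gray-scale sample: bright background, dark text pixels.
gray = [
    [220, 215, 40],
    [128, 129, 0],
]
bilevel = to_bilevel(gray)
```

Note that a pixel exactly equal to the cutoff is made black, since only intensities greater than the cutoff are made white.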
A "Gray-Scale Card Image" is an image of a card that is scanned with any conventional, generally commercially available scanning system (e.g. an UltraChek I system). Applying an intensity normalization algorithm to a card image makes the image appear slightly better visually than the raw UltraChek I scanned image. However, normalization does not fundamentally affect the character image processing.
The card image has varying intensities of gray on it. Although the card background is not quite perfectly white, it is generally close to being perfectly white, with a few scattered gray pixels throughout. Heavy characters appear as a very dark gray, with intermediate intensities of gray appearing near their edges. Small characters will appear significantly lighter in intensity than the large characters. This effect is normal and is predominantly due to the resolution of the camera used for scanning the card image. Also, neighboring pixels will generally contribute part of their intensity to a central pixel, resulting in an intensity smearing effect.
A histogram of the pixel intensities for the gray-scale card image is illustrated in FIG. 1. The white background accounts for most of the pixel data of the scanned image. This is indicated as the large spike 20 centered around the 220 mark. Moreover, the spike is so large that it is off the scale. There is also a large spike 22 around the 0 mark accounting for the majority of the black text that appears on the source card. In addition, minor spikes of intermediate intensities are registered as the pixels fade from the full black color of the text to the full white color of the background.
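The histogram described above can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the image is a list of rows of integer intensities from 0 to 255, and the helper names `intensity_histogram` and `dominant_peaks` as well as the split point of 128 are hypothetical.

```python
def intensity_histogram(gray, levels=256):
    """Count how many pixels fall on each intensity level."""
    counts = [0] * levels
    for row in gray:
        for px in row:
            counts[px] += 1
    return counts


def dominant_peaks(counts, split=128):
    """Return the most populated dark level and the most populated bright level."""
    dark = max(range(0, split), key=lambda level: counts[level])
    bright = max(range(split, len(counts)), key=lambda level: counts[level])
    return dark, bright


# Hypothetical tiny card: mostly background (220), some text (0),
# and one intermediate edge pixel (90).
card = [
    [220, 220, 220, 220],
    [220, 0, 0, 220],
    [220, 90, 0, 220],
]
counts = intensity_histogram(card)
peaks = dominant_peaks(counts)
```

On this sample the two dominant levels are 0 and 220, mirroring the two large spikes described for FIG. 1, while the single pixel at 90 plays the role of a minor intermediate spike.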
The image represented by the histogram may be converted to a black and white image by choosing an appropriate cutoff value (e.g. 128). Marking all intensities above the cutoff as white and all others as black yields a bi-level, black and white, image.
This technique enables the OCR algorithm to easily recognize the larger characters. However, since many of the smaller characters will become significantly distorted, the OCR algorithm will not be able to accurately recognize these characters. One contributing factor to the quality of the bi-level image is the distortion introduced by the UltraChek I image capturing system. The image capturing system tends to blend pixels together, such that physically adjacent pixels on the card actually contribute to logical pixels in the resultant scanned image. For example, the dots used to form the colon characters on the small text will be distorted in intensity by the background pixels that surround them. The blending effect results in fuzzy cutoff values between foreground and background pixels. The distortion can be somewhat reduced by evaluating the pixel intensity histogram in the immediate vicinity of the character itself, rather than making the cutoff decision by considering the entire card.
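The benefit of deciding the cutoff in the immediate vicinity of a character can be illustrated with a small Python sketch. The midpoint rule used in `local_cutoff` is an assumed stand-in for a real histogram analysis, and the helper names and sample intensities are hypothetical.

```python
def local_cutoff(patch):
    """Pick a cutoff halfway between the darkest and brightest pixel of a patch.

    Assumption: the midpoint is a simple substitute for analysing the
    local intensity histogram around a character.
    """
    pixels = [px for row in patch for px in row]
    return (min(pixels) + max(pixels)) / 2


def foreground_mask(patch, cutoff):
    """Mark pixels at or below the cutoff as foreground (1), others as background (0)."""
    return [[1 if px <= cutoff else 0 for px in row] for row in patch]


# Hypothetical faint dot of a small colon: blending has lifted it to 150,
# which is well above a card-wide cutoff of 128.
dot = [
    [220, 220, 220],
    [220, 150, 220],
    [220, 220, 220],
]
global_mask = foreground_mask(dot, 128)
local_mask = foreground_mask(dot, local_cutoff(dot))
```

With the card-wide cutoff the dot disappears entirely, whereas the locally derived cutoff (185 for this patch) keeps the center pixel as foreground.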
Regardless of the approach, whenever a cutoff value is used, there typically exists a significant noticeable distortion in the character images once they are converted to bi-level format. However, for a solid color text printed on a solid color background, the bi-level form of the image is typically adequate for a person to recognize the characters.
Once the bi-level form of the image has been created, conventional character processing algorithms will attempt matching the character images that appear on the source image (e.g. the card) with a set of reference character images (e.g. a reference template of characters or images stored in the memory circuits of a computer). Character matching is based on some correlation between the source and reference images. Several different correlation approaches are used by conventional software programs.
In many cases, conventional software programs attempt to isolate individual characters appearing on the source image. This produces a series of discrete character images that may be individually processed. There are other systems that operate based on recognizing larger sequences of characters, not just recognizing one character at a time. However, the majority of software programs operate on one character at a time.
At this point, there is a divergence in the approaches taken by conventional OCR and OCV software programs. Conventional OCR software programs recognize the text without having knowledge of what text should actually be present, whereas conventional OCV software programs verify the text according to a known set of text data.
Typically, OCR software programs operate on a substantial amount of text (e.g. a printed page). Accordingly, the OCR software program must read all the characters on a page fairly quickly. It cannot spend a lot of time on any single character. In contrast, OCV software programs typically operate on a very short string of text (e.g. a dozen characters) and generally know what text is supposed to be present at a particular character coordinate location. Therefore, the OCV software program can spend more time analyzing individual characters than the OCR software program can.
Once an OCR software program isolates an individual source character, some type of comparison is made between the scanned character from the source image and a reference character image stored in the memory of a computer, or controller. The comparison is often performed in the frequency domain, rather than in the spatial domain, using the Fourier transform or other similar frequency transformations to convert the scanned image into a frequency domain representation. Character images typically exhibit less variance in appearance in the frequency domain than in the spatial domain, so individual missing pixels are less relevant there. As a result, the matching process is able to occur more quickly and reliably.
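One well-known reason a frequency domain representation varies less than the spatial one is that the magnitude of the discrete Fourier transform does not change when the image is circularly shifted. The sketch below demonstrates only that property, using a direct (slow) 2D DFT in plain Python. It is not the comparison method of any particular OCR product, and the names `dft2_magnitude` and `spectrum_distance` and the 4x4 sample glyphs are hypothetical.

```python
import cmath


def dft2_magnitude(img):
    """Return the magnitude of the 2D discrete Fourier transform of img."""
    rows, cols = len(img), len(img[0])
    mags = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2 * cmath.pi * (u * y / rows + v * x / cols)
                    total += img[y][x] * cmath.exp(1j * angle)
            out_row.append(abs(total))
        mags.append(out_row)
    return mags


def spectrum_distance(a, b):
    """Euclidean distance between the magnitude spectra of two images."""
    ma, mb = dft2_magnitude(a), dft2_magnitude(b)
    return sum((p - q) ** 2
               for ra, rb in zip(ma, mb)
               for p, q in zip(ra, rb)) ** 0.5


# Hypothetical 4x4 glyphs (1 = foreground pixel).
bar = [[0, 1, 0, 0]] * 4          # vertical bar in column 1
bar_shifted = [[0, 0, 1, 0]] * 4  # same bar, shifted one column
dash = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]  # horizontal bar

same_shape = spectrum_distance(bar, bar_shifted)
other_shape = spectrum_distance(bar, dash)
```

The two vertical bars differ pixel by pixel in the spatial domain, yet their spectrum distance is zero up to rounding error, while the horizontal bar is clearly separated from both.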
Although frequency domain comparisons of character images are common and well known in the art, there exist some systems that use spatial comparison techniques. These techniques include computing basic characteristics about the source character image, such as the number of connectors, the number of closed curves, etc. These characteristics are then used to narrow the source characters to a reasonable set of characters that may actually be present in the source image. Subsequently, a statistical match is performed on the individual pixels to actually recognize the set of scanned characters.
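One of the basic characteristics mentioned above, the number of closed curves, can be approximated by counting the enclosed background regions of a bi-level glyph. The following Python sketch does this with a flood fill. The 4-connectivity rule for background pixels, the `#`/`.` text encoding, and the names `parse` and `count_holes` are assumptions made for illustration.

```python
def parse(lines):
    """Turn rows of '#' (foreground) and '.' (background) into a 0/1 grid."""
    return [[1 if ch == "#" else 0 for ch in line] for line in lines]


def count_holes(glyph):
    """Count background regions that are fully enclosed by foreground pixels."""
    rows, cols = len(glyph), len(glyph[0])
    seen = [[False] * cols for _ in range(rows)]

    def flood(r0, c0):
        """Fill one background region; report whether it reaches the border."""
        stack = [(r0, c0)]
        seen[r0][c0] = True
        touches_border = False
        while stack:
            r, c = stack.pop()
            if r in (0, rows - 1) or c in (0, cols - 1):
                touches_border = True
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (0 <= nr < rows and 0 <= nc < cols
                        and not seen[nr][nc] and glyph[nr][nc] == 0):
                    seen[nr][nc] = True
                    stack.append((nr, nc))
        return touches_border

    holes = 0
    for r in range(rows):
        for c in range(cols):
            if glyph[r][c] == 0 and not seen[r][c]:
                if not flood(r, c):
                    holes += 1
    return holes


letter_o = parse([".###.", ".#.#.", ".###."])
letter_b = parse(["###", "#.#", "###", "#.#", "###"])
letter_l = parse(["#..", "#..", "###"])
```

Here the O-like glyph has one enclosed region, the B-like glyph has two, and the L-like glyph has none, which is the kind of coarse feature that can narrow the candidate set before a pixel-level statistical match.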
In both frequency domain and spatial domain comparison cases, there are many ways of implementing the comparison logic. Neural nets as well as conventional algorithmic type software programs are commonly used. Neural nets are "trained" systems that are fed numerous sample images. However, neural nets are not very predictable in how they will actually perform in real life. The neural nets must be trained. If the training is successful, the system is then tried at actually recognizing real characters. The precise layout of the nodes of the neural net is critical because they comprise the trained parameters. Since neural nets are extremely sensitive, it is difficult to predict how well they will perform on variations of a given set of inputs (e.g. switching to different fonts).
The algorithmic approaches involve developing software programs (or code) to perform the character recognition function. There is no "magic black box" that figures out exactly what an algorithmic character recognition software program is doing. As such, it may often misinterpret character images.
OCV is typically performed in the spatial domain. Comparison of character images is performed in a manner similar to the way a person would do it. For example, a scanned character image is compared against a reference image (e.g. a template in memory). Subsequently, some type of statistical correlation between the two is performed to determine whether the characters match. The operation is similar to the method used by OCR software to compare characters in the spatial domain. However, since OCV is typically performed on smaller sets of characters and has more time available for analyzing characters, more computationally intensive algorithms may be used to perform the OCV function.
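A statistical correlation between a scanned character and a reference template can take many forms. As one hedged example, the sketch below computes the Pearson correlation coefficient over the gray-scale pixels of two equally sized images and accepts the character when the score reaches a threshold. The function names, the threshold of 0.8, and the sample images are assumptions, not values taken from any described system.

```python
from math import sqrt


def correlation(source, reference):
    """Pearson correlation between two equally sized gray-scale images."""
    a = [px for row in source for px in row]
    b = [px for row in reference for px in row]
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat image carries no shape information
    return cov / sqrt(var_a * var_b)


def verify(source, reference, min_score=0.8):
    """Accept the scanned character if it correlates strongly with the template."""
    return correlation(source, reference) >= min_score


# Hypothetical 3x3 template of a dark "T" on a white background.
template_t = [[0, 0, 0], [255, 0, 255], [255, 0, 255]]
# A noisy scan of the same shape, and a scan of a different shape.
scan_t = [[30, 20, 40], [210, 60, 220], [200, 50, 215]]
scan_l = [[20, 210, 220], [30, 215, 210], [25, 30, 40]]
```

With these samples, `verify(scan_t, template_t)` accepts the noisy "T" and `verify(scan_l, template_t)` rejects the other shape. Because the score is computed on raw gray levels, no cutoff has to be chosen before the comparison.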
There are several problems that conventional OCR and OCV systems are unable to overcome. First, images that are more complex than simple black text on white paper are inherently harder to recognize or verify because there is no sharp division between the foreground and the background. For such images, there is no practical method of resolving a source gray-scale image into a simple bi-level image. Second, if images are set on very complex backgrounds it is difficult to separate the text pixels from all the other pixels in the image. In many cases, the pixels in the source image foreground and background are so close in color and intensity that it is virtually impossible to obtain a good separation between the foreground and the background elements. If a black and white representation of the image clearly separating the character (foreground) data from the background data cannot be obtained, the conventional recognition and verification approaches cannot be applied.
Without a clean separation between foreground and background elements, jumping into the frequency domain will not yield predictable results. In addition, the presence of extraneous lines in the image presents significant problems for any conventional OCR system. Extraneous lines that are misinterpreted as being part of the character itself will likely confuse the OCR or OCV systems.
Another problem that must be overcome is resolving the defects introduced by the imaging system. For example, conventional imaging systems significantly distort the card image, in particular by blending pixel intensities over a large area. The foreground pixels to be scanned are significantly distorted near the edges of the characters by the imaging system. For example, given a dark foreground and a bright background, the pixels near the edges of the characters will show up brighter than they should be because of the blending effect. Scanning the image using a higher resolution camera that crisply defines pixel boundaries will help the recognition process significantly. However, this option may not always be available.
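The blending effect can be imitated with a simple neighborhood average. The sketch below assumes a 3x3 box filter as a stand-in for the real, unknown blending behavior of an imaging system. The function name `blend` and the sample stroke are hypothetical.

```python
def blend(gray):
    """Replace each pixel with the rounded mean of its 3x3 neighborhood.

    Pixels on the image border average only the neighbors that exist.
    """
    rows, cols = len(gray), len(gray[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            neighbors = [
                gray[nr][nc]
                for nr in range(max(0, r - 1), min(rows, r + 2))
                for nc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(round(sum(neighbors) / len(neighbors)))
        out.append(out_row)
    return out


# Hypothetical one-pixel-wide black stroke on a white background.
stroke = [
    [255, 255, 0, 255, 255],
    [255, 255, 0, 255, 255],
    [255, 255, 0, 255, 255],
]
blended = blend(stroke)
```

After blending, the stroke pixels rise from 0 to 170 and their neighbors fall from 255 to 170, so a cutoff of 128 would classify the entire thin stroke as background. This is the kind of edge brightening that makes small characters hard to separate.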
For the foregoing reasons, there is a need for an apparatus and method capable of verifying a scanned image utilizing an image verification algorithm based on a topological analysis of the scanned image. There is also a need for an apparatus and method capable of verifying a scanned image using an improved bi-level separation method that incorporates an anti-stroke scoring method with an image outlining method (e.g. a method of analyzing an image area disposed outside of the image's defined outer boundary to determine how many "stray" pixels exist there).
To overcome the limitations of the related art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the invention is directed to an apparatus and method capable of verifying a scanned image utilizing a verification algorithm based on a topological analysis of the scanned image. The invention is also directed to an apparatus and method capable of verifying a scanned image utilizing a verification algorithm based on an improved bi-level separation analysis incorporating an anti-stroke scoring method with a character outlining method.
One aspect of the apparatus having features of the invention includes an image verification apparatus for verifying images. The apparatus comprises an illumination source and an image scanner that scans an image illuminated by the illumination source and converts it into an electronically readable format. The apparatus also includes a computer executing a program, a storage device for storing the electronically readable scanned image in an array of discrete elements of varying intensity having first and second boundaries and a principal portion therebetween. The apparatus also includes a programmable template capable of storing a predetermined image in a separate portion of the storage device, and image recognition logic for verifying the scanned image against the predetermined image.
Another aspect of an apparatus having features of the invention includes a scanned image verification apparatus for verifying scanned images. The apparatus comprises a computer executing a program, a storage device for storing an electronically readable scanned image in an array of discrete elements of varying intensity having first and second boundaries and a principal portion therebetween. The apparatus also includes a programmable template capable of storing a predetermined image in a separate portion of the storage device and image recognition logic for verifying the scanned image against the predetermined image.
A further aspect of an apparatus having features of the invention includes a card manufacturing apparatus for putting indicia on a card and verifying the indicia. The apparatus comprises a plurality of card processing modules arranged to produce a card with indicia disposed thereon, and at least one of the modules is an indicia verification module using a varying intensity scanned image of a portion of the card.
An aspect of a method having features of the invention includes a method for verifying a scanned image. The method comprises scanning an image, converting the scanned image into an electronically readable format, storing the scanned image as an array of discrete elements of varying intensity, and analyzing the array of discrete elements of varying intensity against a predetermined image.
Yet another aspect of an article of manufacture having features of the invention includes a computer program on a storage medium. The article of manufacture comprises a programmable template capable of storing a predetermined image in a storage device and image recognition logic for verifying an image against the predetermined image.
Yet a further aspect of a method having features of the invention includes a computer system having a first personalization database, the first database including card image data and coordinates identifying the location of the card image data, a second template database, the second database including data for identifying an image, and a third database, the third database including scanned image data and coordinates identifying the location of the scanned data on the card. The method for verifying the scanned image on the card comprises loading a first personalization database from a storage device to a first area of memory in a computer; loading a second template database from a second area of memory in the computer; loading a third scanned image database into a third area of memory in the computer; and verifying that the scanned image data obtained from a card at a first coordinate location matches the personalization data identified at the coordinate location according to the second template data.
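The interplay of the personalization data, the template data, and the scanned image data can be pictured with a small Python sketch. Everything in it is a hypothetical illustration: the dictionary layouts, the coordinate tuples, the names `pixels_match` and `verify_card`, and the per-pixel tolerance test are assumptions and do not represent the verification algorithm of the invention.

```python
def pixels_match(image, reference, tolerance=40, min_fraction=0.9):
    """Return True if enough pixels agree within a gray-level tolerance."""
    a = [px for row in image for px in row]
    b = [px for row in reference for px in row]
    if not a or len(a) != len(b):
        return False
    close = sum(1 for x, y in zip(a, b) if abs(x - y) <= tolerance)
    return close / len(a) >= min_fraction


def verify_card(personalization, templates, scanned):
    """Check each expected character against the scan at the same coordinate.

    personalization: {(x, y): expected character}
    templates:       {character: reference gray-scale image}
    scanned:         {(x, y): scanned gray-scale image}
    """
    results = {}
    for coord, expected in personalization.items():
        reference = templates.get(expected)
        image = scanned.get(coord)
        results[coord] = (
            reference is not None
            and image is not None
            and pixels_match(image, reference)
        )
    return results


# Hypothetical 2x2 "characters" for a card that should read "AB".
templates = {"A": [[0, 255], [0, 0]], "B": [[0, 0], [255, 0]]}
personalization = {(10, 20): "A", (18, 20): "B"}
scanned = {
    (10, 20): [[15, 240], [20, 10]],   # close to template "A"
    (18, 20): [[10, 250], [30, 5]],    # looks like "A", but "B" is expected
}
results = verify_card(personalization, templates, scanned)
```

In this example the first coordinate verifies and the second does not, because the scan at that location resembles the wrong template.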