The present invention relates generally to image analysis and to the recognition of whole words, phrases, or numbers. More particularly, it relates to Fourier transformation and pattern recognition of offline text without requiring intraword character segmentation.
OCR automation of business processes depends on a machine's ability to recognize its input (words, phrases, or numbers) and to act according to preprogrammed instructions. Unfortunately, most inputs are not in a form compatible with automation. The postal service is an example. The average person does not address a letter with a bar code label. People address their mail with words written on one side of the envelope or package. No two people have the same handwriting, and no one writes exactly the same way each time. In addition, the characters within words and numbers often cannot be separated, because they touch one another or because the print is broken or incomplete. The human mind recognizes most handwriting without difficulty, but a computer system capable of doing the same has yet to be built.
U.S. Pat. No. 4,764,973 provides preliminary background. It describes the rudimentary steps of inputting a piece of document text (whether on a printed page or a parcel package), scanning the text, and recognizing the scanned information. However, the algorithm of that patent was not capable of handling a large dictionary with many font styles. It was sufficient only as a testbed for research and for use in studying dyslexic readers. The background section of that patent discusses the problem further and includes references to background information and prior studies in this field.