With the advent of high-speed information processing systems, it is now possible to process large databases built from information originally collected on paper documents. For printed documents, automated character recognition systems have been developed that have a high probability of correctly reading printed data and converting that data into ASCII codes usable by the computing system. A problem arises where the characters on the documents are handwritten cursive characters.
Character recognition systems designed to recognize handwritten cursive characters are well known and have been under development for at least three decades. At this point, one can expect a handwriting recognition system to read approximately 50% of the cursive words whose images are scanned into the computing system. The unrecognized words must be manually examined and keyed into the computing system by operators. For low-volume systems handling a few hundred documents a day, this is not a problem. However, for large database systems dealing with hundreds of millions of documents, manual examination of the documents followed by key entry of the information on those documents is not an acceptable alternative.
For example, in a database system maintaining genealogical records, it would be desirable to scan images of census records and read the individual names on these records. Most of these census documents contain handwritten cursive records. Billions of documents have been collected over many centuries of keeping such records around the world. If, for example, there are documents containing two billion handwritten cursive census records, and if manually reading and keying in records can be done at the rate of two million records a year, it would take one thousand years to manually enter all of the handwritten cursive record information on these documents. Even applying the best cursive character recognition technology available at this time, which is 50% successful, the number of records to be manually entered is only cut in half. In this example, the time needed to manually enter the remaining records is therefore reduced only from one thousand years to five hundred years.
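The arithmetic in the example above can be checked with a short sketch. The function name and its parameters are hypothetical and serve only as illustration. The sketch assumes that the 50% recognition rate applies uniformly to records and that the manual keying rate stays constant.

```python
def years_of_manual_entry(total_records, records_per_year, recognition_rate=0.0):
    """Years needed to key in the records that automatic recognition misses.

    Illustrative helper only: assumes a uniform recognition rate and a
    constant manual keying rate.
    """
    remaining_records = total_records * (1.0 - recognition_rate)
    return remaining_records / records_per_year


# Figures taken from the example: two billion records, two million keyed per year.
baseline_years = years_of_manual_entry(2_000_000_000, 2_000_000)
assisted_years = years_of_manual_entry(2_000_000_000, 2_000_000, recognition_rate=0.5)

print(baseline_years)  # 1000.0
print(assisted_years)  # 500.0
```

Under these assumptions, halving the number of unrecognized records only halves the manual effort, so the workload remains impractical at this scale.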