1. Field of the Invention
The invention relates to the field of data isolation preparatory to data classification, particularly for optical character recognition (OCR).
2. Prior Art
Numerous commercial systems are available for recognizing characters and the like. In one class of these systems, alphanumeric characters are optically scanned and the resultant information is processed so as to recognize the characters. In some of these systems, a hand-held wand or reader is manually moved across printed characters, and the characters are optically scanned by the reader.
The recognition of characters scanned by hand-held readers is particularly difficult since precise registry of the characters relative to the optical system of the scanner is not possible because of the manual movement of the reader. For example, there is a tendency for the operator to cause the reader to drift above and below the characters, particularly when "reading" a long line.
In processing the data received from such readers, the location of the characters within the field of view of the reader is not known, making it more difficult to recognize the characters. In some cases, large fields of data which contain no characters are processed. In other cases, complicated circuitry is used to find the locations of the characters.
As will be seen, the present invention teaches a matrix extractor which isolates, from the scanned fields, those areas having characters or the like. Only data within these areas is extracted for processing. This reduces the total amount of data which must be processed and simplifies the recognition process.
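By way of illustration only, the general idea of isolating a character-bearing area from a scanned field may be sketched in software as follows. The sketch assumes the field is available as a binary array in which 1 denotes a dark picture element, and it uses a simple bounding-rectangle approach; the names `character_bounds` and `extract_matrix` are hypothetical, and the sketch is not a description of the extractor taught by the invention.

```python
from typing import List, Optional, Tuple

# Assumption: a scanned field is a rectangular grid, 1 = dark element, 0 = background.
Field = List[List[int]]


def character_bounds(field: Field) -> Optional[Tuple[int, int, int, int]]:
    """Return inclusive (top, bottom, left, right) of all dark elements, or None."""
    # Rows that contain at least one dark element.
    rows = [r for r, row in enumerate(field) if any(row)]
    if not rows:
        return None  # the field contains no character data
    # Columns that contain at least one dark element.
    cols = [c for c in range(len(field[0])) if any(row[c] for row in field)]
    return rows[0], rows[-1], cols[0], cols[-1]


def extract_matrix(field: Field) -> Field:
    """Return only the sub-area of the field that bounds the dark elements."""
    bounds = character_bounds(field)
    if bounds is None:
        return []
    top, bottom, left, right = bounds
    return [row[left:right + 1] for row in field[top:bottom + 1]]
```

In this sketch, a field in which the character has drifted toward one edge yields the same extracted area as a field in which it is centered, so that only the extracted area, rather than the whole field, need be passed on for recognition.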
OCR systems employing hand-held readers are described in U.S. Pat. Nos. 4,075,605 and 4,180,799.