The present invention relates to the field of image processing. In one embodiment, the invention provides a method and apparatus for identifying and/or separating machine printed text and handwritten annotations in an image.
Many documents and their images contain both machine printed text and handwritten annotations. It would be useful to be able to identify regions of a scanned image that correspond to handwritten or handprinted annotations. For example, current OCR systems, as well as foreseeable future OCR systems, are not able to reliably recognize handwritten annotations in an image. When handwritten text is fed to a conventional OCR system, the system will often produce unusable results. The performance of such systems could be improved if handwritten regions were identified so that they could be excluded from OCR processing.
On the other hand, identification and retrieval of handwritten annotations on documents are sometimes important. For example, an image filing system could make use of handwritten notations by saving the annotations (and their coordinates) along with an OCR-processed version of the image. In general, if the handwritten annotations are identified as such, the system can save them as bitmap data, to be fed back to the user in a way that is appropriate to the application.
While they have met with some success, prior methods of separating handwritten annotations from machine printed text suffer from a variety of limitations. Some of the prior methods require equipment which is expensive, complex, and/or unreliable, while other techniques require significant amounts of computer memory, computer time, or the like. Some of the methods are less than reliable in detecting and separating handwritten annotations.
Accordingly, an improved method and apparatus are desired for detecting the presence of handwritten annotations and, if present, separating them from machine printed text in a document or image.