The present invention relates to the field of image processing. In one embodiment the invention provides a method and apparatus for identifying and/or separating type or print styles.
Morphological image processing techniques are well known and have been used in a wide variety of applications. Such techniques have been used in, for example, shape identification. Such morphological techniques use known morphological steps such as "open," "close," "erode," and the like.
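By way of illustration only, the "erode," "open," and "close" steps named above can be sketched for a binary image as follows. This is a minimal sketch and not the method of the invention. The function names, the list-of-rows image representation, the choice of a 3x3 solid structuring element, and the treatment of pixels outside the image as OFF are all assumptions made for the example.

```python
# Illustrative sketch only: binary images are lists of rows of 0/1 values.
# All names here are hypothetical; pixels outside the image count as OFF.

def _offsets(se):
    """Offsets of the ON cells of structuring element `se` from its center."""
    cy, cx = len(se) // 2, len(se[0]) // 2
    return [(j - cy, i - cx)
            for j, row in enumerate(se)
            for i, v in enumerate(row) if v]

def erode(img, se):
    """A pixel stays ON only if every ON cell of `se` lands on an ON pixel."""
    h, w = len(img), len(img[0])
    offs = _offsets(se)
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy, dx in offs))
             for x in range(w)] for y in range(h)]

def dilate(img, se):
    """A pixel turns ON if any ON cell of the reflected `se` lands on an ON pixel."""
    h, w = len(img), len(img[0])
    offs = _offsets(se)
    return [[int(any(0 <= y - dy < h and 0 <= x - dx < w and img[y - dy][x - dx]
                     for dy, dx in offs))
             for x in range(w)] for y in range(h)]

def opening(img, se):
    """Erosion followed by dilation: removes features smaller than `se`."""
    return dilate(erode(img, se), se)

def closing(img, se):
    """Dilation followed by erosion: fills gaps smaller than `se`."""
    return erode(dilate(img, se), se)

# Assumed 3x3 solid structuring element.
SE_3X3 = [[1, 1, 1],
          [1, 1, 1],
          [1, 1, 1]]
```

In this sketch, opening an image with SE_3X3 removes an isolated ON pixel while preserving a 3x3 block of ON pixels, and closing fills a one-pixel hole inside such a block. Because out-of-image pixels are treated as OFF, closing can also trim ON pixels that touch the image border.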
In document processing it is often desirable to locate and, in some cases, separate text that is distinguished from the remaining text by its type style, for example a bold or italic type style. At the present time, text in a bold or italic type style is generally identified manually. This process is exceedingly tedious and time-consuming.
Also, many documents and their images contain both machine printed text and handwritten annotations. It would be useful to be able to identify regions of a scanned image that correspond to handwritten or handprinted annotations. For example, current OCR systems, as well as foreseeable future OCR systems, are not able to reliably recognize handwritten annotations in an image. When such text is fed to a conventional OCR system, the system will often produce unusable results. The performance of such systems could be improved if handwritten regions were identified in advance and excluded from the OCR process.
On the other hand, identification and retrieval of handwritten annotations on documents are sometimes important. For example, an image filing system would make use of handwritten notations by saving the annotations (and their coordinates) along with an OCR-ized version of the image. In general, if the handwritten annotations are identified as such, the system can save them as bitmap data, to be fed back to the user in a way that is appropriate to the application.
While meeting with some success, prior methods of separating text have suffered from a variety of limitations. Some of the prior methods require equipment which is expensive, complex, and/or unreliable, while other techniques require significant amounts of computer memory, computer time, or the like. Some of the methods are less than reliable in detecting and separating handwritten annotations.
From the above it is seen that an improved method and apparatus are desired for identifying special type styles such as bold and italic type styles, or for detecting the presence of handwritten annotations and, if present, separating them from machine printed text in a document or image.