Document processing and understanding is important for a variety of applications such as office automation, creation of electronic manuals, online documentation and annotation, and so forth. However, understanding and identifying graphical objects in large engineering drawings, which can often be of the order of 10000×10000 pixels, using traditional methods can be a very challenging and time-consuming task due to the sheer size of the images.
In spite of the use of electronic documents, there does not appear to have been any significant decrease in the use of paper-based documents. In fact, their volume appears to have increased, influenced by an apparently general preference for paper documents for reading and archiving purposes. Just as newspapers remained popular even after the introduction of radio and television broadcasting, and the Web, paper documents remain in widespread use.
However, storing and analyzing paper documents and, more importantly, retrieving them, are very cumbersome tasks. Electronic documents have the advantage that they can be easily manipulated and readily analyzed. Consequently, transformation of paper documents to electronic form has become an important goal. It is herein recognized that this is a nontrivial task, and it has been observed, such as in Tang et al., “Multiresolution analysis in extraction of reference lines from documents with gray level backgrounds”, IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 921-925, 1997, that it is almost impossible to develop a general system that can process all kinds of documents, such as technical reports, engineering drawings, books, journals, newspapers, etc. Rather, the research community normally focuses on a specific application so that one can make best use of the inherent properties and the major characteristics of the particular type of document under consideration.
In accordance with an aspect of the invention, a method for identifying graphical objects in large engineering drawings comprises the steps of inputting an original image into a computerized system for image processing; pruning the original image to provide a pruned image; inputting a template image; processing the template image to provide a processed template image; and computing matches between the pruned image and the processed template image.
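By way of illustration only, the sequence of steps recited above (input an original image, prune it, input and process a template, compute matches) may be sketched as follows. The sketch rests on assumptions that are not taken from the text: images are small binary rasters held as lists of rows, "pruning" and template "processing" are both stood in for by cropping to the bounding box of the ink pixels, and a match is scored as the fraction of agreeing pixels in an exhaustive sliding-window comparison. The function names crop_to_ink, compute_matches and identify are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary raster: 1 = ink, 0 = background


def crop_to_ink(img: Image) -> Tuple[Image, Tuple[int, int]]:
    """Crop to the bounding box of ink pixels; also return its (row, col) offset.

    Assumption: this stands in for both the pruning step and the template
    processing step, whose details are not given in the text above.
    """
    rows = [r for r, line in enumerate(img) if any(line)]
    if not rows:
        return [], (0, 0)  # no ink at all
    cols = [c for c in range(len(img[0])) if any(line[c] for line in img)]
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    cropped = [line[left:right + 1] for line in img[top:bottom + 1]]
    return cropped, (top, left)


def compute_matches(image: Image, template: Image,
                    min_score: float = 1.0) -> List[Tuple[int, int]]:
    """Slide the template over the image and keep positions scoring >= min_score.

    The score is the fraction of pixels on which the window and the
    template agree (an assumed, deliberately simple similarity measure).
    """
    if not image or not template:
        return []
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    hits = []
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            agree = sum(
                image[r + dr][c + dc] == template[dr][dc]
                for dr in range(th) for dc in range(tw)
            )
            if agree / (th * tw) >= min_score:
                hits.append((r, c))
    return hits


def identify(original: Image, template: Image,
             min_score: float = 1.0) -> List[Tuple[int, int]]:
    """Run the steps in order and report matches in original-image coordinates."""
    pruned, (top, left) = crop_to_ink(original)   # prune the original image
    processed, _ = crop_to_ink(template)          # process the template image
    hits = compute_matches(pruned, processed, min_score)
    return [(r + top, c + left) for r, c in hits]
```

Calling identify(drawing, template) returns the top-left corner, in the coordinates of the original image, of every window that meets the threshold. Cropping before matching reduces the area that the sliding window has to cover, which is the motivation for a pruning step when the drawing is very large; a practical system would need a far more selective pruning stage and a more tolerant similarity measure than the ones assumed here.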