The proliferation of scanning technology, combined with ever-increasing computational processing power, has led to many advances in the area of document analysis systems. These systems may be used to extract semantic information from a scanned document, for example by means of Optical Character Recognition (OCR) technology. This technology is used in a growing number of applications such as automated form reading. Document analysis systems can also be used to improve compression of an electronic representation of the document by selectively using an appropriate compression method depending on the content of each part of a page of the document. Improved document compression lends itself to applications such as archiving and electronic distribution.
A significant proportion of office documents are generated using structured text/graphics editing applications such as Microsoft® Word, Microsoft® PowerPoint®, and the like. In addition to formatted text editing, these text/graphics editing applications include basic figure drawing tools and options. An important class of document analysis applications processes a bitmap representation of a document to generate an electronic version of the document that can be viewed and edited using such editing applications. Such document analysis applications will be referred to as “scan-to-editable” document analysis applications.
The figure drawing options in a typical structured text/graphics editing application include freeform line drawing, template shapes and connectors (i.e., dynamic line objects that connect to and/or between template shapes within a document). The text/graphics editing applications may also include colouring, filling, layering and grouping options for sets of objects.
Freeform line drawing can be used to draw open and closed objects with straight or curved sections by defining a set of points along the path of the object. A closed N-point polygon may be drawn by defining an ordered set of vertices. For example, FIG. 1 shows a generalised line polygon 100 including vertices (e.g., 101) which are represented by black squares on the line polygon 100. Freeform drawn objects may be filled or empty, with a range of possible fill options including solid colours, transparency, blends and patterns.
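By way of illustration only, a freeform drawn object of the kind described above may be sketched as an ordered set of vertices together with a flag indicating whether the path is closed. The names `FreeformPath`, `vertices`, `closed` and `segments` are hypothetical and are not taken from any particular editing application. Only straight sections are modelled here; curved sections are omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class FreeformPath:
    """Hypothetical freeform object: an ordered list of vertices."""
    vertices: List[Point]
    closed: bool = False  # True for a closed N-point polygon

    def segments(self) -> List[Tuple[Point, Point]]:
        """Return the straight sections joining consecutive vertices."""
        pairs = list(zip(self.vertices, self.vertices[1:]))
        # A closed polygon has one extra section joining the last
        # vertex back to the first.
        if self.closed and len(self.vertices) > 2:
            pairs.append((self.vertices[-1], self.vertices[0]))
        return pairs


# A closed 4-point polygon and an open 3-point path.
square = FreeformPath([(0, 0), (4, 0), (4, 3), (0, 3)], closed=True)
open_path = FreeformPath([(0, 0), (1, 1), (2, 0)])
```

Under this assumed representation, a closed N-point polygon has N straight sections, whereas an open path defined by N points has N - 1.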
Many commonly used geometric shapes can be created using template shapes. A user may prefer to use a template shape rather than drawing the shape using freeform lines as this option can be faster, more accurate in terms of representation of the desired shape, and easier to edit at a later time. The well-known Microsoft® AutoShapes set includes a number of examples of template shapes which can be manipulated within editing environments such as Microsoft® Word and PowerPoint®. Other template shapes may be found in OpenOffice™ editing applications such as the Writer™ and Impress™ applications.
Template shapes are commonly found in office documents, in particular in figures and flow charts. Some example template shapes 405 to 460 are shown in FIG. 4. A range of fill options is generally available, as shown in FIG. 4, including unfilled line objects 405 to 420, solid filled objects of the same colour as the line 425 to 440 or of a different colour from the line 445 to 460, transparency, blends or patterned fills.
As described above, connectors are dynamic line objects that connect template shapes within a document. Such line objects include straight connectors and elbow connectors. The line objects forming connectors may have arrowheads or other line end effects. When a template shape is edited or moved, any connectors connected to the template shape are updated to match the change and remain connected, as is illustrated in FIG. 2. FIG. 2(a) shows three shapes (a rounded rectangle 205, a diamond 215 and a can 225) connected by a straight connector 210 with an arrowhead at both ends and an elbow connector 220 with no end effect. FIG. 2(b) shows the same set of shapes 205, 215, 225 and connectors 210 and 220. However, the diamond shape 215 has been moved upwards relative to the other shapes 205 and 225. Both of the connectors 210 and 220 have been updated according to this move: the straight connector 210 is now angled and the elbow connector 220 has flipped in direction. FIG. 2(c) shows the same set of shapes 205, 215, 225 and connectors 210, 220. However, in FIG. 2(c) the diamond 215 has been moved down and right. The connectors 210 and 220 have been updated accordingly, and the elbow connector 220 includes an additional two turns in order to connect the two shapes 215 and 225 without overlap. Examples of line objects forming connectors may be found among the Microsoft® AutoShapes.
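The dynamic behaviour described above may be sketched, purely as an illustration, by a connector that stores references to the shapes it joins rather than fixed coordinates, so that its end points are recomputed whenever a shape moves. The classes `Shape` and `StraightConnector`, and the choice of attaching the connector to shape centres, are assumptions made for brevity and do not reflect how any particular editing application attaches connectors.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Shape:
    """Hypothetical template shape described by its bounding box."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def centre(self) -> Tuple[float, float]:
        return (self.x + self.width / 2, self.y + self.height / 2)


@dataclass
class StraightConnector:
    """Hypothetical connector holding references to two shapes."""
    start: Shape
    end: Shape

    def endpoints(self):
        # End points are derived from the shapes on demand, so the
        # connector follows any shape that is moved or resized.
        return (self.start.centre(), self.end.centre())


rect = Shape("rounded rectangle", 0, 0, 4, 2)
diamond = Shape("diamond", 10, 0, 2, 2)
link = StraightConnector(rect, diamond)

before = link.endpoints()  # horizontal connector
diamond.y -= 5             # move the diamond upwards
after = link.endpoints()   # connector is now angled
```

An elbow connector would additionally require a routing step to choose its turns, which is not modelled in this sketch.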
In some structured text/graphics editing applications, connectors can only connect to template shapes at a finite set of pre-defined connection points. In general, these pre-defined connection points are located symmetrically and at outward projecting points around the border of a given template shape. FIG. 3 shows a set of unfilled template shapes 301 to 321. Each of the template shapes (e.g., 301) comprises a corresponding set of connection points which are represented as solid boxes located on the boundaries of the template shape. The set of shapes in FIG. 3 includes a rectangle (301), round rectangle (302), ellipse (303), triangle (304), parallelogram (305), trapezoid (306), hexagon (307), plus sign (308), star (309), arrow (310), home plate (311), balloon (312), plaque (313), chevron (314), 8-point seal (315), 16-point seal (316), 32-point seal (317), wedge rectangle callout (318), wedge round rectangle callout (319), wedge ellipse callout (320) and wave (321).
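As a minimal sketch of a finite set of pre-defined connection points, the following assumes that a rectangle exposes four connection points, one at the midpoint of each side, and that a connector attaches to whichever point lies nearest to a given target position. Both the placement of the points and the nearest-point rule are assumptions made for illustration; the function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def rectangle_connection_points(x: float, y: float,
                                width: float, height: float) -> List[Point]:
    """Assumed connection points: the midpoint of each side."""
    return [
        (x + width / 2, y),            # top side
        (x + width, y + height / 2),   # right side
        (x + width / 2, y + height),   # bottom side
        (x, y + height / 2),           # left side
    ]


def nearest_connection_point(points: List[Point], target: Point) -> Point:
    """Pick the connection point closest to the target position."""
    return min(points, key=lambda p: math.dist(p, target))


points = rectangle_connection_points(0, 0, 4, 2)
# A target lying to the right of the rectangle selects the
# connection point on the right-hand side.
chosen = nearest_connection_point(points, (10, 1))
```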
Existing scan-to-editable applications tend to be biased towards processing text and tables and typically do not deal effectively with the processing of figures. Such scan-to-editable applications may use optical character recognition (OCR) processing to recognise characters and simple symbols from an image of a document. Many basic scan-to-editable applications simply embed the parts of the image that are not recognised as text or tables as a bitmapped image, typically in a compressed format and sometimes at a low resolution. Such basic scan-to-editable applications are clearly disadvantageous to a user as the embedded bitmapped images cannot be readily edited using geometric drawing tools, and the overall file size can also be large.
Other applications employ vectorisation methods to find line objects and solid filled objects. The line objects and solid filled objects may be represented as freeform line drawings in the output rather than instances of specific template shapes. This is disadvantageous to the user as specific editing options defined for a template shape will not be available, limiting the usefulness of a corresponding editable electronic version. Such applications also provide less accurate semantic information regarding the content of a document and may therefore be less useful in database applications that rely on this data.
There are a number of existing methods that may be used for the recognition of graphical elements. Some methods recognise entire classes of individual elements or combinations of elements, where different objects recognised as being in the same class are related by an unspecified range of distortions. Such methods are useful for recognising objects which may come in a number of different shapes and with unknown distortions (e.g., human face recognition applications). However, these methods do not provide a means of recognising and specifying a particular parametrisation of a pre-defined shape representation for output in an editable document. Other methods recognise particular shapes and specify a limited set of linear transformations but do not deal with the customised combinations of transformations that may be performed on the shapes within a typical template shape library.
A need therefore exists for an improved method of creating an electronic version of a document from a scanned image input.