The present invention relates to techniques for analyzing an image showing a graphical representation.
Takeda et al., U.S. Pat. No. 5,228,100, describe techniques for producing, from a document image, a form display with blank fields and a program to input data to the blank fields. As shown and described in relation to FIG. 1, a document processing apparatus includes a processor, an image input device for storing image data in memory, a printer for printing data from memory, and memory for programs and for data items. As shown and described in relation to FIG. 2, a document format recognition step recognizes an image of a document format to determine a format information item, a document construction step generates document content data associated with the document format, the system creates output data for the document data based on the resultant format and content data, and a document output step prints the output document data on a print form or stores it in a data file. FIG. 3 shows an example of a document form supplied from an image input device, and FIG. 4 shows a document example produced by a printer.
Takeda et al. show and describe in relation to FIGS. 8, 9-a, and 9-b how, in a physical structure recognition step, an area is subdivided into a plurality of blocks and the system judges whether a selected block has a type representing a table. The judgement may be made, for example, such that a block whose horizontal width and vertical height each exceed predetermined threshold values is determined to belong to a table. Such a block is subdivided into subblocks or subregions. The system selects one of the subblocks and recognizes whether it has a type of an area constituting a cell. If so, a recursive call is made to further recognize the physical structure of the area.
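By way of illustration only, the threshold judgement and recursive call described above might be sketched as follows. The names `Block`, `is_table`, and `recognize_area`, the caller-supplied functions `split_blocks`, `split_table`, and `is_cell`, the default threshold values, and the depth limit are all hypothetical and are not taken from Takeda et al.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Block:
    """A rectangular region of the document image (hypothetical structure)."""
    x: int
    y: int
    width: int
    height: int
    kind: str = "unknown"
    children: List["Block"] = field(default_factory=list)


def is_table(block: Block, min_width: int, min_height: int) -> bool:
    # A block is judged to be a table when both its width and its height
    # exceed the predetermined thresholds.
    return block.width > min_width and block.height > min_height


def recognize_area(
    area: Block,
    split_blocks: Callable[[Block], List[Block]],
    split_table: Callable[[Block], List[Block]],
    is_cell: Callable[[Block], bool],
    min_width: int = 200,   # assumed threshold, in pixels
    min_height: int = 100,  # assumed threshold, in pixels
    depth: int = 0,
    max_depth: int = 8,     # assumed guard against runaway recursion
) -> Block:
    """Label the blocks of an area, recursing into subblocks judged to be cells."""
    if depth >= max_depth:
        return area
    for block in split_blocks(area):
        area.children.append(block)
        if not is_table(block, min_width, min_height):
            block.kind = "other"
            continue
        block.kind = "table"
        for sub in split_table(block):
            block.children.append(sub)
            if is_cell(sub):
                sub.kind = "cell"
                # Recursive call: the cell is treated as a new area.
                recognize_area(sub, split_blocks, split_table, is_cell,
                               min_width, min_height, depth + 1, max_depth)
            else:
                sub.kind = "other"
    return area
```

In this sketch the subdivision and cell tests are passed in as functions because the source describes them separately, in relation to FIGS. 10-19; any concrete choice for them is left to the caller.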
Takeda et al. show and describe in relation to FIGS. 10-13 how an area is subdivided. FIGS. 14-a through 14-c show examples, with FIG. 14-a showing an area recognized after a first block division step on the original image in FIG. 3; FIG. 14-b showing the configuration of an area recognized when an area division step is performed on blocks judged to be associated with a table in FIG. 14-a; and FIG. 14-c showing an area recognized when block division is performed for areas judged to be cells in FIG. 14-b. Area type recognition is shown and described in relation to FIGS. 15-19. Construction element recognition, including character recognition, is shown and described in relation to FIGS. 24-33.
Takeda et al. show and describe in relation to FIGS. 41 through 66-b how a document's logical structure is recognized. As shown and described in relation to FIGS. 42 through 45-b, positions and sizes of characters, ruled lines, etc. written in a document are normalized to align the items with respect to a row and column. FIGS. 46 through 50-b and FIGS. 51-b through 66 show alternative logical structure recognition techniques.
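By way of illustration only, one simple way to align items with respect to rows and columns is to group nearby coordinates and replace each with its group mean. This is an assumed approach and not the normalization Takeda et al. describe; the function name `snap_coordinates` and the `tolerance` parameter are hypothetical.

```python
from typing import List, Sequence


def snap_coordinates(values: Sequence[float], tolerance: float) -> List[float]:
    """Replace each coordinate with the mean of its cluster.

    A cluster is a run of sorted values in which each consecutive gap is
    at most `tolerance`. The output keeps the order of the input.
    """
    if not values:
        return []
    order = sorted(range(len(values)), key=lambda i: values[i])
    snapped = [0.0] * len(values)

    def flush(cluster: List[int]) -> None:
        # Assign the cluster mean to every member of the cluster.
        mean = sum(values[i] for i in cluster) / len(cluster)
        for i in cluster:
            snapped[i] = mean

    cluster = [order[0]]
    for i in order[1:]:
        if values[i] - values[cluster[-1]] <= tolerance:
            cluster.append(i)
        else:
            flush(cluster)
            cluster = [i]
    flush(cluster)
    return snapped


# Left edges of five items; three fall near x=11 and two near x=51.
columns = snap_coordinates([10, 12, 50, 11, 52], tolerance=3)
```

Applying the same function to the top edges of the items would align them to rows in the same manner.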