1. Field of Utilization
This invention relates to electronic filing systems for computer-aided management and retrieval of documents and more particularly to a system and a method for generating an electronic data structure corresponding to a tree or a set of trees representing a model that can be used for understanding the layout of a document or the like.
2. Prior Art
Electronic filing systems for computer-aided management and retrieval of documents have recently become available for use in offices. Current systems, however, essentially store documents just in the form of a picture or image of the document. As a scheme for storing information, therefore, there is no substantial difference between these systems and the conventional paper-based filing. In order to establish an advanced filing system, it is essential to be able to extract specific information on the contents of a document from a document image.
If a system is intended to function as a database storing the entire contents of documents, coding of the whole content of each document is necessary. Even if the entire contents are not to be stored, entry of coded information for searching titles, publication names, key words, and so on is indispensable. OCR techniques for printed characters are expected to be a tool to support such coding operations. However, in order to extract information for retrieval from a document image by OCR, it is essential to know the position in the document where the necessary information is printed. Additionally, the usage of an electronic filing system varies with application. In some cases, coding is required for only a portion of a document, for example, an abstract of a paper. If an operator has to specify each of the portions to be recognized every time a document is entered in a database, efficient document entry cannot be expected. Therefore, there is a need for a technology to provide efficient computer-aided analysis of a document image and automatic understanding of its layout, i.e., the arrangement of the information therein. Layout understanding includes segmenting a document image into regions of characters, illustrations, photographs, tables, graphs and other image objects, and analyzing the structures of the respective regions. In electronic filing, it is important to interpret character strings as titles, authors, bodies and so on after they are roughly segregated from other regions such as row segments.
The objects of a document image, in general, have a hierarchical structure based on their IS-PART-OF relationships. For example, a document may consist of pages, with each page consisting of a title, author's name, and body, and the body consisting of some columns. The hierarchical structure may be represented by a tree wherein an intermediate node corresponds to one of the objects of a document image, such as the title and so on, and a terminal node (leaf) corresponds to a character string.
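By way of illustration only, the IS-PART-OF hierarchy described above can be sketched as a small tree in Python. The class name Node, the function leaves, and the leaf labels are hypothetical names chosen for this sketch; they are not taken from any of the systems discussed here, and the sketch is not the data structure of the invention.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One object of a document image (hypothetical illustration)."""
    name: str
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # A terminal node has no children; here it stands for a character string.
        return not self.children


def leaves(node: Node) -> List[str]:
    """Collect the names of all terminal nodes, left to right."""
    if node.is_leaf():
        return [node.name]
    result: List[str] = []
    for child in node.children:
        result.extend(leaves(child))
    return result


# A document consisting of one page; the page consists of a title, an
# author's name, and a body; the body consists of two columns.
document = Node("document", [
    Node("page", [
        Node("title", [Node("title-string")]),
        Node("author", [Node("author-string")]),
        Node("body", [
            Node("column", [Node("column-1-string")]),
            Node("column", [Node("column-2-string")]),
        ]),
    ]),
])

print(leaves(document))
```

Each intermediate node (document, page, title, author, body, column) is an object of the document image, and each leaf stands for a character string, as in the description above.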
A method has been proposed whereby a user specifies a tree-structured model to provide knowledge on the layout, and the system performs analysis in accordance with the model, as disclosed by A. Dengel and G. Barth, "High Level Document Analysis Guided by Geometric Aspects," Int. Journal of Pattern Recognition and Artificial Intelligence 2, 4, pp. 641-655, 1988. As a method for generating a data structure corresponding to a tree in memory, this is an advantageous and feasible method, from the viewpoint of the burden on users, since a user merely describes a tree in a readable form and the system interprets the resultant description so as to generate the data structure in memory. One such example is found in G. Nagy and S. Seth, "Hierarchical Representation of Optically Scanned Documents," Proc. 7th ICPR [Montreal], 347-349, which proposes an attribute grammar for a user to describe the skeleton of a tree and the attributes to be given to each node in the tree. However, this forces users to do a kind of programming and is not appropriate for end users describing a model.
Since there are various types of layouts corresponding to various kinds of documents, it is impossible to prepare models for all types of layouts beforehand, and end users are therefore required to describe and modify models. Consequently, easy description and modification of models is an important feature in a system for document image analysis for layout understanding. This requirement of easy description must be attained while maintaining sufficient representation capability to permit a user to cope with various types of layouts. The foregoing method of describing a tree according to an attribute grammar does not satisfy the requirement.
In order for a user to describe a tree, not limited to a layout model, in an understandable form, a graphics editor is generally used for editing a flowchart representing a tree on a display. However, the computers on which graphics editors are available are limited. In addition, although graphics editors are suitable for describing the skeleton of a tree, that is, the hierarchy of nodes and the sequence of child nodes linked to a particular node, they are not suitable for describing the attributes given to the respective nodes.
A system for drawing a tree-structure chart is disclosed in JA PUPA 2-589818. This system is directed to charts representing tree structures of program modules. It permits entry of information for drawing a tree-structure chart through a terminal which does not support graphics functions. Since call relationships among modules constitute a tree structure with a main program as its root, the drawing system is a kind of tree generating system. With this system, a user describes information on call relationships among modules by using a text editor. FIG. 1 shows an example of how the information on a tree-structure chart is described. Branching of the tree is indicated by key words such as IF, THEN, ELSE and so on in the text. The system interprets the text including these key words, and generates a tree-structure chart as shown in FIG. 2. FIGS. 1 and 2 are drawings shown in the Japanese publication referred to above.
However, since a special key word is assigned for each branching of a tree, end users are required to understand the meaning of the key words and become skilled in their usage.
Such a method of description is familiar to programmers, but it is not appropriate for end users. It would take much time for end users new to a keyboard to enter key words, and they would be likely to make spelling and grammatical errors. In addition, if a complicated tree consisting of a number of hierarchy levels is expressed with this description method, then the resultant description will be complicated, and the entire hierarchical structure of the tree will be difficult to understand from the resultant description.
Problem to be Solved
The problem shared by the foregoing description methods, in addition to the stated difficulties, is that they can express only one specific tree, that is, they cannot express a set of trees. However, the expression of a set of trees is desirable for some applications. As to the first page of a paper, for example, the existence of objects such as titles and authors is mandatory, but that of footnotes is not. A paper, generally containing two columns on each page, may contain only one column on the last page. To cope with such variations, the conventional methods require separate descriptions for a tree with a footnote and a tree without it, and for a tree containing one column node and a tree containing two column nodes. Accordingly, the burden is heavy for users attempting to describe a tree structure for a document with the conventional methods.
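The burden described above can be made concrete with a short Python sketch. It assumes, purely for illustration, a single hypothetical page on which both variations may occur (an optional footnote and either one or two columns), and it lists every specific tree that would have to be described separately. The function name page_variants and the flat (root, children) representation are assumptions of this sketch, not part of any conventional system or of the invention.

```python
from itertools import product
from typing import List, Tuple

# A tree is simplified here to a root name plus an ordered list of child names.
Tree = Tuple[str, List[str]]


def page_variants() -> List[Tree]:
    """Enumerate each specific tree needed when only single trees can be described."""
    variants: List[Tree] = []
    # Two independent variations: footnote present or absent, one or two columns.
    for has_footnote, column_count in product([False, True], [1, 2]):
        children = ["title", "author"] + ["column"] * column_count
        if has_footnote:
            children.append("footnote")
        variants.append(("page", children))
    return variants


for root, children in page_variants():
    print(root, children)
```

With two independent variations, four separate trees result, and each further optional or repeatable object multiplies the count again. A description able to express a set of trees would cover all of these cases at once.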
It is therefore an object of the present invention to provide a system and a method making it possible for a user to easily describe a tree to be input to memory and capable of generating a data structure representing the tree in memory in accordance with the resultant description.
Another object of the invention is to provide a system and a method making it possible for a user to easily understand a tree to be generated in memory and the resultant description of the tree and capable of generating a data structure representing the tree in memory in accordance with the resultant description.
A further object of the invention is to provide a system and a method making it possible for a user to describe a tree to be input to memory by using a text editor and capable of generating a data structure representing the tree in memory in accordance with the resultant description.
Another object of the invention is to provide a system and a method making it possible for a user to describe a set of trees to be input to memory and capable of generating a data structure representing the set of trees in memory in accordance with the resultant description.