In the field of text image processing, attempts have been made to structure the components of textual information. In one such process, a certain portion of text is associated with another portion of the text so that a reader can select relevant information based upon the textual association. Text with such internal associations is called hypertext. To accomplish such associations or links, certain languages such as Hyper Text Markup Language (HTML) have been implemented.
Based upon the above described hypertext concept, one prior art attempt, Japanese Patent Publication Hei 7-98708, discloses a system which selects relevant information from original text according to a reader's characteristics. For example, if the reader holds a certain position in a company, a predetermined type of information is made available to him or her. By the same token, depending upon the age and sex of the reader, certain predetermined information is displayed.
Another prior art attempt discloses a method of generating HTML-based data. To generate the HTML data, text is scanned in as a textual image, and based upon certain predetermined characteristics of the textual image, a certain portion of the text is associated with another portion of the text. See "A Method of Automatically Generating HTML Based upon Image," Fujii et al., Technical Report, Academy of Electronic and Communication, OSF 95-23, IE 95-55 (1995).
Lastly, third and fourth prior art attempts disclose similar methods of generating an HTML-based link between a figure and a corresponding portion of text. To establish a link, the text image is first tentatively divided into text segments and figure segments based upon relative positions within each column of the text image. Within a figure segment, a caption area is determined, and its characters are recognized. A certain predetermined set of characters such as "FIG. 1" is detected in the recognized character data and is used as a label for the figure. The text segments are then searched for the label, and the corresponding text portion is linked to the figure. See "Generating Hyper Text From Printed Text for Electronic Library," Image Recognition and Understanding Symposium (MIRU), Ohira et al., (1996); "A Highly Applicable Method of Structuring Printed Text," Koyama et al., Proceeding at Academy of Electronics Information and Communication, (1995).
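By way of illustration only, the label-based linking described above may be sketched as follows. This is a minimal sketch and not the implementation of the cited references: it assumes that segmentation and character recognition have already produced plain caption strings and text segments, and the pattern LABEL_RE, the functions find_figure_labels and link_text_segment, and the anchor naming scheme ("fig1") are hypothetical names introduced here.

```python
import re
from html import escape

# Hypothetical predetermined label pattern: matches "FIG. 1", "Fig. 2", "FIG.3".
LABEL_RE = re.compile(r"\bFIG\.\s*(\d+)", re.IGNORECASE)


def find_figure_labels(captions):
    """Map each figure number found in recognized caption text to an anchor id."""
    labels = {}
    for caption in captions:
        match = LABEL_RE.search(caption)
        if match:  # captions without a recognizable label yield no entry
            labels[match.group(1)] = f"fig{match.group(1)}"
    return labels


def link_text_segment(segment, labels):
    """Wrap each label occurrence that has a known figure in an HTML anchor."""
    parts = []
    pos = 0
    for match in LABEL_RE.finditer(segment):
        parts.append(escape(segment[pos:match.start()]))
        target = labels.get(match.group(1))
        text = escape(match.group(0))
        # Link only when a figure with this label was found in a caption.
        parts.append(f'<a href="#{target}">{text}</a>' if target else text)
        pos = match.end()
    parts.append(escape(segment[pos:]))
    return "".join(parts)


captions = ["FIG. 1 Block diagram of the system", "A photograph without a label"]
labels = find_figure_labels(captions)
linked = link_text_segment("As shown in FIG. 1, the unit is connected.", labels)
```

In this sketch the second caption contains no predetermined label, so no link can be generated for that figure, and a reference in the text to a figure whose caption was not recognized is left as plain text.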
The above described prior art attempts generally fail to disclose a flexible automatic link generation method for text containing figures such as diagrams, tables, equations and drawings. Although the above described third and fourth prior art attempts disclose link generation between a figure and a corresponding portion of text, the link generation method requires a corresponding legend or caption area and relies upon predetermined labels or a predetermined set of characters. These requirements are not necessarily met by every text image.