Document authors often want to give a reader an efficient way to find a topic within a document. One method is to provide an index containing a list of key topics. A reader can use the index to identify a page number or location within the book or article in order to obtain more detailed information. In the prior art, the author or editor of the information must manually find the key topics that might be of interest to readers, and then generate the index entry that points the reader to the place where the topic is more fully explained.
Word processor programs facilitate index generation by allowing an author to mark an index point (often by entering a special-purpose token into the text). The word processor can thereafter automatically generate an index by extracting every index point into an index that reflects the current page number of each respective index point. This facility greatly reduces the effort of tracking how page numbers might change as text is added or deleted. However, considerable manual effort remains in identifying which topics should be included in the index and in adding the index-point tokens.
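The token-based index generation described above can be sketched as follows. This is a minimal illustration, not the method of any particular word processor: the `{{idx:topic}}` token syntax, the representation of a document as a list of page texts, and the names `build_index` and `strip_tokens` are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical index-point token syntax: {{idx:topic}}. Real word processors
# use their own field codes; this form is assumed for illustration only.
INDEX_TOKEN = re.compile(r"\{\{idx:([^}]+)\}\}")


def build_index(pages):
    """Map each marked topic to the sorted page numbers where it is marked.

    `pages` is a list of page texts; page numbers start at 1.
    """
    index = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        for match in INDEX_TOKEN.finditer(text):
            index[match.group(1).strip()].add(page_number)
    # Sort topics alphabetically and page numbers numerically.
    return {topic: sorted(nums) for topic, nums in sorted(index.items())}


def strip_tokens(text):
    """Remove index-point tokens so they do not appear in the printed text."""
    return INDEX_TOKEN.sub("", text)
```

Because the index is rebuilt from the tokens each time, page numbers stay current when text is added or deleted. The author must still decide where every token goes, which is the manual effort the passage describes.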
The term `document` is defined in a broad sense as text and other information stored in one or more computer files. Documents include everything from simple short text documents to large computer multi-media databases.
Prior-art FIG. 1 is a conceptual drawing of a hyperlink. A hyperlink is a link between hyperlink source 72, which is located in a first data file, and hyperlink destination 74, which can be located in the same or in a second data file. Hyperlink source 72 and hyperlink destination 74 are typically displayed on computer screen 52 at different points in time. Three elements that comprise a hyperlink are:
(a) hyperlink source 72, which specifies a key topic to be displayed in a hot area. A `hot area` is a portion of the display screen that, if pointed at and clicked on, will cause the computer to execute computer code such as a hyperlink program 79 which hyperlinks (i.e., causes a branch) to a hyperlink destination 74. (Typically, the hot area is visually indicated by highlighting, such as color, a bold font, blinking or underlining, but it may contain an icon, picture graphic, or other visual indication.)
(b) hyperlink destination 74, which includes information, (e.g., destination location specification 73) specifying the location of the text or picture that will be displayed if the hyperlink is taken. Destination location specification 73 for hyperlink destination 74 is generally stored in the data file containing hyperlink source 72. Hyperlink destination 74 itself can be either in the same or a different data file as hyperlink source 72.
(c) hyperlink computer code 79 that, in response to a `viewer action`, causes hyperlink destination 74 to be displayed in the context in which it appears. Typically, that `viewer action` comprises a viewer clicking on the hyperlink source 72. `Clicking` is defined as pointing with and activating pointing device 54 at a hot area, such as hyperlink source 72. A pointing device can include a mouse, joystick, or other device that is used to select a location on a computer screen and is activated by, for example, depressing a switch such as a mouse button 59, or otherwise indicating that the computer should execute hyperlink code 79. Upon activation, hyperlink code 79 uses destination location specification 73 to locate hyperlink destination 74, and to display that information.
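The three elements listed above can be modeled roughly as two records and one function. The sketch below is illustrative only. The class and function names, the use of a character offset as the destination location, and the in-memory `files` mapping are assumptions, not details taken from FIG. 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DestinationSpec:
    """Plays the role of destination location specification 73."""
    file_name: str
    offset: int  # assumed: character offset of the destination in the file


@dataclass(frozen=True)
class HyperlinkSource:
    """Plays the role of hyperlink source 72: a key topic shown in a hot area.

    The destination spec is stored with the source, as in element (b).
    """
    key_topic: str
    destination: DestinationSpec


def follow_link(source, files, context=40):
    """Plays the role of hyperlink code 79.

    Uses the destination spec to locate the destination and returns it with
    up to `context` characters on either side, standing in for displaying
    the destination in the context in which it appears.
    """
    text = files[source.destination.file_name]
    start = max(0, source.destination.offset - context)
    end = min(len(text), source.destination.offset + context)
    return text[start:end]
```

In this sketch, a click on the hot area would correspond to calling `follow_link` with the clicked source and the set of available data files. The destination may be in the same file as the source or in a different one, since only the file name in the spec changes.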
Another technology relevant to the present invention is the automatic semantic analysis of text to identify and extract key topics for indexing. One exemplary kernel incorporating this semantic analysis technology, the Syntactica Engine available from Iconovex Corporation, assignee of the present invention, performs a syntactic analysis of the text of a document and then uses a "lexicon dictionary" (also called a "lexicon"), which specifies semantic weights assigned to the words in the text, reflecting their value as index entries. A computer program uses the synthesized values, or semantic weights, for words to qualify phrases as key topics. A user is able to specify a threshold value so that the computer program selects as key topics only those phrases whose values are greater than or equal to the specified threshold. Known semantic-analysis computer programs are not available as integrated features in word-processing programs.
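The threshold step can be illustrated with a small sketch. The text does not say how word weights are combined into a value for a phrase, so the example assumes the mean of the word weights. The function names, the lexicon format (a plain word-to-weight mapping), and the default weight for unknown words are likewise hypothetical, and none of this should be read as the Syntactica Engine's actual method.

```python
def phrase_value(phrase, lexicon, default_weight=0.0):
    """Combine per-word semantic weights into one value for a phrase.

    Assumption: the phrase value is the mean of its word weights. Words
    missing from the lexicon get `default_weight`.
    """
    words = phrase.lower().split()
    if not words:
        return 0.0
    return sum(lexicon.get(word, default_weight) for word in words) / len(words)


def select_key_topics(phrases, lexicon, threshold):
    """Keep only phrases whose value is greater than or equal to `threshold`."""
    return [p for p in phrases if phrase_value(p, lexicon) >= threshold]
```

Raising the threshold yields a shorter, more selective list of key topics, and lowering it yields a longer one.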
A significant problem with generating information for word-processor-based index-generation systems is that the author must review the material to be indexed, must identify key topics to which to index, and must set up the index-point tokens. This is a time-consuming and labor-intensive process.