1. Field of the Invention
The present invention relates to an index preparing apparatus and a method for preparing an index from a document, to a document search apparatus and a method for searching for a document containing an entered search character string, and to a document search system and a storage medium (memory medium).
2. Related Background Art
A document search apparatus generally presents, as the result of search, documents containing a given search key. Each result is given a score according to how well it matches the search condition, and documents with high scores are presented as the result of search.
However, when the content of a retrieved document is examined with the above-mentioned conventional apparatus, for example in searching documents on the WWW, the entire document is presented. It is therefore often difficult to find the portion matching the search condition when the document is long or covers plural subjects.
Documents on the WWW often contain plural pieces of information in a single document and are often too long to be observed at a glance. Therefore, in order to obtain the desired information from a document obtained as the result of search, it is necessary to look for the portion matching the search condition.
The desired information is difficult to find if the retrieved document also contains information not matching the search condition.
Also, when the document is observed on equipment with a small display area, such as a mobile terminal, the desired information alone should be presented, since the amount of information observable at a glance is limited.
In consideration of the foregoing, an object of the present invention is to provide a document search apparatus and a method therefor, capable of dividing an HTML document into segments based on the structure and content thereof, and presenting a segment containing the given search key, thereby providing a portion of the document matching the search condition as the result of search.
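The segment-based presentation described above can be illustrated with a minimal sketch. It assumes that a fixed set of HTML tags marks segment boundaries and that matching is a simple case-insensitive substring test. The tag set and the names `BOUNDARY_TAGS`, `SegmentSplitter`, `split_segments` and `matching_segments` are hypothetical and are not taken from the specification.

```python
from html.parser import HTMLParser

# Assumption: these tags start a new segment; the specification does not list them.
BOUNDARY_TAGS = {"h1", "h2", "h3", "p", "hr", "li"}


class SegmentSplitter(HTMLParser):
    """Collects the text of an HTML document, cut at boundary tags."""

    def __init__(self):
        super().__init__()
        self.segments = []
        self._buffer = []

    def _flush(self):
        # Normalize whitespace and keep only non-empty segments.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.segments.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in BOUNDARY_TAGS:
            self._flush()

    def handle_data(self, data):
        self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()


def split_segments(html):
    """Divide an HTML document into text segments."""
    splitter = SegmentSplitter()
    splitter.feed(html)
    splitter.close()
    return splitter.segments


def matching_segments(html, search_key):
    """Return only the segments that contain the search key."""
    key = search_key.lower()
    return [seg for seg in split_segments(html) if key in seg.lower()]
```

With such a sketch, only the matching portion of a long page would be presented, rather than the entire document.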
Another object of the present invention is to provide a document search apparatus and a method therefor, capable of starting from the search of a fine unit such as a segment and enlarging the unit of search according to the number of the results of search, thereby realizing a document search capable of automatically utilizing plural search units in different manners.
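As an illustration of enlarging the unit of search, the following sketch first looks for segments containing all of the search keys and falls back to whole documents when too few segments match. The threshold `min_hits`, the data layout (a dictionary mapping a document identifier to its list of segments) and the function names are assumptions made for this example only.

```python
def contains_all(text, keys):
    """True if every search key occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return all(key.lower() in lowered for key in keys)


def search(documents, keys, min_hits=1):
    """Search at segment level, enlarging to document level if needed.

    documents: dict mapping a document id to its list of segment strings.
    Returns (unit, hits), where unit is "segment" or "document".
    """
    segment_hits = [
        (doc_id, index)
        for doc_id, segments in documents.items()
        for index, segment in enumerate(segments)
        if contains_all(segment, keys)
    ]
    if len(segment_hits) >= min_hits:
        return "segment", segment_hits

    # Too few segment-level results: enlarge the unit to the whole document.
    document_hits = [
        (doc_id, None)
        for doc_id, segments in documents.items()
        if contains_all(" ".join(segments), keys)
    ]
    return "document", document_hits
```

For example, a query with two keys that never occur in the same segment yields no segment-level result, so the sketch returns the documents in which both keys appear somewhere.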
Still another object of the present invention is to provide a document search apparatus and a method therefor, allowing the intended result of search to be obtained easily.
The above-mentioned objects can be attained, according to the present invention, by an index preparation apparatus for preparing, in a document, a search index of a searched document containing characters interpretable as a command by an apparatus for processing such document, the apparatus comprising searched document holding means for holding the searched document, document dividing means for extracting, from the searched document held by the searched document holding means, a first segment according to the characters interpretable as the command, cohesion processing means for uniting the first segments according to the correlation thereof to form a second segment, and index preparing means for preparing the search index for each of the second segments.
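The cohesion processing and index preparation described above may be sketched as follows. This is only an illustration under stated assumptions: the correlation between first segments is approximated here by the word overlap (Jaccard coefficient) of adjacent segments, the threshold value is arbitrary, and the index is a simple inverted index from words to second-segment numbers. None of these choices, nor the function names, are specified in the text.

```python
def words(text):
    """Set of lowercase whitespace-separated words in a segment."""
    return set(text.lower().split())


def similarity(a, b):
    """Jaccard word overlap, used here as a stand-in for 'correlation'."""
    wa, wb = words(a), words(b)
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def unite(first_segments, threshold=0.2):
    """Unite adjacent first segments that are sufficiently correlated."""
    second_segments = []
    for segment in first_segments:
        if second_segments and similarity(second_segments[-1], segment) >= threshold:
            second_segments[-1] = second_segments[-1] + " " + segment
        else:
            second_segments.append(segment)
    return second_segments


def build_index(second_segments):
    """Prepare an inverted index: word -> set of second-segment numbers."""
    index = {}
    for segment_id, segment in enumerate(second_segments):
        for word in words(segment):
            index.setdefault(word, set()).add(segment_id)
    return index
```

In this sketch, two adjacent first segments that share enough vocabulary are indexed as one second segment, so a search key is resolved to a coherent portion of the document rather than to a fragment.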