1. Field of the Invention
The present invention relates to a structure retrieval apparatus for data in which tags, some of which may be omitted, are inserted to delimit portions of the data and thereby express a structure, the apparatus searching the structure of the data at high speed. For example, the invention is applicable to an apparatus for searching the structure of a structured document in which tags are inserted in a text to divide it into document elements.
2. Description of the Related Art
Conventionally, in document processing apparatuses such as the document editing apparatuses (word processors) of workstations, attempts have been made to prepare documents efficiently by structuring and editing them: a plurality of document parts, such as headers and paragraphs, are prepared in advance, and relationships among the respective document parts are determined.
As examples of structured documents, which incorporate the concept of a structure with respect to a document, structured documents conforming to the international standards ODA (ISO 8613: Open Document Architecture) and SGML (ISO 8879: Standard Generalized Markup Language) are known. For an example of a document processing method using a structured document conforming to the ODA standard, reference is made to Japanese Unexamined Patent Publication No. Hei. 5-135054 entitled "Document Processing Method."
Structured documents conforming to SGML, which have high affinity with conventional text processing systems, have found widespread use, principally in the United States, and have already entered the stage of practical use. This is because a conventional text processing system is sufficiently capable of handling such documents. In the SGML technique, the document text is partially classified (e.g., divided into document parts) by inserting marks called tags into the document text, the document is structured by defining relationships among the divisions, and a tree-structured document structure is thereby represented.
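By way of illustration only, the manner in which inserted tags yield a tree-structured document structure may be sketched in Python as follows. The sketch assumes that every start tag has an explicit end tag (no tags are omitted) and that tags carry no attributes; the function name parse_tagged_text, the root name "#document", and the element names in the usage note are hypothetical and are not taken from any standard.

```python
import re

def parse_tagged_text(text):
    """Build a nested (name, children) tree from text with explicit start and end tags."""
    root = ("#document", [])
    stack = [root]  # path from the root to the element currently open
    # Split the input into tags and the character data between them.
    for token in re.findall(r"</?[A-Za-z][\w.-]*>|[^<]+", text):
        if token.startswith("</"):
            name = token[2:-1]
            # An end tag must close the most recently opened element.
            if len(stack) == 1 or stack[-1][0] != name:
                raise ValueError(f"unexpected end tag: {token}")
            stack.pop()
        elif token.startswith("<"):
            # A start tag opens a new division nested in the current one.
            node = (token[1:-1], [])
            stack[-1][1].append(node)
            stack.append(node)
        elif token.strip():
            # Character data belongs to the element currently open.
            stack[-1][1].append(token.strip())
    if len(stack) != 1:
        raise ValueError(f"missing end tag for: {stack[-1][0]}")
    return root
```

For instance, calling parse_tagged_text on a text whose title and paragraphs are enclosed in hypothetical "doc", "title", and "para" tags returns a root node whose single child "doc" holds one "title" node and one node per paragraph, in document order.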
Next, by citing a structured document conforming to SGML as an example, a description will be given of an example of processing a structured document provided with marks. In a structured document conforming to SGML, a pattern of a document structure is provided in advance, and the structure of the document is constrained within the range of the provided pattern. Such a pattern of the document structure is called a document type definition (DTD) in SGML.
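As a rough illustration of how a pattern provided in advance constrains the structure of a document, the following Python sketch checks a document tree against a simplified pattern. It is not SGML document type definition syntax: the table PATTERN, the function conforms, and the element names are hypothetical, and each element is assumed to be represented as a (name, children) pair whose children are either nested elements or character strings.

```python
import re

# Hypothetical pattern: each element name maps to a regular expression
# over the space-separated names of its child elements.
PATTERN = {
    "doc": r"title( para)+",  # one title followed by one or more paragraphs
    "title": r"",             # no child elements, character data only
    "para": r"",              # no child elements, character data only
}

def conforms(node, pattern=PATTERN):
    """Return True if the element and all of its descendants fit the pattern."""
    name, children = node
    if name not in pattern:
        return False  # element type not provided for in the pattern
    # Only nested elements are constrained; character data is ignored here.
    child_elements = [c for c in children if isinstance(c, tuple)]
    sequence = " ".join(c[0] for c in child_elements)
    if re.fullmatch(pattern[name], sequence) is None:
        return False
    return all(conforms(c, pattern) for c in child_elements)
```

Under this pattern, a "doc" element holding a "title" followed by two "para" elements conforms, whereas a "doc" element in which a "para" precedes the "title", or in which no "para" appears, does not.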
In a structured document conforming to SGML, a document type definition is first set forth to regulate the structure of the document. Next, to represent the structure, marks called tags are inserted in the document text, and the document text is partially classified by the tags. For example, one paragraph in a document is represented as shown below by using a tag <para> having a name "para."