In recent years, use of a structured-document format called XML (Extensible Markup Language 1.0, W3C Recommendation, 10 Feb. 1998) has been spreading rapidly in the fields of EC (Electronic Commerce) and B2B (Business to Business). XML is a data format that expresses a hierarchical document structure by means of tags. With XML, annotation information such as the logical structure of a document and the types of stored data can be embedded in the document, separately from the body of data. A document of this kind can easily be converted by a computer into a processible format, and various related information can be managed together. Unified management thus becomes easier.
From the viewpoint of individual application programs, however, there are drawbacks. Even an application program that requires only specific types of data has to exclude the unnecessary data while processing a document, which demands extra time and cost.
In general, an application program that handles an XML document (an XML application program) converts the XML document into a data model such as DOM (Document Object Model Level 1, W3C Recommendation, 1 Oct. 1998) so that the respective elements of the document can be accessed randomly, and then processes the document. DOM is a standard object-oriented data model with a general-purpose interface designed on the assumption that it will be accessed from various XML application programs. If an XML document is expanded in main memory using a data model like DOM, the computational resources (computation time and required memory) necessary for converting the text-format XML document into the data model increase as the document size increases. This has been a problem.
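By way of illustration only, the growth described above can be observed with a minimal Python sketch that uses the standard-library `xml.dom.minidom` parser. The catalog layout and the helper names `build_catalog`, `count_nodes`, and `dom_node_count` are hypothetical and are not taken from any cited document. The sketch counts the DOM nodes that must be held in main memory once the whole document has been parsed.

```python
from xml.dom import minidom


def build_catalog(n_items):
    """Build a hypothetical XML catalog containing n_items <item> elements."""
    items = "".join(
        f"<item><title>Title {i}</title><body>Body {i}</body></item>"
        for i in range(n_items)
    )
    return f"<catalog>{items}</catalog>"


def count_nodes(node):
    """Count every DOM node in the subtree rooted at node (attributes excluded)."""
    return 1 + sum(count_nodes(child) for child in node.childNodes)


def dom_node_count(xml_text):
    """Parse the whole document into a DOM tree and count its nodes."""
    doc = minidom.parseString(xml_text)
    return count_nodes(doc)
```

For this layout each additional item adds five nodes (the item element, two child elements, and two text nodes), so the number of objects built in memory grows linearly with the document even when an application later reads only one of them.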
When a large number of structured documents such as XML documents are handled, a database is used to store them. Well-known systems for storing documents in a database include a system that maps documents onto an RDB (Relational Database) and a system that handles the elements of a structured document as objects and stores them in an ODB (Object Database). When a structured-document application program processes a document in these systems, however, the original structured document is regenerated from the information stored in the database and then delivered to the structured-document application program. The documents to be handled are therefore large, and the more complicated the documents are, the higher the costs of retrieving and delivering them.
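The regeneration step described above can be sketched as follows, purely as an assumption-laden example. It uses Python's standard `sqlite3` and `xml.etree.ElementTree` modules. The single-table schema `node(id, parent, tag, text)` and the functions `store` and `regenerate` are hypothetical, and attributes and tail text are ignored for brevity. The sketch is not the mapping used by any particular RDB-based system.

```python
import sqlite3
import xml.etree.ElementTree as ET


def store(conn, root):
    """Map each element of the document onto one row of a hypothetical table."""
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT, text TEXT)"
    )

    def insert(elem, parent_id):
        cur = conn.execute(
            "INSERT INTO node (parent, tag, text) VALUES (?, ?, ?)",
            (parent_id, elem.tag, elem.text),
        )
        node_id = cur.lastrowid
        for child in elem:
            insert(child, node_id)

    insert(root, None)


def regenerate(conn):
    """Rebuild the whole document from the rows, one query per element."""

    def build(node_id, tag, text):
        elem = ET.Element(tag)
        elem.text = text
        rows = conn.execute(
            "SELECT id, tag, text FROM node WHERE parent = ? ORDER BY id",
            (node_id,),
        ).fetchall()
        for row in rows:
            elem.append(build(*row))
        return elem

    root_row = conn.execute(
        "SELECT id, tag, text FROM node WHERE parent IS NULL"
    ).fetchone()
    return build(*root_row)
```

In this sketch the regeneration issues one query per element, which suggests why the cost of retrieving a document rises as its structure becomes more complicated.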
Further, in some cases the processing in the structured-document application program requires only a part of a document. For example, an ordinary search system needs to retrieve only the titles of the documents that satisfy the search conditions and to output those titles in a result list. The entire document is not required unless the user requests access to it.
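As a minimal sketch of the search-list case above, the following Python example collects only the titles of the documents whose body contains a keyword. The `<documents>`, `<document>`, `<title>`, and `<body>` element names and the function `matching_titles` are assumptions made for illustration. The example uses the standard-library `xml.etree.ElementTree.iterparse` and discards each document after it has been examined.

```python
import io
import xml.etree.ElementTree as ET


def matching_titles(xml_bytes, keyword):
    """Return the titles of the <document> elements whose body contains keyword."""
    titles = []
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == "document":
            body = elem.findtext("body", default="")
            if keyword in body:
                titles.append(elem.findtext("title", default=""))
            # The body is no longer needed once the title has been recorded.
            elem.clear()
    return titles
```

Only the list of titles is kept for the search list. The bodies are read once to test the search condition and are then released.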
To solve these problems, a conventional technique delivers compressed structured documents. For example, “Proposals for XML data compression method based on element name compression: Simplified Element XML” (by Shouhei Yokoyama, Manabu Ohta, and Hiroshi Ishikawa, in “IPSJDBS/ACM SIGMOD Japan Chapter/JSPS-RFTF AMCP Joint Symposium concerning Database and Web Information System”, pp. 331-337, December 2000) discloses a technique of reducing delivery costs for XML documents by compressing tag information. Japanese Patent Laid-Open No. 2001-236261 discloses a technique for reducing delivery costs for XML documents in which a means for extracting partial elements of XML documents is provided on the data server side, only those partial elements requested by a structured-document application program are delivered, and a cache mechanism for partial elements is provided on the client side, where the structured-document application program operates.
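The general idea of compressing tag information can be sketched as below. This is not the method of the cited paper, whose details are not reproduced here. It is a simplified illustration, under the assumption that each distinct element name is replaced by a short code (`e0`, `e1`, and so on) and that the code table is delivered with the document. The functions `compress_tags` and `restore_tags` are hypothetical, and attributes, namespaces, and collisions between codes and existing element names are not handled.

```python
import xml.etree.ElementTree as ET


def compress_tags(xml_text):
    """Replace each distinct element name with a short code; return text and table."""
    root = ET.fromstring(xml_text)
    codes = {}
    for elem in root.iter():
        if elem.tag not in codes:
            codes[elem.tag] = f"e{len(codes)}"
        elem.tag = codes[elem.tag]
    return ET.tostring(root, encoding="unicode"), codes


def restore_tags(compressed_text, codes):
    """Restore the original element names on the receiving side."""
    names = {code: name for name, code in codes.items()}
    root = ET.fromstring(compressed_text)
    for elem in root.iter():
        elem.tag = names[elem.tag]
    return ET.tostring(root, encoding="unicode")
```

The receiver still has to parse and expand the whole document after restoring the names, so a sketch of this kind shortens the delivered text without reducing the work done by the application program.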
However, merely delivering compressed structured documents reduces the delivery costs but cannot save the computational resources used by a structured-document application program. According to Japanese Patent Laid-Open No. 2001-236261, partial elements are extracted and delivered without distinguishing structure information from contents. Therefore, contents are extracted even with respect to those parts that are referred to by a structured-document application program. As a result, the delivery costs cannot be reduced sufficiently. Moreover, these methods do not take into consideration the restrictions on document structure expressed by the document type definition (DTD) of XML documents. A document of a type that cannot be processed by a structured-document application program may therefore be delivered. For example, a structured-document application program that verifies the document structure against a DTD may fail to verify a delivered document and be unable to continue subsequent processing.
The present invention has been made to solve these problems. Its object is to provide a document delivery device, a document delivery method, a document delivery program, and a document delivery system that are capable of reducing the costs of delivery to structured-document application programs by extracting and delivering only the information necessary for those programs, and capable of saving the computational resources used by the structured-document application programs, as well as a document receiving device that receives the delivered structured documents.