Techniques for making effective use of digital structured document information play a very important role in a wide range of information exchange and distribution, including over the Internet. For example, such techniques, represented by XML (Extensible Markup Language), have been developed for Web-based information environments and are standardized as structured languages.
However, no systematic language processing technique is available for automatically analyzing the structures of structured documents and transforming them into other structured documents. Conventionally, in order to extract required pieces of information from input structured documents and to combine and output them as a structured document having another structure, the structure of the input structured documents and that of the structured document to be output must first be recognized. An XSLT (XSL Transformations) stylesheet is then created, or a program is written, to extract the information from the input structured documents and to output it as a structured document with the new structure.
As prior art of this kind, for example, the techniques described in the following two patent references are known.
Japanese Patent Laid-Open No. 2004-30582
Japanese Patent Laid-Open No. 2004-38334
However, if the structure of the input structured document and that of the structured document to be output are not known in advance, an XSLT stylesheet or a program that takes the input and output structures into account cannot be created. It is therefore difficult to extract required information from the input structured document and to output it as a structured document with a new structure.
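The conventional approach and its limitation can be illustrated with a minimal sketch. It is not taken from the cited references. It assumes Python's standard xml.etree.ElementTree module, and the element names (catalog, book, title, titles, item, library, entry) and the function name transform are hypothetical, chosen only for illustration. The transformation works only because both the input paths and the output structure are written into the program in advance.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document; element names are illustrative assumptions.
INPUT_XML = """<catalog>
  <book><title>XML Basics</title><author>Sato</author></book>
  <book><title>Web Data</title><author>Suzuki</author></book>
</catalog>"""

# Hypothetical document holding similar information in a different structure.
OTHER_XML = "<library><entry name='XML Basics'/></library>"


def transform(xml_text: str) -> str:
    """Hand-written transformation that hard-codes both structures."""
    source = ET.fromstring(xml_text)
    target = ET.Element("titles")        # output structure fixed in advance
    for book in source.findall("book"):  # input path fixed in advance
        item = ET.SubElement(target, "item")
        item.text = book.findtext("title")
    return ET.tostring(target, encoding="unicode")


# Known input structure: the titles are extracted and re-wrapped.
print(transform(INPUT_XML))

# Unknown input structure: the hard-coded path "book" matches nothing,
# so the output element contains no items.
print(transform(OTHER_XML))
```

In this sketch, the second call yields an output element with no children, because the program has no way of discovering where the required information sits in a document whose structure was not anticipated when the program was written.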