This disclosure relates to a device, method, and program for processing a structured document (data) having a tree structure, such as an XML document, and extracting a desired portion thereof.
Recently, structured documents (data) written in a structured language such as XML have come into wide use in a variety of applications. Information systems that use this type of data include systems that receive a search request from outside, and extract and return the portion of the data that satisfies the search conditions. In such a system, the search result is sometimes also required to include the part of the original data's structure that corresponds to the extracted data portion.
For example, when a keyword search is performed on an XML document with a tree structure, the request may call for extraction of a tree structure (subtree) that includes the text portions matching the keywords, in addition to the text portions themselves. Partial data (subdata) extracted from the original data together with its structure information is hereinafter referred to as a “subdocument”. A technique for extracting a desired subdocument from an XML document has been proposed.
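As a rough illustration of what a subdocument might look like, the following minimal Python sketch prunes an XML tree down to the elements whose text contains a keyword, together with their ancestors. It is not the technique of this disclosure or of any cited prior art; the function name `extract_subdocument`, the sample document, and the substring-match rule are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def extract_subdocument(elem, keywords):
    """Return a pruned copy of `elem`, or None if nothing under it matches.

    Hypothetical helper: an element is kept when its own text contains
    any keyword, or when at least one of its descendants is kept.
    """
    text = elem.text or ""
    own_match = any(keyword in text for keyword in keywords)

    # Recurse first so that ancestors of a match are retained.
    kept_children = []
    for child in elem:
        pruned = extract_subdocument(child, keywords)
        if pruned is not None:
            kept_children.append(pruned)

    if not own_match and not kept_children:
        return None

    # Copy the tag and attributes; copy the text only if it matched.
    copy = ET.Element(elem.tag, dict(elem.attrib))
    if own_match:
        copy.text = text
    copy.extend(kept_children)
    return copy


# Assumed sample data, not taken from the disclosure.
SAMPLE = (
    "<library>"
    "<book><title>XML Basics</title><author>Sato</author></book>"
    "<book><title>Cooking</title><author>Tanaka</author></book>"
    "</library>"
)

root = ET.fromstring(SAMPLE)
subdocument = extract_subdocument(root, ["XML"])
result = ET.tostring(subdocument, encoding="unicode")
print(result)
```

In this sketch the result keeps the matching title text and the `library` and `book` elements above it, while sibling elements that do not match are dropped. That is one possible reading of "text portions plus the structure that contains them"; a real system could define the retained structure differently.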