XML (Extensible Markup Language) is widely used as a common data storage language. Because XML documents contain a large amount of redundant data, dedicated XML compression methods are usually used in practice to compress them. Generally, there are two common types of XML compression method:
The first type of compression method does not support querying. To query and obtain XML data from a document compressed with such a method, the document must first be decompressed in full, and only then can it be queried to obtain the results.
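The decompress-then-query workflow described above can be illustrated with a minimal sketch. It assumes gzip as a stand-in for a compressor that does not support querying, and the sample document, the function names and the query path are all hypothetical; it is not a dedicated XML compressor.

```python
import gzip
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for illustration.
XML_TEXT = (
    "<library>"
    "<book><title>A</title><year>2001</year></book>"
    "<book><title>B</title><year>2005</year></book>"
    "</library>"
)


def compress(xml_text):
    """Compress the XML text with a general-purpose compressor (gzip)."""
    return gzip.compress(xml_text.encode("utf-8"))


def query_titles(blob):
    """Return all book titles; the whole document is decompressed first."""
    xml_text = gzip.decompress(blob).decode("utf-8")  # full decompression
    root = ET.fromstring(xml_text)                    # full parse
    return [title.text for title in root.findall("./book/title")]


blob = compress(XML_TEXT)
titles = query_titles(blob)
print(titles)
```

Even though only the titles are wanted, the cost of decompressing and parsing the entire document is paid on every query, which is the drawback of this type of method.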
The second type of compression method supports querying. Such a method allows XML data to be queried and obtained directly from a compressed XML document, using query languages such as XPath and XQuery. These query languages retrieve information from an XML document based on a path, which is the sequence of all nodes visited in going from a given node to a target node in the XML document. The path may be denoted by a string that encodes the visited nodes together with some special symbols. XPath does not operate on the textual representation of XML data, but on an underlying abstract logical tree structure. The name XPath derives from its use of a path notation, similar to that of a Uniform Resource Identifier (URI), to navigate the hierarchical structure of the XML data and locate nodes within it. To serve its main purpose of locating a specific data segment in an XML document, the XPath specification also provides basic functions for processing strings, numerical values and Boolean values.
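As a minimal sketch of path-based retrieval, the following example runs a path expression against an uncompressed document using Python's xml.etree.ElementTree, which implements only a subset of XPath and does not provide the XPath function library. The sample document and the helper node_path are hypothetical and serve only to show how a path is a sequence of nodes that can be written as a string.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for illustration.
XML_TEXT = """<library>
  <book id="b1"><title>A</title><year>2001</year></book>
  <book id="b2"><title>B</title><year>2005</year></book>
</library>"""

root = ET.fromstring(XML_TEXT)

# A path expression: from the root, visit each <book> whose <year> child
# has the text "2005", then select its <title> child.
matches = root.findall("./book[year='2005']/title")
match_texts = [m.text for m in matches]


def node_path(root, target):
    """Return the tags of the nodes from the root down to target as a string."""
    # ElementTree elements hold no parent reference, so build a lookup table.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = [target.tag]
    node = target
    while node in parent_of:
        node = parent_of[node]
        steps.append(node.tag)
    return "/" + "/".join(reversed(steps))


path_string = node_path(root, matches[0])
print(match_texts, path_string)
```

The query operates on the parsed tree rather than on the raw text, which mirrors the point that XPath works over a logical structure tree and not over the textual representation.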
The conventional art cannot effectively compress and decompress an XML document that has a corresponding Schema while also supporting query operations on the document in its compressed state.