1. Field of the Invention
The present invention relates to a data compression apparatus, a data decompression apparatus, and a method for compressing data.
2. Description of the Related Art
XML (Extensible Markup Language), defined by the World Wide Web Consortium (W3C), a standardization organization, is widely used as a language for describing data exchanged over the Internet. In XML, data is represented as a structured collection of parts, such as elements and attributes. In addition, an XML schema is used as definition information about the elements and attributes used in XML data. Examples of languages for describing an XML schema include “XML Schema” defined by the W3C and “RELAX NG” defined by the International Organization for Standardization (ISO).
Data written in XML has a text format in which elements and attributes are written with character strings called “tags”. A technique has been proposed for reducing the size of XML data using a technology called “binary XML” (refer to, for example, Japanese Patent Laid-Open No. 2005-215951). In binary XML technology, a character string included in structured data, such as an element name or an attribute name, is compressed by replacing the character string with a predetermined code using a conversion table. By using such a code, the information size of the character string can be reduced.
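By way of illustration only, the replacement of element names and attribute names with codes can be sketched as follows. This is a minimal sketch under stated assumptions and does not reproduce the format of the cited publication or of any particular binary XML standard: the function names `build_table` and `encode`, the token layout, and the sample vocabulary are all hypothetical.

```python
import xml.etree.ElementTree as ET


def build_table(names):
    """Assign an integer code to each distinct name (hypothetical scheme)."""
    return {name: code for code, name in enumerate(sorted(set(names)))}


def encode(xml_text, table):
    """Replace element and attribute names with codes from the table.

    Returns a flat list of tokens; a real encoder would serialize these
    tokens to bytes, which is omitted here.
    """
    tokens = []

    def walk(elem):
        tokens.append(("E", table[elem.tag]))          # element start
        for key, value in elem.attrib.items():
            tokens.append(("A", table[key], value))    # attribute
        if elem.text and elem.text.strip():
            tokens.append(("T", elem.text.strip()))    # character data
        for child in elem:
            walk(child)
        tokens.append(("END",))                        # element end

    walk(ET.fromstring(xml_text))
    return tokens


# Hypothetical vocabulary and document, for illustration only.
table = build_table(["settings", "version", "name"])
tokens = encode('<settings version="1"><name>printer</name></settings>', table)
```

In this sketch each name is stored once in the table and appears in the token list only as a small integer, which is the source of the size reduction described above.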
However, before data can be compressed, a conversion table for indexing element names and attribute names needs to be generated. In addition, such a conversion table needs to be generated for each type of structured data or for each language defined by an XML schema, such as SVG. That is, in order to compress a plurality of different types of structured data, a conversion table is necessary for each type of structured data or for each language used for describing the structured data.
Accordingly, when XML data describing, for example, device setting information is acquired from a plurality of devices located in a network using, for example, a web service and is stored in a single apparatus, system resources, such as memory capacity and recording medium capacity, may be wasted. That is, even when the schemas for a plurality of versions of the device setting information are almost the same, redundant conversion tables including the same vocabulary items need to be generated for the different versions.
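The redundancy described above can be illustrated with a small sketch. The vocabularies for the two schema versions, the function names `build_table` and `duplicated_entries`, and the way redundancy is counted are assumptions made purely for illustration.

```python
def build_table(names):
    """Assign an integer code to each distinct name (hypothetical scheme)."""
    return {name: code for code, name in enumerate(sorted(set(names)))}


def duplicated_entries(tables):
    """Count table entries that repeat a name already held in another table."""
    total = sum(len(t) for t in tables.values())
    unique = len(set().union(*tables.values()))  # union over the name keys
    return total - unique


# Hypothetical vocabularies for two nearly identical schema versions.
v1_names = ["settings", "name", "address", "port"]
v2_names = ["settings", "name", "address", "port", "timeout"]

# One conversion table per version, as required by the approach above.
tables = {"v1": build_table(v1_names), "v2": build_table(v2_names)}
redundant = duplicated_entries(tables)
```

Under these assumed vocabularies, the two tables hold nine entries in total for only five distinct names, so four entries are stored redundantly.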