1. Field of the Invention
The present invention relates to an encoding technique for a structured document.
2. Description of the Related Art
Conventionally, under the XML specification formulated by the W3C, data described in XML is commonly encoded by a character encoding method such as UTF-8 or UTF-16. When the data to be described as attribute values or element contents are not characters but numerical values such as integers or decimals, encoding them as characters makes the size larger than that of the original data, thus requiring a longer parsing process time.
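The size increase described above can be illustrated with a minimal sketch. The helper names `text_size` and `int32_size` and the sample value are assumptions introduced only for illustration; they do not come from any XML specification.

```python
import struct

def text_size(value, encoding="utf-8"):
    """Bytes needed when the number is written out as characters."""
    return len(str(value).encode(encoding))

def int32_size(value):
    """Bytes needed when the number is stored as a 32-bit signed integer."""
    return len(struct.pack(">i", value))

# Hypothetical sample value; any 10-digit integer that fits in 32 bits behaves the same.
sample = 1234567890
print("UTF-8 :", text_size(sample), "bytes")
print("UTF-16:", text_size(sample, "utf-16-be"), "bytes")
print("int32 :", int32_size(sample), "bytes")
```

In this sketch, the character form takes one byte per digit in UTF-8 and two bytes per digit in UTF-16, whereas the fixed-width integer form takes 4 bytes regardless of the digit count.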
With binary XML techniques such as the Fast Infoset (ISO/IEC 24824-1) formulated by the ISO, attribute values and element contents in XML data can be encoded not only as characters but also in their original data types, such as integers and decimals. As a result, the data size can be reduced, and the parsing process time can be shortened.
Note that Japanese Patent Laid-Open No. 2005-215950 discloses a technique that attains compression by replacing character strings that appear repeatedly in XML data, such as element names, attribute names, and values, with shorter byte strings.
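The general idea of replacing repeated strings with shorter byte strings can be sketched as follows. This is a generic illustration only and is not the method of the cited publication. The functions `build_table`, `encode`, and `decode`, and the one-byte marker layout, are all assumptions chosen for brevity.

```python
def build_table(tokens, min_count=2):
    """Give a small index to every string that occurs at least min_count times."""
    counts = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    table = {}
    for token in tokens:
        if counts[token] >= min_count and token not in table:
            table[token] = len(table)
    assert len(table) <= 128, "this sketch supports at most 128 table entries"
    return table

def encode(tokens, table):
    """Repeated strings become one byte (high bit set); others are length-prefixed."""
    out = bytearray()
    for token in tokens:
        if token in table:
            out.append(0x80 | table[token])
        else:
            raw = token.encode("utf-8")
            assert len(raw) < 128, "this sketch supports literals under 128 bytes"
            out.append(len(raw))
            out += raw
    return bytes(out)

def decode(data, table):
    """Inverse of encode, given the same table."""
    names = {index: token for token, index in table.items()}
    tokens, pos = [], 0
    while pos < len(data):
        marker = data[pos]
        pos += 1
        if marker & 0x80:
            tokens.append(names[marker & 0x7F])
        else:
            tokens.append(data[pos:pos + marker].decode("utf-8"))
            pos += marker
    return tokens

# Hypothetical sequence of element and attribute names.
names = ["path", "d", "path", "d", "path", "fill"]
table = build_table(names)
packed = encode(names, table)
print(len("".join(names)), "->", len(packed), "bytes")
```

Such a substitution helps for names and values that recur, but it gives no benefit for values that are mostly distinct, since each distinct value is still stored in full.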
However, the conventional techniques often cannot obtain a compression effect for data that include coordinate values, such as data for graphics, maps, and drawings.
The conventional binary XML technique encodes a decimal value in the IEEE754 format (IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Std 754-1985), which is the data format used on a computer, so as to shorten the parsing process time. When decimals are encoded in the IEEE754 format, no compression effect can be obtained for decimals having small digit counts, such as −0.1 and 0.2, because the encoded size is at least 4 bytes. Hence, in the case of a structured document that describes many decimals having small digit counts, such as an SVG document, the parsing process time can be shortened, but a high compression effect cannot be obtained.
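The size comparison above can be checked with a short sketch. It assumes that the decimals are written with an ASCII hyphen-minus for the sign and that the single-precision (4-byte) IEEE754 format is used. The helper names `utf8_size` and `float32_size` are hypothetical.

```python
import struct

def utf8_size(literal):
    """Bytes needed when the decimal is kept as UTF-8 characters."""
    return len(literal.encode("utf-8"))

def float32_size(literal):
    """Bytes needed when the decimal is stored as an IEEE754 single-precision value."""
    return len(struct.pack(">f", float(literal)))

# The two short decimals named in the text, plus a longer one for contrast.
for literal in ("-0.1", "0.2", "3.14159265"):
    print(literal, "text:", utf8_size(literal), "bytes,",
          "IEEE754:", float32_size(literal), "bytes")
```

For "-0.1" the character form and the IEEE754 form are both 4 bytes, and for "0.2" the character form is actually one byte shorter, so the binary form yields no size reduction for such short decimals.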