1. Field of the Invention
The present invention relates to a technique for processing a structured document.
2. Description of the Related Art
In recent years, structured documents have increasingly been handled on compact devices such as mobile phones and digital cameras. The data sizes of the structured documents to be handled are also increasing, and demand has arisen for high-speed processing of structured documents in environments in which memory and storage capacities are limited or CPU processing speeds are low.
An XML structured document can have a tree data structure, which is made hierarchical by embedding tags in the document. However, as the hierarchy becomes deeper, a larger memory capacity is required to store it, and data access becomes slower. As a result, high-speed processing of structured documents is hindered. Hence, a method has been proposed that relates the element name of a child element to that of its parent element and replaces the parent element with a new element, so as to shorten the hierarchy of the tree structure (Japanese Patent Laid-Open No. 2002-297569).
However, with the above method, the size-reduction and speed-up effects are insufficient for processing on devices with few resources. When data is encoded in XML, data described as an attribute value or element content has to be encoded as characters even when it is an integer or decimal number. Such data then requires a larger data size than when it is encoded in binary form, resulting in a longer decode time.
In contrast, a technique embodied in binary XML, represented by the Fast Infoset specification (ISO/IEC 24824-1) standardized by the ISO, is available. Since binary XML can encode an attribute value or element content in a binary format suited to its original data type, such as an integer or decimal number, the data size can be reduced and the decode processing can be sped up.
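The size difference between character encoding and typed binary encoding can be illustrated with a minimal sketch. It assumes fixed-width 32-bit little-endian integers for the binary form; this layout is chosen only for illustration and is not the encoding defined by Fast Infoset. The function names `text_size`, `binary_encode`, and `binary_decode` are hypothetical.

```python
import struct

def text_size(values):
    # Bytes needed when each integer is written as decimal characters,
    # separated by single spaces, as it would appear in text XML.
    return len(" ".join(str(v) for v in values).encode("utf-8"))

def binary_encode(values):
    # Assumption: every value fits in a signed 32-bit integer.
    return struct.pack("<%di" % len(values), *values)

def binary_decode(data):
    # Each value occupies exactly 4 bytes, so no character parsing is needed.
    count = len(data) // 4
    return list(struct.unpack("<%di" % count, data))

values = [1000000, -2147483648, 65535, 123456789]
encoded = binary_encode(values)
print(text_size(values), len(encoded))
```

For these sample values the character form is more than twice as large as the binary form, and decoding the binary form involves no digit-by-digit conversion.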
However, when an attribute value or element content is described as a value of a complicated data structure, it is difficult to recognize the data structure in a general way and to encode it. Hence, such an attribute value or element content has to be encoded as a character string, as in text XML. For example, SVG data, which is a vector graphics format, can assume complicated attribute values, such as a combination of drawing commands and coordinate information. When most of the document data is occupied by such values, binary XML provides almost no reduction in data size and almost no speedup of analysis processing. Further, when the document data includes instructions to alter such attribute values according to elapsed time, the amount of data to be handled as unmodified character strings increases, further reducing the efficiency of binary XML.
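The problem with such attribute values can be made concrete with a small sketch. A generic encoder that does not understand the value has to store an SVG path string byte for byte, whereas an encoder that knows the structure could store a one-byte command code followed by binary coordinates. The opcode table, the 32-bit float layout, and the function `encode_path` below are hypothetical and chosen only for illustration; they are not taken from any specification or from the method described in this document. The sketch handles only the absolute commands M, L, and Z, each written explicitly.

```python
import re
import struct

# Hypothetical one-byte opcodes and argument counts (illustration only).
OPCODES = {"M": 0, "L": 1, "Z": 2}
ARG_COUNT = {"M": 2, "L": 2, "Z": 0}

def encode_opaque(d):
    # What a structure-unaware encoder must do: keep the characters as they are.
    return d.encode("utf-8")

def encode_path(d):
    # Split the path data into command letters and decimal numbers.
    tokens = re.findall(r"[MLZ]|-?\d+(?:\.\d+)?", d)
    out = bytearray()
    i = 0
    while i < len(tokens):
        cmd = tokens[i]
        n = ARG_COUNT[cmd]
        args = [float(t) for t in tokens[i + 1 : i + 1 + n]]
        # One opcode byte followed by n little-endian 32-bit floats.
        out += struct.pack("<B" + "f" * n, OPCODES[cmd], *args)
        i += 1 + n
    return bytes(out)

d = "M 10.5 20.25 L 300.125 400.75 L 50.5 60.25 Z"
print(len(encode_opaque(d)), len(encode_path(d)))
```

The structured form is smaller here only because the encoder was written with knowledge of path syntax, which is exactly what a general-purpose binary XML encoder lacks.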
Thus, when a structured document includes many attributes that assume values of complicated data structures, binary XML encoding cannot sufficiently reduce the data size or speed up analysis processing.