The eXtensible Markup Language (XML) has become one of the most popular formats for exchanging information between applications. XML content is self-descriptive (i.e., it contains tags along with data), but the standard XML serialization format is text-based, so even numbers and dates are represented as character strings. As a result, XML documents are significantly larger than documents that capture the same data in proprietary formats. The increased size of XML documents causes overhead costs during transmission, due to limited network bandwidth, as well as slower storage and retrieval operations, due to limited disk I/O bandwidth.
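As a rough illustration of the size overhead described above, the following sketch serializes one small record both as XML text and in a compact binary layout. The record, its element names, and the binary layout are hypothetical and chosen only for illustration; they do not correspond to any particular proprietary format.

```python
import struct

# Hypothetical record: an integer id, a floating-point price, and a date.
record_id = 1234567
price = 19.99
year, month, day = 2006, 7, 14

# Text-based XML serialization: tags plus values, all stored as characters.
xml_text = (
    "<order><id>{}</id><price>{}</price>"
    "<date>{:04d}-{:02d}-{:02d}</date></order>"
).format(record_id, price, year, month, day)
xml_size = len(xml_text.encode("utf-8"))

# Assumed binary layout (little-endian, no padding):
# 4-byte int, 8-byte double, 2-byte year, 1-byte month, 1-byte day.
binary_bytes = struct.pack("<idHBB", record_id, price, year, month, day)
binary_size = len(binary_bytes)

print("XML size in bytes:   ", xml_size)
print("Binary size in bytes:", binary_size)
```

Under these assumptions the binary form occupies 16 bytes, while the XML form is several times larger because it carries the tags and spells out each value as text.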
Processing XML data typically requires parsing the tags to access the values. A Document Object Model (DOM) can be used for this purpose, but building a DOM tree typically requires a large amount of memory. Thus, the parsing step can be costly and can significantly degrade application performance.
Further, the values may need to be converted from the textual representation to their native datatype (e.g., integer, float or date) before the values can be processed by the application. Associated type conversion costs also degrade overall application performance.
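The two costs described above, parsing the tags and then converting textual values to native datatypes, can be sketched as follows. This is a minimal example that assumes a hypothetical `order` document and uses Python's standard `xml.dom.minidom` module as a stand-in for a DOM implementation; the helper `text_of` is introduced here only for illustration.

```python
from datetime import date
from xml.dom import minidom

# Hypothetical document; element names are illustrative only.
xml_text = (
    "<order><id>1234567</id><price>19.99</price>"
    "<date>2006-07-14</date></order>"
)

# Step 1: parse the tags. A DOM parser builds the whole tree in memory.
doc = minidom.parseString(xml_text)

def text_of(tag):
    """Return the text content of the first element with the given tag."""
    return doc.getElementsByTagName(tag)[0].firstChild.data

# Step 2: every value arrives as a string and must be converted
# to its native datatype before the application can use it.
order_id = int(text_of("id"))
price = float(text_of("price"))
order_date = date.fromisoformat(text_of("date"))

print(order_id, price, order_date)
```

Both steps are repeated for every document and every value, which is why their cost can become noticeable for large or frequently processed XML data.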
Approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.