Conventional compression and decompression algorithms are well known and widely used in many types of data transmission and reception. One discussion of conventional compression and decompression techniques is the publication RFC 1951, which describes a lossless compressed data format (the DEFLATE format).
Compression and decompression algorithms are commonly employed for point-to-point, or one-to-one, system interconnections, as well as for broadcast transmissions, or one-to-many interconnections. Compression algorithms reduce the representation of the information, but do not reduce the actual information. The result is a lower bandwidth requirement, that is, a smaller number of overall bits within a data stream. Compression and decompression of a data stream are necessary when the transfer channel architecture does not directly correspond to the data to be transferred. Compression is also routinely used to accelerate the transfer of large amounts of data.
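The lossless property described above can be illustrated with a minimal sketch. It assumes Python's standard-library zlib module, configured to emit a raw DEFLATE stream of the kind RFC 1951 describes; the function name deflate_round_trip and the sample payload are hypothetical and chosen only for illustration.

```python
import zlib
from typing import Tuple


def deflate_round_trip(payload: bytes) -> Tuple[bytes, bytes]:
    """Compress payload as a raw DEFLATE stream, then restore it."""
    # A negative wbits value requests a raw DEFLATE stream with no
    # zlib or gzip header or trailer around it.
    compressor = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=-15)
    compressed = compressor.compress(payload) + compressor.flush()
    # Decompression must be told the stream is raw as well.
    restored = zlib.decompress(compressed, wbits=-15)
    return compressed, restored


# Hypothetical, highly repetitive sample payload.
sample = b"the same information, fewer bits. " * 200
compressed, restored = deflate_round_trip(sample)

# Fewer bits travel over the channel, yet nothing is lost.
print(len(sample), len(compressed), restored == sample)
```

The compressed stream is shorter than the input, and decompression restores the input byte for byte, which is the sense in which the representation shrinks while the information does not.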
XML is a trimmed-down version of SGML (Standard Generalized Markup Language). XML was designed specifically for web-based data and documentation applications. XML provides a generic base on which to build other languages describing various types of data, such as XHTML for web pages and MathML for formulae.
The popularity and rapidly growing application of XML have created problems with system architecture constraints relating to bandwidth and throughput capability. XML is a technology applicable to representing arbitrarily large data sets; however, as data sets grow, the computational expense of handling the data also grows. A medium to large XML-based spreadsheet file may often contain millions, and possibly hundreds of millions, of customizable tags. Almost all of these tags may appear repetitively in a given document, and the individual tags may themselves contain several kilobytes of information.
A further constraint of conventional XML documents becomes apparent when a user attempts to transfer the document between systems. The raw file size of such a document can be very large. Transferring such a large file therefore exposes throughput limits, for example in bandwidth, and results in very lengthy transfer times.
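The effect of repetitive tags on raw file size can be sketched as follows. This is an illustration only: the element names sheet, row, cell and value, the attribute names, and the dimensions are hypothetical and are not taken from any particular spreadsheet format. The sketch measures what share of the document is markup rather than cell data, and how far a conventional compressor (Python's standard-library zlib) reduces the size.

```python
import zlib
from xml.sax.saxutils import escape


def build_sheet(rows: int, cols: int) -> bytes:
    """Build a spreadsheet-like XML document with repetitive tags."""
    parts = ["<sheet>"]
    for r in range(rows):
        parts.append(f'<row index="{r}">')
        for c in range(cols):
            value = escape(str(r * cols + c))
            parts.append(f'<cell column="{c}"><value>{value}</value></cell>')
        parts.append("</row>")
    parts.append("</sheet>")
    return "".join(parts).encode("utf-8")


ROWS, COLS = 1000, 20  # hypothetical dimensions
doc = build_sheet(ROWS, COLS)

# Bytes that carry actual cell data, as opposed to tags and attributes.
payload_bytes = sum(len(str(n)) for n in range(ROWS * COLS))
markup_share = 1 - payload_bytes / len(doc)

compressed = zlib.compress(doc, 9)
print(f"raw size:        {len(doc)} bytes")
print(f"markup share:    {markup_share:.0%}")
print(f"compressed size: {len(compressed)} bytes")
```

In a document of this shape most of the raw bytes are tags rather than data, which is why the raw size grows so quickly and why the repetition compresses well.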
The common approach to parsing compressed XML document files first requires a decoder, or similarly tasked processor, to expand the compressed data. After the input file has been expanded or decompressed, the file must then undergo a second process, namely parsing. The parsing process is often of considerable complexity and further degrades throughput.
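The two-stage approach described above can be sketched as follows. This is a minimal illustration, assuming Python's standard-library gzip module (which wraps a DEFLATE stream) and xml.etree.ElementTree parser; the function name expand_then_parse and the sample document are hypothetical.

```python
import gzip
import xml.etree.ElementTree as ET


def expand_then_parse(compressed: bytes) -> ET.Element:
    """Conventional pipeline: decompress everything, then parse."""
    # Stage 1: expand the entire compressed input into memory.
    xml_bytes = gzip.decompress(compressed)
    # Stage 2: parse the fully expanded document in a separate pass.
    return ET.fromstring(xml_bytes)


# Hypothetical sample document.
document = b"<sheet><row><cell>1</cell><cell>2</cell></row></sheet>"
root = expand_then_parse(gzip.compress(document))
cells = [cell.text for cell in root.iter("cell")]
print(cells)  # ['1', '2']
```

Every byte of the document is handled twice, once by the decompressor and once by the parser, and the fully expanded document must be held in memory between the two stages.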