In computer science, compression is the practice of representing data in a more compact form by reducing the amount of redundant information that is included when data is stored or communicated. Lossless compression is a kind of compression in which the original data can be exactly recovered from the compressed data, that is, from the data in the reduced format. Lossless compression can be useful when even small changes to the data may significantly change the meaning of the data.
Various resource tradeoffs may arise in conjunction with compression. For example, significant gains in coding efficiency (that is, in the relative reduction in size of the data) may require inordinate expenditures of limited computational resources. More generally, the various advantages and costs of compression can be viewed in a context of overall resources and objectives that varies according to the circumstances.
Compression may be applied to various kinds of data. In particular, compression may be applied to structured documents which are normally human-legible, such as XML documents. XML provides a set of rules for the electronic encoding of documents, which are sometimes used in protocols for exchanging structured information in the implementation of web services, for example. Lossless compression may allow such documents to be exchanged with improved coding efficiency and without changing their meaning. Strings are often one of the predominant variable-length features in an encoded XML document. In general, predominance indicates that compression of some kind may be worth considering. As with compression generally, however, the resources and objectives of XML document compression may differ according to the circumstances, so having a variety of possible approaches can be helpful.