Data compression, also known as source coding or, mostly in communication systems, as bit-rate reduction, is the process of encoding data in such a way that its size, typically expressed in bits, is smaller than the original size. Data decompression is the process of reconstructing the original data from the compressed data.
The main benefits of data compression are reduced storage size for the data and faster data communication. The main drawbacks are the computing resources needed for compression and decompression and, in communication systems, increased latency.
Data compression is divided into lossy and lossless types. In lossy compression, the reconstructed data is not necessarily identical to the original data; such techniques are useful in applications where some inaccuracy is acceptable, for example in image compression. In contrast, in lossless compression, every bit of the reconstructed data must be identical to the corresponding bit of the source data.
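The difference between the two types can be illustrated with a short Python sketch. The lossless half uses the standard zlib module, chosen here only as a convenient example of a lossless codec. The lossy half is a toy quantizer with made-up sample values, not a real image codec; it simply discards the four low-order bits of each sample.

```python
import zlib

original = b"abracadabra " * 50

# Lossless: a zlib round trip must reproduce every byte of the input.
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

# Lossy (toy example): keep only the 4 high-order bits of each 8-bit sample.
# The discarded low-order bits cannot be recovered on reconstruction.
samples = [3, 18, 37, 200, 255]               # hypothetical 8-bit samples
quantized = [s >> 4 for s in samples]         # 4 bits stored per sample
reconstructed = [q << 4 for q in quantized]   # approximation of the samples

print(len(original), len(compressed), restored == original)
print(samples, reconstructed)
```

In the lossless case the restored bytes compare equal to the input, while in the lossy case each reconstructed sample may differ from its original by up to 15.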
A cornerstone technique for lossless data compression is Huffman compression, described by D. A. Huffman in “A Method for the Construction of Minimum-Redundancy Codes,” Proceedings of the IRE, 40 (9), pages 1098-1101, 1952.
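As a rough illustration of the idea behind Huffman compression, the following Python sketch builds a prefix code in which frequent symbols receive shorter codewords, by repeatedly merging the two least frequent subtrees. The function names (huffman_code, encode, decode) and the representation of codewords as strings of "0" and "1" characters are choices made for this example, not something taken from the cited paper.

```python
import heapq
from collections import Counter

def huffman_code(data):
    """Return a dict mapping each symbol in `data` to its bit string."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps the heap from ever comparing dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(data, code):
    """Concatenate the codewords of all symbols in `data`."""
    return "".join(code[s] for s in data)

def decode(bits, code):
    """Decode a bit string; works because no codeword prefixes another."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

text = "abracadabra"
code = huffman_code(text)
bits = encode(text, code)
print(code, len(bits), decode(bits, code) == text)
```

A practical encoder would also have to store or transmit the code table (or the symbol frequencies) alongside the bits, and would pack the bits into bytes; both are omitted here for brevity.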
Another common compression technique is Lempel-Ziv (LZ) compression, wherein the encoding dictionary, often represented as a tree, is built and refined while the source data is read.
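The adaptive behaviour described above, in which the coding structure grows while the input is read, can be sketched with LZ78, one member of the Lempel-Ziv family. This is a minimal illustration under stated assumptions: the function names are hypothetical, the phrases are held in a flat Python dict rather than an explicit tree, and the output is left as a list of (phrase index, next character) pairs instead of being packed into bits.

```python
def lz78_compress(text):
    """Encode `text` as a list of (phrase_index, next_char) pairs."""
    dictionary = {"": 0}   # phrase -> index; index 0 is the empty phrase
    output = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            # Keep extending the longest phrase seen so far.
            phrase += ch
        else:
            # Emit the known prefix plus one new character, then learn it.
            output.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit it with no next character.
        output.append((dictionary[phrase], ""))
    return output

def lz78_decompress(pairs):
    """Rebuild the text, re-creating the same phrase list as the encoder."""
    phrases = [""]
    out = []
    for index, ch in pairs:
        entry = phrases[index] + ch
        out.append(entry)
        phrases.append(entry)
    return "".join(out)

pairs = lz78_compress("abababab")
print(pairs)
print(lz78_decompress(pairs))
```

Note that the decoder never receives the dictionary; it reconstructs the same phrases in the same order from the pairs alone, which is what allows the structure to be built on the fly on both sides.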