Data compression has recently become a very important topic due to the increasing need for data communication. This growth requires massive amounts of data to be stored or communicated between systems. Data compression techniques are therefore required to reduce both the communication time and the storage requirements of a system.
Data compression techniques are used for storage and communication in any field where data transfer or storage places a premium on speed and storage efficiency. Modern applications of data compression include, but are not limited to, voice, picture, video, and encryption applications. Since the move from the analog to the digital age, data compression and decompression have been used in cable and wireless data communications, general communications, encrypted data storage, and encrypted communication.
As early as 1836, Samuel Morse developed Morse code, a communication coding system in which the letters of the alphabet are represented by short sequences of two signal elements, dots and dashes. Morse code is considered to be one of the foundations of today's data compression theories because it assigns shorter codes to more frequent letters. As an example, the letter “E”, which is the most frequently used vowel in English, is represented by a single dot, whereas the letter “T”, which is the most frequently used consonant in English, is represented by a single dash.
Data compression studies continued into the twentieth century. In 1948, Claude Shannon published an article entitled “A Mathematical Theory of Communication”, and the coding technique described there, together with closely related work by Robert Fano, became known as the Shannon-Fano technique.
The second half of the twentieth century saw leaps in data compression in each decade. In 1951, David A. Huffman started exploring the data compression field as a doctoral student at the Massachusetts Institute of Technology (MIT). Huffman described his data compression method in an article published in 1952. Huffman's method is a foundation of lossless data compression based on entropy coding. In this method, the data elements used in a data chain are statistically counted. Then, the data elements are replaced by codes that are assigned according to frequency of occurrence. In other words, those elements in the data chain that occur the most are represented by shorter codes, while those elements that seldom occur are represented by longer codes. These shorter and longer codes are combined to produce an encoded string, which is shorter than the original data string and can be converted back to the original data string.
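The counting and code-assignment steps described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the form in which Huffman presented his method: the function names `huffman_codes`, `encode`, and `decode` are hypothetical, and the output is a string of "0" and "1" characters rather than packed bits.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Return a dict mapping each symbol in `data` to a bit string."""
    freq = Counter(data)  # statistical count of each data element
    if not freq:
        return {}
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs a 1-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie_breaker = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie_breaker, merged))
        tie_breaker += 1
    return heap[0][2]


def encode(data, codes):
    """Concatenate the code of every symbol in `data`."""
    return "".join(codes[s] for s in data)


def decode(bits, codes):
    """Invert `encode`; works because no code is a prefix of another."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

For the string "abracadabra", the frequent symbol "a" receives a 1-bit code and the rarer symbols receive longer codes, so the encoded form takes 23 bits instead of the 88 bits needed at 8 bits per character, and `decode` recovers the original string.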
As time progressed, data compression algorithms moved from relying on specialized hardware to being more general-purpose. By the late 1970s, most files stored within a network were stored using data compression algorithms that employed Huffman coding. However, another advance in data compression occurred when Abraham Lempel and Jacob Ziv disclosed dictionary-based coding in 1977. The algorithm developed by Lempel and Ziv is known as “LZ77”; a later variant refined by Terry Welch in 1984 is called “LZW”, and algorithms of the Lempel-Ziv family were used in most general-purpose data compression applications. These techniques are still used in data compression applications such as PKZIP and other modern applications.
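As a rough illustration of the dictionary-based idea behind LZW, the following Python sketch builds its dictionary of previously seen byte sequences on the fly and emits one integer code per longest known sequence. It is a simplified sketch under stated assumptions: the function names are hypothetical, the dictionary grows without limit, and the codes are returned as a list of integers rather than packed into variable-width bit fields as practical implementations do.

```python
def lzw_compress(data):
    """Compress `data` (bytes) into a list of integer codes."""
    table = {bytes([i]): i for i in range(256)}  # all single bytes
    next_code = 256
    current = b""
    out = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate  # keep extending the known sequence
        else:
            out.append(table[current])
            table[candidate] = next_code  # learn the new sequence
            next_code += 1
            current = bytes([byte])
    if current:
        out.append(table[current])
    return out


def lzw_decompress(codes):
    """Rebuild the original bytes from a list of LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)
```

The decompressor rebuilds the same dictionary from the code stream alone, so no dictionary has to be stored or transmitted with the compressed data.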
By the end of the 1980s, several data compression standards for digital video existed. In the early 1990s, video data compression algorithms existed, but with low resolution and/or low color fidelity. The lower resolution and lower color fidelity were a result of losing part of the data chain during compression. Some examples of current image and video compression algorithms are: 1) FAX CCITT Group 3, which uses Huffman coding; 2) GIF, which uses LZW, and JPEG, which uses the discrete cosine transform (which results in data loss and is complemented by Huffman or arithmetic coding); 3) BMP, which uses run-length coding; and 4) TIFF algorithms that are used in black-and-white facsimile machines.
Presently, data compression for a given application can be accomplished either by using known methods by themselves or by cascading several methods. When used alone, a data compression method has to be applied to a specific field in order to produce the best results. For example, a data compression method that may be optimum for storage files may not be a good method for audio compression. Likewise, a data compression method that produces satisfactory results for video data may not be suitable for compressing storage files. Different data compression methods are therefore usually used in combination, and repeatedly, to increase the efficiency of data compression and to ensure good compression across a variety of fields of technology.
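One widely available example of cascaded methods is the DEFLATE format, which combines LZ77-style dictionary matching with Huffman coding and is exposed in Python through the standard `zlib` module. The sketch below is only an illustration, with a hypothetical helper `compression_ratio` and made-up sample inputs; it suggests how the same method can perform very differently depending on the kind of data it is applied to.

```python
import random
import zlib


def compression_ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Highly repetitive, text-like input (an assumed sample, not real data).
text_like = b"the quick brown fox jumps over the lazy dog. " * 200

# Pseudo-random bytes of the same length, seeded for repeatability.
rng = random.Random(0)
noise_like = bytes(rng.getrandbits(8) for _ in range(len(text_like)))

text_ratio = compression_ratio(text_like)
noise_ratio = compression_ratio(noise_like)

print(f"text-like ratio:  {text_ratio:.3f}")
print(f"noise-like ratio: {noise_ratio:.3f}")
```

The repetitive input shrinks to a small fraction of its size, while the pseudo-random input does not shrink at all, which is consistent with the observation that a method suited to one kind of data may be a poor fit for another.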