The amount of information available via computers has increased dramatically with the widespread proliferation of computer networks, the Internet and digital storage means. With this increase has come the need to transmit information quickly and to store it efficiently. Data compression is a technology that facilitates the effective transmission and storage of information.
Data compression reduces the amount of space necessary to represent information, and can be used for many information types. The demand for compression of digital information, including images, text, audio and video, has been ever increasing. Typically, data compression is used with standard computer systems; however, other technologies also make use of data compression, such as, but not limited to, digital and satellite television as well as cellular/digital phones.
As the demand for handling, transmitting and processing large amounts of information increases, the demand for compression of such data increases as well. Although storage device capacity has increased significantly, the demand for information has outpaced capacity advancements. For example, an uncompressed digital picture can require 5 megabytes of space, whereas the same picture can be compressed without loss to require only 2.5 megabytes. Thus, data compression facilitates transferring larger amounts of information. Even with increased transmission rates, such as those of broadband, DSL, cable modem Internet and the like, transmission limits are easily reached with uncompressed information. For example, transmission of an uncompressed image over a DSL line can take ten minutes. However, the same image can be transmitted in about one minute when compressed, thus providing a ten-fold gain in data throughput.
In general, there are two types of compression: lossless and lossy. Lossless compression allows the exact original data to be recovered after compression, while lossy compression allows the data recovered after compression to differ from the original data. A tradeoff exists between the two compression modes in that lossy compression provides a better compression ratio than lossless compression because some degree of data integrity compromise is tolerated. Lossless compression may be used, for example, when compressing critical text, because failure to reconstruct the data exactly can dramatically affect the quality and readability of the text. Lossy compression can be used with pictures or non-critical text where a certain amount of distortion or noise is either acceptable or imperceptible to human senses.
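The difference between the two modes can be illustrated with a minimal Python sketch. It assumes the standard-library zlib module as a stand-in lossless coder and uses a crude bit-dropping step as the lossy stage; the sample data, the quantize helper and the choice of four retained bits are hypothetical and are not taken from any standard discussed here.

```python
import random
import zlib

# Hypothetical 8-bit samples: a slow ramp plus a little noise.
rng = random.Random(0)
samples = bytes((i // 8 + rng.randrange(4)) % 256 for i in range(4096))

# Lossless: decompression returns exactly the original bytes.
lossless_packed = zlib.compress(samples)
restored = zlib.decompress(lossless_packed)

# Lossy (illustrative only): discard the low-order bits of each sample
# before compressing; the discarded detail cannot be recovered.
def quantize(data: bytes, keep_bits: int = 4) -> bytes:
    mask = (0xFF << (8 - keep_bits)) & 0xFF
    return bytes(b & mask for b in data)

lossy_packed = zlib.compress(quantize(samples))
lossy_restored = zlib.decompress(lossy_packed)

# Sizes in bytes: original, lossless, lossy.
print(len(samples), len(lossless_packed), len(lossy_packed))
```

In this sketch the lossless path reproduces every byte, while the lossy path yields a smaller output at the cost of an error of up to 15 levels per sample.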
Bi-level images are quite common in digital document processing, because they offer the potential for a compact representation of black-and-white documents containing text and drawings. The picture elements (pixels) of such images can be seen as coming from a binary source (e.g., white=“0” and black=“1”). Since such images usually contain a lot of white space and repeated ink patterns, one basic approach to encoding them efficiently is to scan them in raster order, e.g., from top to bottom and left to right, and encode each pixel via adaptive arithmetic coding (AC), whose state (or probability table) is controlled by a context formed by the values of the pixels in a small template enclosing previously encoded pixels. That idea is the basis of most modern bi-level image compression systems.
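The context-modeling step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the four-pixel template (west, north-west, north, north-east), the treatment of out-of-image pixels as white and the simple count-based probability estimate are all hypothetical choices. The sketch does not implement an arithmetic coder; it merely sums the ideal code length, -log2(p), that such a coder would approach.

```python
import math

def context_of(img, y, x):
    """Pack four previously coded neighbours (W, NW, N, NE) into a 4-bit context."""
    def px(r, c):
        if 0 <= r < len(img) and 0 <= c < len(img[0]):
            return img[r][c]
        return 0  # assumption: pixels outside the image are white
    return (px(y, x - 1) << 3) | (px(y - 1, x - 1) << 2) \
        | (px(y - 1, x) << 1) | px(y - 1, x + 1)

def estimated_bits(img):
    """Estimate the coded size of a bi-level image, in bits."""
    # One adaptive [count of 0s, count of 1s] pair per context, starting at 1 each.
    counts = [[1, 1] for _ in range(16)]
    total = 0.0
    for y in range(len(img)):            # raster order: top to bottom,
        for x in range(len(img[0])):     # left to right
            ctx = context_of(img, y, x)
            bit = img[y][x]
            p = counts[ctx][bit] / (counts[ctx][0] + counts[ctx][1])
            total += -math.log2(p)       # ideal cost of coding this pixel
            counts[ctx][bit] += 1        # adapt the probability table
    return total

# Hypothetical test image: a 16x16 white page with a black 6x6 square.
page = [[1 if 4 <= y < 10 and 4 <= x < 10 else 0 for x in range(16)]
        for y in range(16)]
print(round(estimated_bits(page), 1), "bits estimated vs", 16 * 16, "raw bits")
```

Because each context keeps its own statistics, large white areas and the interior of ink regions quickly become cheap to code, and most of the cost concentrates at the edges.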
Facsimile images are usually transmitted using the old CCITT standards T.4 and T.6, which are commonly referred to as Group 3 (G3) and Group 4 (G4), respectively. G3 usually encodes images with a modified Huffman (MH) code (i.e., Huffman coding on runs of black or white pixels), and G4 uses “modified modified READ” (MMR) coding. MH and MMR are not as efficient as context-adaptive AC, but are simpler to implement. Over time, G3 and G4 evolved to include encoding via JBIG (Joint Bi-level Image Experts Group, also known as recommendation T.82). JBIG uses context-adaptive AC, with adaptive templates and the efficient QM binary arithmetic encoder. The JBIG-2 standard extends JBIG by including pattern matching for text and halftone data, as well as soft pattern matching (SPM) for lossy encoding. The JB2 encoder is also based on SPM, but uses the Z-coder for binary encoding. JBIG, JBIG-2 and JB2 can provide a significant improvement in compression performance over G4.
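The run-based representation underlying MH coding can be sketched in Python as follows. The sketch covers only the first stage, converting a scan line into alternating white and black run lengths (beginning with a white run, which has length zero when the line starts with a black pixel); the Huffman code tables defined by T.4 are deliberately omitted, and the function names are hypothetical.

```python
def run_lengths(row):
    """Convert a scan line of 0/1 pixels into alternating white/black run lengths."""
    runs = []
    current = 0   # by convention the first run is white (0)
    length = 0
    for pixel in row:
        if pixel == current:
            length += 1
        else:
            runs.append(length)   # close the current run
            current = pixel
            length = 1
    runs.append(length)
    return runs

def expand_runs(runs):
    """Inverse of run_lengths: rebuild the scan line from its run lengths."""
    row = []
    colour = 0
    for length in runs:
        row.extend([colour] * length)
        colour ^= 1               # runs alternate between white and black
    return row

line = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
print(run_lengths(line))          # [3, 2, 4, 1]
print(run_lengths([1, 1, 0]))     # [0, 2, 1]
```

A run-length coder of this kind would then assign shorter codewords to the more frequent run lengths, which is the role of the Huffman tables left out here.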