The amount of information available via computers has increased dramatically with the widespread proliferation of computer networks, the Internet and digital storage means. With this increase has come the need to transmit information quickly and to store it efficiently. Data compression is a technology that facilitates the effective transmission and storage of information.
Data compression reduces the amount of space necessary to represent information, and can be used for many information types. The demand for compression of digital information, including images, text, audio and video, continues to grow. Typically, data compression is used with standard computer systems; however, other technologies also make use of data compression, such as, but not limited to, digital and satellite television as well as cellular/digital phones.
As the demand for handling, transmitting and processing large amounts of information increases, the demand for compression of such data increases as well. Although storage device capacity has increased significantly, the demand for information has outpaced capacity advancements. For example, an uncompressed image can require 5 megabytes of space, whereas the same image, once compressed, can require only 2.5 megabytes. Thus, data compression facilitates transferring larger amounts of information. Even with the increase in transmission rates, such as broadband, DSL, cable modem Internet and the like, transmission limits are easily reached with uncompressed information. For example, transmission of an uncompressed image over a DSL line can take ten minutes. However, the same image can be transmitted in about one minute when compressed, thus providing a ten-fold gain in data throughput.
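The arithmetic behind these figures can be sketched as follows. This is a minimal illustration only: the function names and the 1 Mbit/s link rate used in the comments are assumptions chosen for the example and are not taken from the text above.

```python
def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Return how many times smaller the compressed data is than the original."""
    return original_bytes / compressed_bytes


def transfer_seconds(size_bytes: int, link_bits_per_second: float) -> float:
    """Return the idealized time to send size_bytes over a link of the given rate.

    Protocol overhead and latency are ignored (an assumption of this sketch).
    """
    return size_bytes * 8 / link_bits_per_second


# The storage example: 5 megabytes reduced to 2.5 megabytes is a 2:1 ratio.
# compression_ratio(5_000_000, 2_500_000) -> 2.0

# For a fixed link rate, transfer time scales with size, so a ten-fold
# reduction in size gives a ten-fold reduction in transfer time.
# Hypothetical 1 Mbit/s link:
# transfer_seconds(5_000_000, 1_000_000) -> 40.0
# transfer_seconds(500_000, 1_000_000)   -> 4.0
```

Because transfer time is proportional to size at a fixed link rate, the ten-minute versus one-minute example corresponds to roughly a 10:1 compression ratio.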
In general, there are two types of compression: lossless and lossy. Lossless compression allows the exact original data to be recovered after compression, while lossy compression allows the recovered data to differ from the original data. A tradeoff exists between the two compression modes in that lossy compression provides a better compression ratio than lossless compression because some degree of data loss is tolerated. Lossless compression may be used, for example, when compressing critical text, because failure to reconstruct the data exactly can dramatically affect the quality and readability of the text. Lossy compression can be used with images or non-critical text where a certain amount of distortion or noise is either acceptable or imperceptible to human senses.
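The distinction can be illustrated with a small sketch. It assumes Python's standard zlib module as the lossless coder and uses simple bit-depth quantization as a stand-in for a lossy scheme; neither choice comes from the text above, and the function names are hypothetical.

```python
import zlib


def lossless_roundtrip(data: bytes) -> bytes:
    """Compress and decompress with zlib; the result equals the input exactly."""
    return zlib.decompress(zlib.compress(data))


def lossy_roundtrip(samples: bytes, keep_bits: int = 4) -> bytes:
    """Quantize 8-bit samples to keep_bits bits, compress, then reconstruct.

    Dropping the low-order bits discards information, so the output only
    approximates the input. Quantization is an illustrative stand-in for a
    lossy coder, not a production method.
    """
    if not 1 <= keep_bits <= 8:
        raise ValueError("keep_bits must be between 1 and 8")
    shift = 8 - keep_bits
    half_step = (1 << shift) >> 1  # midpoint of each quantization bin
    quantized = bytes(s >> shift for s in samples)
    restored = zlib.decompress(zlib.compress(quantized))
    return bytes((q << shift) | half_step for q in restored)
```

With the default of four retained bits, each reconstructed sample differs from the original by at most 8 out of 255. The quantized data contains fewer distinct values and so often compresses to fewer bytes, which is the tradeoff described above.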
Data compression is especially applicable to digital representations of documents (digital documents). Typically, digital documents include text, images, or both. In addition to using less storage space for current digital data, compact storage without significant degradation of quality would encourage digitization of existing hardcopy documents, making paperless offices more feasible. Striving toward such paperless offices is a goal for many businesses because paperless offices provide benefits such as allowing easy access to information, reducing environmental costs, reducing storage costs and the like. Furthermore, decreasing file sizes of digital documents through compression permits more efficient use of Internet bandwidth, thus allowing for faster transmission of more information and a reduction of network congestion. Reducing required storage for information, movement toward efficient paperless offices, and increasing Internet bandwidth efficiency are just some of the many significant benefits associated with compression technology.
Compression of digital documents should satisfy certain goals in order to make the use of digital documents more attractive. First, the compression should enable compressing and decompressing large amounts of information in a small amount of time. Second, the compression should provide for accurate reproduction of the digital document.
One important aspect of compression of digital documents is compression of color bitmaps, such as the image generated by scanning a page of a printed catalog. A typical application is electronic archival of catalog pages. Such pages usually contain a mixture of content, such as color pictures, text on a flat-color background, or text superimposed on textures or pictures. Although any color picture bitmap compression technique, such as JPEG, can be used, better reconstruction quality can be obtained by segmenting the original image bitmap into layers and compressing each layer separately.
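The idea of layered compression can be sketched as follows. This is a simplified illustration under stated assumptions, not the segmentation method of any particular system: it uses a grayscale bitmap rather than color, a fixed brightness threshold to decide which pixels count as text, and zlib as the coder for every layer. All function names and the threshold value are hypothetical.

```python
import zlib


def segment_into_layers(pixels, threshold=128):
    """Split a grayscale bitmap (rows of 0-255 ints) into three layers.

    mask:       1 where a pixel is darker than threshold (assumed to be text)
    foreground: pixel values at mask positions, 0 elsewhere
    background: pixel values at non-mask positions, 0 elsewhere
    """
    mask, foreground, background = [], [], []
    for row in pixels:
        mask_row = [1 if p < threshold else 0 for p in row]
        mask.append(mask_row)
        foreground.append([p if m else 0 for p, m in zip(row, mask_row)])
        background.append([0 if m else p for p, m in zip(row, mask_row)])
    return mask, foreground, background


def compress_layer(layer) -> bytes:
    """Compress one layer on its own (zlib here; any coder could be substituted)."""
    return zlib.compress(bytes(v for row in layer for v in row))


def recombine(mask, foreground, background):
    """Rebuild the bitmap by selecting foreground or background per mask bit."""
    return [
        [f if m else b for m, f, b in zip(mask_row, fg_row, bg_row)]
        for mask_row, fg_row, bg_row in zip(mask, foreground, background)
    ]
```

Keeping the layers separate lets each one be handled by a coder suited to its content, for example a bi-level coder for the mask and a picture coder for the background. In this sketch the recombined bitmap matches the input exactly because zlib is lossless; a practical system would typically trade some fidelity in the picture layers for a smaller size.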