Digital data compression is a technology that has attracted heightened interest in recent years, in part as a consequence of the growing use of personal computers and workstations with high resolution graphic display systems. The volume of digital data used to represent video information, as well as the speed with which that data must be compressed and decompressed in the course of storage, transmission or physical rendering, has motivated significant investigation into technologies relating to data compression. In data storage applications, the use of data compression increases the effective storage capacity of hard disk and CD drive systems, and is particularly important for compact non-volatile solid state storage devices such as PCMCIA cards. In the area of networks and communications, the transmission of data in compressed form increases the effective bandwidth of the media. In the field of physical rendering, hardware constraints often require that images be stored for access as complete units; laser printers are one example, in which data compression permits the use of smaller buffer memories to store full images.
A variety of lossless data compression methods and systems exist. The selection of a preferred approach is not always clear, in that one must consider the data compression ratio, the robustness of the method across different data types, the complexity of the effective algorithm, whether the technique is amenable to hardware or software implementation, and the effective speeds of encoding and decoding when compared to the needs of the application. Symmetry or asymmetry in the compression and decompression of the method chosen is particularly important, in that the symmetry of the method should match the operational capabilities of the system. For example, if compression is being accomplished in the context of a hard disk drive, and the read operation from the hard disk drive is performed at four times the rate of the write operation, then the decompression associated with the selected compression technique should ideally be correspondingly faster than the compression.
A data compression algorithm which has proven to be quite popular was first described in an article entitled "A Universal Algorithm for Sequential Data Compression" by authors Lempel and Ziv, which appeared in the IEEE Transactions on Information Theory, Vol. IT-23, No. 3, pp. 337-343, 1977, and is generally referred to as the LZ-1 data compression algorithm. The LZ-1 algorithm has been refined in various respects by subsequent investigators, examples being the variants described in U.S. Pat. Nos. 5,003,307 and 5,146,221, the subject matter of which is incorporated by reference herein.
The fundamental concepts which characterize these and other versions of the basic LZ-1 algorithm involve the use of a buffer to store received data and to identify matches between newly received strings of data and previously received and processed strings of data. Thereby, new strings of data, typically sequences of bytes representing alphanumeric characters, which match preceding strings can be encoded simply by reference to those prior strings, using just location and length data incorporated into what are commonly known as tokens. The LZ-1 algorithm is dynamic in that new data is entered into the buffer which stores the earlier data after the comparison and encoding of the new data is completed. The buffer is analogous to a sliding window over a data stream, in which the new data characters are always compared to previously received characters within the length of the window. The encoded output is either a raw/literal token, indicating no compression, or a compressed/string token, the latter providing a length and an offset identifying the previously existing matching character string within the window. In many cases, the algorithm becomes increasingly effective as the size of the window increases and as the repetition rate of the patterns in the data characters within the window increases.
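The sliding-window scheme described above can be sketched as follows. This is a minimal illustration of the general LZ-1 (LZ77) approach, not the apparatus of any cited patent; the window size, minimum match length, and token representation are illustrative assumptions.

```python
WINDOW = 4096   # sliding-window length (assumed for illustration)
MIN_MATCH = 3   # shortest string worth a compressed token (assumed)

def lz1_compress(data: bytes):
    """Emit ('lit', byte) raw tokens or ('str', offset, length) string tokens."""
    tokens = []
    i = 0
    while i < len(data):
        start = max(0, i - WINDOW)
        best_len, best_off = 0, 0
        # Compare the new data against every position in the window,
        # keeping the longest match (matches may overlap position i).
        for j in range(start, i):
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= MIN_MATCH:
            tokens.append(('str', best_off, best_len))  # offset/length token
            i += best_len
        else:
            tokens.append(('lit', data[i]))             # raw/literal token
            i += 1
    return tokens

def lz1_decompress(tokens):
    """Rebuild the data stream; string tokens copy from already-decoded output."""
    out = bytearray()
    for t in tokens:
        if t[0] == 'lit':
            out.append(t[1])
        else:
            _, off, length = t
            for _ in range(length):  # byte-by-byte copy handles overlapping matches
                out.append(out[-off])
    return bytes(out)
```

The decompressor's simplicity relative to the search loop in the compressor illustrates the asymmetry discussed below: decompression is a sequence of direct copies, while compression must locate the matches.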
More recent refinements directed toward increasing the speed of compression of the LZ-1 algorithm are described in copending patent applications. For example, U.S. patent application Ser. No. 08/290,451, the subject matter of which is incorporated by reference herein, relates to the use of a content addressable memory (CAM) to accelerate the comparison of new data with previously received strings of data.
Another technique, using a character history bit pattern memory to accelerate the compression operation, is described in U.S. patent application Ser. No. 08/173,738, the subject matter of which is incorporated by reference herein. The character history bit pattern memory implementation utilizes marker bits within the memory to identify matches between new data and previously received data. The speed of data compression is further accelerated through the use of the apparatus described in U.S. patent application Ser. No. 08/355,865, the subject matter of which is incorporated herein by reference, where the shifting operations performed with reference to the character history bit pattern buffer are rearranged for faster software implemented shifting. The focus of all these refinements on the compression side of the LZ-1 technique is attributable to the fact that the LZ-1 technique is highly asymmetric, in that decompression is very easy and fast, in software or hardware, as compared to compression. Thus, when the speed and efficiency of compression are improved, the LZ-1 technique provides tremendous potential in the overall compression and decompression operation. The LZ-1 method is already highly desirable in that its asymmetry matches the operational characteristics of hard disk drives and CD drives.
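One plausible reading of the marker-bit idea is a shift-and-AND search: a bit pattern is kept for each character value marking its positions in the window, and candidate matches are extended one character at a time by shifting and intersecting. The sketch below is an assumed reconstruction for illustration only, not the apparatus of the cited applications.

```python
def find_match(window: bytes, lookahead: bytes):
    """Return (start, length) of the longest prefix of `lookahead`
    occurring in `window`, or (0, 0) when no character matches."""
    if not window or not lookahead:
        return (0, 0)
    # One marker-bit pattern per byte value: bit p set => window[p] == value.
    occurs = {}
    for pos, b in enumerate(window):
        occurs[b] = occurs.get(b, 0) | (1 << pos)
    # Bit p of `candidates` marks a partial match ending at window position p.
    candidates = occurs.get(lookahead[0], 0)
    best_len, best_end, length = 0, 0, 0
    while candidates:
        length += 1
        best_len, best_end = length, candidates.bit_length() - 1
        if length == len(lookahead):
            break
        # Shift-and-AND: extend every surviving candidate by one character.
        candidates = (candidates << 1) & occurs.get(lookahead[length], 0)
    if best_len == 0:
        return (0, 0)
    return (best_end - best_len + 1, best_len)
```

All candidate positions are tested in parallel within machine-word (here, Python big-integer) operations, which is why such marker-bit schemes lend themselves to fast hardware or software shifting.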
Though the present invention has broad applicability, an area of particular and motivating concern is in small laser printers. Printer manufacturers constantly strive to differentiate their products from the competition in three main areas, these being price, resolution and printing speed. A fundamental attribute of all laser printers is the xerographic process, an image formation process which is best not interrupted. The data for the complete image to be physically rendered by the printer must be available during the brief discharge interval of each page printing cycle. Therefore, such printers require that the bit map for the full page be resident in memory for uninterrupted flow to the discharge photoconductor. Uncompressed, each pixel position of the image requires a single bit of memory. For contemporary 600 dot per inch (dpi) printers this means a memory of approximately 34M bits for each uncompressed page stored. With the trend toward 1200 dpi printer resolutions, that memory requirement increases by a factor of 4.
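The buffer figures above can be checked with a short calculation, assuming a US-letter (8.5 x 11 inch) page; the page size is an assumption, as the text does not state it.

```python
def page_bits(dpi: int, width_in: float = 8.5, height_in: float = 11.0) -> int:
    """Bits needed for an uncompressed 1-bit-per-pixel page image."""
    # Pixel dimensions scale linearly with dpi in each axis,
    # so total bit count scales with the square of the dpi.
    return int(width_in * dpi) * int(height_in * dpi)

bits_600 = page_bits(600)    # 5100 x 6600 pixels = 33,660,000 bits, ~34M bits
bits_1200 = page_bits(1200)  # doubling the dpi quadruples the requirement
```

At 600 dpi the result is 33,660,000 bits, matching the approximately 34M bits stated above, and at 1200 dpi it is exactly four times larger.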
Given the tremendous price competition in this industry, and the trend to higher resolutions and printing speed, as reflected in the number of pages that must be stored, there has evolved a specific need for apparatus and methods which efficiently and quickly compress and decompress bit mapped images of the type used with printers. With the trend toward color rendering, the data volume per pixel position will obviously increase in proportion to the color resolution.
Though the asymmetry of the LZ-1 technique makes it particularly desirable from a decompression speed standpoint, investigation has shown that it does not efficiently compress binary bit mapped data characterizing images of the types conventionally printed in commercial applications. What is needed, therefore, are systems and methods which exhibit the high compression efficiency and decompression simplicity of the LZ-1 algorithm, implemented so as to efficiently compress bit mapped image data.