1. Field of the Invention
The present invention relates to a method, system and program for error recovery while decoding compressed data.
2. Description of the Related Art
Digital images may use one or more bits to describe the color intensity at each pixel. The term “pixel” as used herein refers to one or more intensity inputs or bit values at a data point that represents data to be rendered (i.e., printed, displayed, etc.), where the data to be rendered may include, but is not limited to, images, text, composite images, graphs, collages, scientific data, video, etc. A pel is a picture element point that may be expressed with one bit. If only one bit is used to express the intensity, then the image is a bilevel image where there are two possible intensity values per pixel, such as black and white or full saturation and no intensity. Digital monochrome images that allow for more than two intensities per pixel express the intensities as shades of grey.
Most systems compress image data before transmitting the data to an output device, such as a printer or display, that renders the image data. The output device must decode or decompress the compressed image in order to print or otherwise render it. Compressed images may also be archived and then at some later time transmitted to an output device for decompression and rendering, e.g., printing or displaying. For instance, an Adaptive Bi-Level Image Compression (ABIC) algorithm of the prior art would sequentially encode each bit of image data by using the seven nearest neighbor bits and a probability distribution that is calculated based on previously coded data. In current implementations, the ABIC decoder maintains a history group of bits comprising the last N+2 decoded bits. In certain current implementations, the ABIC decoder uses seven of the bits, including the last two decoded bits and bits in the history range from the (N−2) bit to the (N+2) bit. These are the seven nearest bits in the raster image. Details of using the ABIC algorithm to encode and decode data are described in the IBM publication entitled “A Multi-Purpose VLSI Chip for Adaptive Data Compression of Bilevel Images”, by R. B. Arps, T. K. Truong, D. J. Lu, R. C. Pasco, and T. D. Friedman, IBM J. Res. Develop., Vol. 32, No. 6, pgs. 775-795 (November 1988) and the commonly assigned U.S. Pat. No. 4,905,297, which publication and patent are incorporated herein by reference in their entirety.
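By way of illustration only, the selection of seven neighbor bits from the decoder history described above might be sketched as follows. The sketch assumes that N is the number of bits in one raster line, that the history is held as a list with the most recently decoded bit last, and that missing neighbors at the start of an image are treated as zero. The function name context_index and the order in which the seven bits are packed are hypothetical and are not taken from the referenced publication or patent.

```python
def context_index(history, n):
    """Pack seven neighbor bits from the decoded-bit history into a 7-bit value.

    history: previously decoded bits (0 or 1), most recent bit last.
    n: assumed number of bits in one raster line.
    """
    # Offsets counted backward from the bit about to be decoded: the last
    # two decoded bits, then the (N-2) through (N+2) bits of the history.
    offsets = [1, 2, n - 2, n - 1, n, n + 1, n + 2]
    ctx = 0
    for off in offsets:
        # Assumption: neighbors that do not exist yet are read as zero.
        bit = history[-off] if off <= len(history) else 0
        ctx = (ctx << 1) | bit
    return ctx
```

Under these assumptions the returned value ranges from 0 to 127 and could serve as an index into a table of per-context probability estimates. The sketch also shows why an error is damaging: a single wrong bit in the history changes the context chosen for later bits on the same line and on the following line.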
If an error is encountered, the data used by the decoder to decompress the compressed data, including neighbor bits and a probability distribution, may be corrupted. To recover from an error, the decoder must restart decoding from a known starting point, such as the beginning of the current image being decompressed. Prior art encoding schemes also encode resynchronization data into the data stream to allow decoding to begin at a resynchronization point. For instance, the compression schemes for Group 3 facsimile machines, including Modified Huffman (G3 MH) and Modified READ (G3 MR), which were finalized in the CCITT Study Group XIV in the late 1970s, encode end-of-line (EOL) codes into the data to allow resynchronization from the one-dimensional EOL points.
The G3 MH scheme independently codes horizontal runs of black or white pels, which alternate across a page. Every compressed line of the black/white facsimile image ends with a unique end-of-line (EOL) code consisting of at least ten (or eleven) zeros followed by a one. No valid combination of run codes generates more than nine (or ten) zeros in a row. This EOL code allows for resynchronization after every compressed line. The two-dimensional G3 MR algorithm encodes each line with an EOL code followed by a tag bit specifying whether the next line is coded in one or two dimensions.
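As a minimal sketch of how a receiver might locate such a resynchronization point, the following routine scans a sequence of bits for a run of zeros of at least a given length followed by a one. The function name find_eol, the representation of the data as a list of 0/1 integers, and the default threshold of eleven zeros are assumptions made for illustration only.

```python
def find_eol(bits, start=0, min_zeros=11):
    """Return the index of the bit just after the first EOL code found at or
    after `start`, or -1 if no EOL code is present.

    An EOL code is taken to be at least `min_zeros` zero bits followed by a one.
    """
    zeros = 0
    for i in range(start, len(bits)):
        if bits[i] == 0:
            zeros += 1
        else:
            if zeros >= min_zeros:
                return i + 1  # decoding can resume with the next bit
            zeros = 0         # run too short, so it was ordinary run-code data
    return -1
```

Because no valid combination of run codes produces a run of zeros this long, a decoder that has lost its place can discard bits until find_eol reports a match and then resume decoding at the returned position.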
These early Group 3 digital facsimile machines had no error correction, and the receiver could not request a retransmission. The receiver could, however, resynchronize and recover from errors at the next one-dimensionally coded line. Because the standard size facsimile page had 1728 pels/line (i.e., 216 bytes/line), this synchronization occurred quite frequently. Further, there was no standardized technique for handling incorrect lines. Some machines printed the bad data, generating streaks across the page. Other machines skipped the erroneous lines and output squished lines of text. Still other machines replicated the previous line in order to maintain consistent character height.
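The line-skipping and line-replicating behaviors described above can be sketched as follows, purely for illustration. The sketch assumes that each decoded line is a bytes object of 216 bytes (1728 pels at one bit per pel) and that a line found to be erroneous is represented by None. The names conceal, "skip" and "replicate" are hypothetical. The third behavior, printing the bad data as received, needs no special handling and is not modeled.

```python
PELS_PER_LINE = 1728
BYTES_PER_LINE = PELS_PER_LINE // 8  # 216 bytes at one bit per pel


def conceal(lines, strategy):
    """Return the output lines after handling erroneous lines (None entries).

    strategy "skip": drop erroneous lines, which shortens the page.
    strategy "replicate": repeat the previous good line to keep the height.
    """
    out = []
    blank = bytes(BYTES_PER_LINE)  # assumption: all-zero line if no prior line
    for line in lines:
        if line is not None:
            out.append(line)
        elif strategy == "skip":
            continue
        elif strategy == "replicate":
            out.append(out[-1] if out else blank)
        else:
            raise ValueError("unknown strategy: " + strategy)
    return out
```

With the "skip" strategy the output has fewer lines than the input, which corresponds to the squished text noted above, whereas the "replicate" strategy preserves the line count and therefore the character height.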
The CCITT Group 4 digital facsimile machines developed in the 1980s utilized the Modified Modified READ (G4 MMR) data compression algorithm. Instead of periodically coding lines one-dimensionally, G4 MMR applies the G3 two-dimensional coding scheme to every line, without any EOLs. Since these machines were designed for use on digital data networks, the transmission was expected to be error-free, so error recovery resynchronization codes are not encoded into the data during compression.
The Joint Photographic Experts Group (JPEG) international data compression standard designed for continuous-tone (contone) pictures provides for optional resynchronization codes that may be encoded into the data. These resynchronization codes are defined as Restart Markers (RSTm, 0xFFD0-0xFFD7) and can be used to separate independently coded blocks of data. The Define Restart Interval (DRI, 0xFFDD) marker specifies how many blocks are coded between Restart Markers. If Restart Markers are not encoded into the data, then decoding must restart at the beginning of the JPEG image, from the Start of Scan marker.
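For illustration, the following sketch locates Restart Markers in a buffer of JPEG entropy-coded data and reads the restart interval from a DRI marker segment. The function names find_restart_markers and parse_dri are hypothetical. The sketch relies on the marker values given above and on the JPEG convention that a 0xFF data byte within entropy-coded data is followed by a stuffed 0x00 byte, so that 0xFF followed by 0xD0 through 0xD7 can only be a marker.

```python
def find_restart_markers(data):
    """Return a list of (offset, m) pairs, one for each RSTm marker in data."""
    found = []
    i = 0
    while i + 1 < len(data):
        if data[i] == 0xFF and 0xD0 <= data[i + 1] <= 0xD7:
            found.append((i, data[i + 1] - 0xD0))
            i += 2
        else:
            i += 1
    return found


def parse_dri(segment):
    """Return the restart interval from a DRI segment beginning at its marker.

    The segment is the two marker bytes 0xFF 0xDD, a two-byte length field
    equal to 4, and a two-byte big-endian restart interval.
    """
    if len(segment) < 6 or segment[0] != 0xFF or segment[1] != 0xDD:
        raise ValueError("not a DRI segment")
    if int.from_bytes(segment[2:4], "big") != 4:
        raise ValueError("unexpected DRI segment length")
    return int.from_bytes(segment[4:6], "big")
```

A decoder that detects an error could, under these assumptions, skip forward to the next offset reported by find_restart_markers, reset its predictors, and resume decoding from the independently coded data that follows that marker.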
Thus, with all the above techniques, resynchronization codes are encoded into the actual compressed data to allow for error recovery while decoding at a point within the compressed data. Notwithstanding, there is a continued need in the art for improved techniques for allowing for error recovery during digital data transmissions.