I. Field of the Invention
In general, the present invention relates to differential pulse code modulation (DPCM) data compression. In particular, the invention relates to an apparatus and method for modelling picture element ("pixel") data for entropy encoding.
II. Prior and Contemporaneous Technology
Oftentimes, information relating to a recorded image is to be stored or communicated. For example, in teleconferencing, successive graylevel images are to be sent rapidly and with clarity over communication links to viewers at a distant location. In the banking industry, information on bank checks is to be stored so that it may be retrieved at some later time. In facsimile and related environments, text and/or graphic images are to be communicated from one location to another.
In these various environments, the image is normally converted into a coded form so that it may be retained in limited storage and/or may be conveyed rapidly.
In the digital coding process, it is well-known to define the image at a given instant as a plurality of picture elements (referred to as "pixels" or "pels"), each of which represents a particular portion of the image. Accordingly, the image may be viewed as m lines of n pixels/line. Collectively, the lines of pixels represent the image.
Each pixel, it is noted, has a corresponding graylevel--or darkness level. One way of coding the information contained in an image is to scan the pixels line-by-line and identify the graylevel for each pixel. For example, suppose the upper left pixel is identified as X_{1,1} where the first subscript corresponds to the line number and the second subscript corresponds to the pixel in the line. The second pixel in the first line is then X_{1,2}. If there are 480 lines and 512 pixels/line, an image for a given instant can be represented by information gathered by scanning the 480 × 512 pixels.
Each pixel typically has a graylevel corresponding thereto, ranging between a black value (e.g., 0) and a white value (e.g., 255). That is, given 8 bits, the graylevel of a pixel can have any of 256 values. To represent an image, the image may be scanned in a prescribed manner with the values of one pixel after another being recorded. For example, proceeding line-by-line, an image can be represented by the successively recorded values of pixels X_{1,1}, X_{1,2}, . . . , X_{480,512}.
In some instances, a top-to-bottom scan of the image is referred to as a "field" and a plurality of fields are interlaced to form a "frame". For example, one field may comprise the odd-numbered lines which are scanned first and a second field may comprise the even-numbered lines which are scanned thereafter. The two fields together form a single "frame".
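By way of illustration only, the relationship between fields and a frame may be sketched as follows. The function names `split_fields` and `interlace` are hypothetical, and the frame is assumed to be a simple list of lines in which the first list entry is line 1.

```python
def split_fields(frame):
    """Split a frame (a list of lines) into two fields.

    Lines are numbered from 1, so frame[0] is line 1 (odd-numbered).
    """
    odd_field = frame[0::2]   # lines 1, 3, 5, ...
    even_field = frame[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


def interlace(odd_field, even_field):
    """Rebuild the frame by alternating lines from the two fields."""
    frame = []
    for i in range(len(odd_field) + len(even_field)):
        source = odd_field if i % 2 == 0 else even_field
        frame.append(source[i // 2])
    return frame


# A toy frame of five lines, two pixels per line.
frame = [[10, 11], [20, 21], [30, 31], [40, 41], [50, 51]]
odd_field, even_field = split_fields(frame)
restored = interlace(odd_field, even_field)
```

In this sketch `restored` equals the original `frame`, reflecting that the two fields together carry every line of the frame.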
The above straightforward approach requires a large number of bits for each image to be recorded. The large number of bits can make the storing and/or rapid conveying of data impractical where storage space is limited or rapid data transfer is required.
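The size of the uncompressed representation can be worked out from the figures given above (480 lines, 512 pixels/line, 8 bits per graylevel). The helper name `raw_image_bits` is hypothetical and is used only for this arithmetic.

```python
def raw_image_bits(lines: int, pixels_per_line: int, bits_per_pixel: int) -> int:
    """Bits needed when every graylevel is recorded verbatim, pixel after pixel."""
    return lines * pixels_per_line * bits_per_pixel


bits = raw_image_bits(480, 512, 8)
print(bits)       # 1966080 bits per image
print(bits // 8)  # 245760 bytes per image
```

Roughly two million bits are thus needed for each image before any compression is applied.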
To reduce the number of required bits, a number of data compression techniques have been taught.
One technique of data compression is referred to as "entropy coding". In entropy coding, the number of bits used in representing events is intended to be inversely related to event probability. More probable events are represented by relatively short codewords whereas less probable events are represented by relatively longer codewords.
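As a numerical illustration, the standard information-theoretic measure assigns an event of probability p an ideal codeword length of -log2(p) bits. That formula is not stated in the text above and is supplied here as an assumption. The function names and the four-event probability set below are likewise hypothetical.

```python
import math


def ideal_code_length(probability: float) -> float:
    """Ideal codeword length, in bits, for an event of the given probability."""
    return -math.log2(probability)


def average_code_length(probabilities) -> float:
    """Expected bits per event when each event gets its ideal length."""
    return sum(p * ideal_code_length(p) for p in probabilities if p > 0)


print(ideal_code_length(0.5))    # 1.0 (a likely event gets a short codeword)
print(ideal_code_length(0.125))  # 3.0 (a less likely event gets a longer one)

# Four events with unequal probabilities average fewer than the 2 bits
# that a fixed-length code for four events would need.
print(average_code_length([0.5, 0.25, 0.125, 0.125]))  # 1.75
```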
To perform entropy coding, an entropy coder typically receives two inputs. The first input is a decision and the second input is a state input which provides a context for the decision input. For example, a binary decision input may represent a heads or tails event for a coin toss; or an ON or OFF condition for a switch; or a 1 or 0 value of a bit in a string. The state input may--usually based on history, theory, or estimate--provide some contextual index which suggests how the decision input is to be processed. For example, in an image in which a pixel may be either black or white, different neighborhoods of the pixel may have different likelihoods of the pixel therein being white. That is, each neighborhood has a respective estimated black-white probability ratio associated therewith. Hence, to provide meaning to the decision input, a state input is furnished to reflect the neighborhood corresponding to the decision input. Based on the state input, the entropy coder transforms the decision input into a codeword of appropriate length.
The state input to the entropy coder is the result of modelling, i.e. defining the contexts under which codewords are assigned to decisions. A well-known example is taught in the literature as Markov states. The efficiency of the entropy encoder depends on the quality of the modelling--that is, how well the state input to the entropy coder represents variations in the probability of the decision input.
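The effect of modelling on coding efficiency may be sketched as follows. This is not the coder of any cited application. It merely sums ideal code lengths, assumed here to be -log2(p), using a simple count-based adaptive estimate kept separately for each state. The function `coded_bits`, the toy pixel row, and the choice of the left-hand neighbor as the state are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def coded_bits(decisions, states):
    """Total ideal code length for binary decisions, each coded under its state.

    A separate pair of counts (zeros seen, ones seen) is kept per state and
    used as the probability estimate for the next decision in that state.
    """
    counts = defaultdict(lambda: [1, 1])  # start each state at one 0 and one 1
    total = 0.0
    for decision, state in zip(decisions, states):
        c = counts[state]
        p = c[decision] / (c[0] + c[1])  # estimated probability of this decision
        total += -math.log2(p)
        c[decision] += 1                 # adapt the estimate for this state
    return total


# Toy binary row: a run of black (0), a run of white (1), a run of black.
pixels = [0] * 20 + [1] * 20 + [0] * 20
left_neighbor = [0] + pixels[:-1]        # state = value of the pixel to the left

no_context = coded_bits(pixels, [0] * len(pixels))  # one state for everything
with_context = coded_bits(pixels, left_neighbor)    # state from the neighborhood
print(round(no_context, 1), round(with_context, 1))
```

For this row the total with the neighbor-based state is smaller than the total with a single state, because within each neighborhood the decision is far more predictable.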
The correct assignment of codeword lengths is dictated by information theory concepts and is based on the estimated probability of occurrence of the events. The better the probability estimate, the more efficient the codeword length assignment, and the better the compression.
One example of an entropy coder is described in detail in co-pending patent applications:
"ARITHMETIC CODING DATA COMPRESSION/DE-COMPRESSION SELECTIVELY EMPLOYED, DIVERSE ARITHMETIC CODING ENCODERS AND DECODERS", invented by J. L. Mitchell and W. B. Pennebaker, U.S. Ser. No. 06/907,700; "PROBABILITY ESTIMATION BASED ON DECISION HISTORY", invented by J. L. Mitchell and W. B. Pennebaker, U.S. Ser. No. 06/907,695; and "ARITHMETIC CODING ENCODER AND DECODER SYSTEM" (Q-coder), invented by G. G. Langdon, Jr., J. L. Mitchell, W. B. Pennebaker and J. J. Rissanen, U.S. Ser. No. 06/907,714.
The inventions disclosed in the above-cited co-pending patent applications were made by the present inventors and co-workers thereof at the IBM Corporation; said applications are incorporated herein by reference for their teachings involving entropy coding, or more specifically arithmetic coding and adaptive probability estimation.
Other entropy coders include Huffman coders and Elias coders. Numerous publications describe such coding approaches.
Another technique used in data compression is referred to as "Differential Pulse Code Modulation" (DPCM). According to basic DPCM teachings, a predicted value based on one or more neighboring pixel values is determined for a "subject" pixel--i.e., a pixel whose informational content is currently being coded. The difference between the value for the subject pixel and the predicted value is then used as a basis for subsequent coding. Where there is high correlation between nearby pixels, using the difference value rather than the actual measured value can result in significant compression. Typically, a factor-of-two compression can be achieved by using DPCM techniques to obtain reasonably good quality pictures.
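A minimal sketch of the basic DPCM idea follows. It assumes the simplest possible predictor, namely the value of the preceding pixel on the same line, with a starting prediction of 128. It omits any quantization of the differences, so it is lossless. These choices, and the function names, are illustrative assumptions rather than the predictor of the present invention.

```python
def dpcm_encode(line, initial_prediction=128):
    """Replace each pixel value by its difference from the predicted value."""
    differences = []
    prediction = initial_prediction
    for value in line:
        differences.append(value - prediction)
        prediction = value  # the next pixel is predicted from this one
    return differences


def dpcm_decode(differences, initial_prediction=128):
    """Rebuild the pixel values by adding each difference to the prediction."""
    line = []
    prediction = initial_prediction
    for difference in differences:
        value = prediction + difference
        line.append(value)
        prediction = value
    return line


line = [100, 102, 103, 103, 101, 99]
differences = dpcm_encode(line)
print(differences)               # [-28, 2, 1, 0, -2, -2]
print(dpcm_decode(differences))  # [100, 102, 103, 103, 101, 99]
```

After the first pixel, the differences cluster near zero even though the graylevels themselves are near 100. Such a concentration of small values is what makes the difference signal cheaper to code than the measured values.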