1. Field of the Invention
This invention relates to low-bit-rate digital transmission of moving images, such as in videophone communications over conventional phone lines.
2. Description of Related Art
PSTN phone lines are commonly used for digital communications between computers or other devices using modems and standard communications protocols. Such communication protocols have bit rates limited by the quality of transmission over the phone lines. For example, the V.FAST standard has a bit rate between 16.8 and 28.8 kbit/s depending on the quality of the connection. These bit rates are low when compared to the requirements for transmitting high quality digital moving images.
Conventional digital moving images are a succession of frames (or still images) which are commonly represented by two-dimensional arrays of pixel values. The pixel values indicate colors or intensities of pixels in the frames. Transmission of pixel values by videophones is impractical because of the large amount of data required to transmit every pixel value in every frame of a moving picture. Accordingly, typical videophone systems contain an encoding circuit which converts a succession of frames (two-dimensional arrays) into a compressed representation of the moving image.
FIG. 1 shows a video encoding portion of a videophone which generates a compressed representation of a moving image. An input video signal VIDEO_IN from a video camera or other video source indicates pixel values in frames of the moving image. An information extracting circuit 110 converts the representation of the moving image to a format where redundant information is more easily removed. For example, difference-frame coding takes the difference between pixel values in a current frame and the corresponding pixel values in a preceding frame and extracts non-zero values which indicate changes between the frames. Redundant data that are repeated in successive frames appear as zero values in the difference, and the large number of zero values can be effectively compressed or removed.
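By way of illustration only, the difference-frame coding described above can be sketched in a few lines of Python. The function names, the tiny 3x4 frames, and the sign convention (current frame minus preceding frame) are assumptions made for this sketch and are not taken from the circuit of FIG. 1.

```python
def difference_frame(previous, current):
    """Return the element-wise difference current - previous.

    Both frames are lists of rows of integer pixel values with
    identical dimensions (an assumption of this sketch).
    """
    return [
        [cur - prev for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


def count_zeros(frame):
    """Count the zero entries, i.e. pixels unchanged between frames."""
    return sum(value == 0 for row in frame for value in row)


# Two hypothetical frames that differ in only two pixels.
previous = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
current = [
    [10, 10, 10, 10],
    [10, 50, 60, 10],
    [10, 10, 10, 12],
]

diff = difference_frame(previous, current)
zeros = count_zeros(diff)
print(diff)
print(zeros, "of 12 difference values are zero")
```

In this example only the two changed pixels produce non-zero differences, which is the redundancy that later coding stages can exploit.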
Motion estimation reduces the number of non-zero values in the difference between frames by subtracting a block having a position indicated by a motion vector in a preceding frame from a block in a current frame. The motion vector is selected to reduce the difference. Additionally, blocks may be transformed using transformations such as a discrete cosine transformation (DCT) or a Walsh transformation. Such transformations provide a block of coefficients having few non-zero values and long runs of zero values. Typically, information extracting circuit 110 quantizes the transformed blocks and converts each quantized block into a series of symbols. A symbol may indicate, for example, a non-zero value in the block and the number of consecutive zero values in a run preceding the non-zero value.
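The conversion of a quantized block into symbols, each indicating a non-zero value and the length of the run of zeros preceding it, may be sketched as follows. This is a minimal illustration, assuming the coefficients have already been arranged in a one-dimensional scan order; the function name, the (run, value) tuple layout, and the separate count of trailing zeros are hypothetical choices for the sketch.

```python
def run_level_symbols(coefficients):
    """Convert a scanned list of quantized coefficients into symbols.

    Each symbol is a (run, value) pair: 'run' is the number of
    consecutive zeros immediately preceding the non-zero 'value'.
    The number of zeros left after the last non-zero value is
    returned separately (a convention assumed for this sketch).
    """
    symbols = []
    run = 0
    for value in coefficients:
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    return symbols, run


# Hypothetical quantized coefficients with long runs of zeros.
block = [12, 0, 0, -3, 0, 0, 0, 1, 0, 0, 0, 0]
symbols, trailing_zeros = run_level_symbols(block)
print(symbols)
print(trailing_zeros)
```

Here twelve coefficients reduce to three symbols plus a trailing run, which shows why long runs of zero values are advantageous for the subsequent coding step.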
A coding block 120 codes symbols for transmission. One well known variable length coding technique, commonly referred to as Huffman coding, matches each symbol to a Huffman code in a table. The table is constructed so that the most common symbols have the Huffman codes with the fewest bits so that on average coding block 120 reduces the number of bits required to express a string of symbols.
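For illustration, a Huffman table of the kind described above can be constructed with the classic greedy merge of the two least frequent entries. The sketch below is not the table of any particular standard; the symbol stream, the function name, and the tie-breaking index are assumptions made for the example.

```python
import heapq
from collections import Counter


def build_huffman_table(symbols):
    """Build a {symbol: bit string} table from a sequence of symbols.

    More frequent symbols receive codes that are no longer than the
    codes of less frequent symbols.
    """
    counts = Counter(symbols)
    if len(counts) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(counts)): "0"}

    # Heap entries are (count, tie_breaker, partial_table); the unique
    # tie_breaker guarantees that the dictionaries are never compared.
    heap = [
        (count, index, {symbol: ""})
        for index, (symbol, count) in enumerate(counts.items())
    ]
    heapq.heapify(heap)
    next_index = len(heap)

    while len(heap) > 1:
        count_a, _, table_a = heapq.heappop(heap)
        count_b, _, table_b = heapq.heappop(heap)
        # Prefix one subtree with '0' and the other with '1'.
        merged = {s: "0" + code for s, code in table_a.items()}
        merged.update({s: "1" + code for s, code in table_b.items()})
        heapq.heappush(heap, (count_a + count_b, next_index, merged))
        next_index += 1

    return heap[0][2]


# Hypothetical symbol stream: 'a' is most common, 'd' is rarest.
stream = list("aaaaaaaabbbbccd")
table = build_huffman_table(stream)
encoded = "".join(table[s] for s in stream)
print(table)
print(len(encoded), "bits, versus", 2 * len(stream), "bits at 2 bits per symbol")
```

With this stream the most common symbol receives a one-bit code and the two rarest symbols receive three-bit codes, so the average number of bits per symbol falls below the two bits that a fixed-length code would need.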
A problem in prior art Huffman coding is that the bitstream representing a moving image commonly contains symbols indicating motion vectors, symbols indicating image parameters, and symbols indicating the start of a frame or block. Coding block 120 typically does not efficiently code these symbols because the values of the symbols have statistical distributions that differ from the statistical distributions of symbols representing difference blocks. Further, although Huffman coding reduces the number of bits in a description of a moving image, more efficient symbol coding would allow further improvements in image quality at low bit rates.
Arithmetic coding is known in fields other than moving image coding. Arithmetic coding, instead of encoding individual symbols as does Huffman coding, encodes strings of symbols. FIG. 2 illustrates an example of arithmetic coding. In the example, the set of possible values for each symbol is {a, e, i, o, u, !}, and the string encoded is "eaii!". Symbol "!" indicates the end of a string. An arithmetic coding model for the symbol divides an interval into segments which have lengths proportional to the probability of the symbol occurring in a string. In FIG. 2, the probabilities for symbols a, e, i, o, u, and ! are 20%, 30%, 10%, 20%, 10%, and 10%; and the model divides the interval [0,1) into segments [0,0.2), [0.2,0.5), [0.5,0.6), [0.6,0.8), [0.8,0.9), and [0.9,1.0) which correspond to symbols a, e, i, o, u, and ! respectively.
To determine a code value within interval [0,1), the interval [0,1) is divided according to the model, and a segment [0.2,0.5) corresponding to symbol e, the first symbol in string "eaii!", is selected. The selected segment [0.2,0.5) is divided according to the model into segments [0.2,0.26), [0.26,0.35), [0.35,0.38), [0.38,0.44), [0.44,0.47), and [0.47,0.5) corresponding to symbols a, e, i, o, u, and ! respectively, and the segment [0.2,0.26) corresponding to symbol a, the second symbol of "eaii!", is selected. Partitioning and selecting of segments is repeated for the remaining symbols i, i, and ! and results in a final selection of the interval [0.23354,0.2336) which corresponds to string "eaii!". Arithmetic coding selects from the selected final interval a code value that requires the minimum number of bits to express. For the example string of FIG. 2, code value 0.233581542 is 0.00111011110011 binary.
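The interval narrowing of FIG. 2 can be reproduced with the following minimal sketch, which uses exact rational arithmetic so that the segment boundaries are not disturbed by floating-point rounding. The function names and the brute-force search for the shortest binary fraction are assumptions of the sketch; practical arithmetic coders use incremental fixed-precision techniques instead.

```python
import math
from fractions import Fraction

# Model of FIG. 2: symbols in order with their probabilities.
MODEL = [
    ("a", Fraction(2, 10)), ("e", Fraction(3, 10)), ("i", Fraction(1, 10)),
    ("o", Fraction(2, 10)), ("u", Fraction(1, 10)), ("!", Fraction(1, 10)),
]


def segments(low, high, model):
    """Divide [low, high) into one segment per symbol of the model."""
    width = high - low
    result = {}
    cumulative = Fraction(0)
    for symbol, probability in model:
        start = low + width * cumulative
        cumulative += probability
        result[symbol] = (start, low + width * cumulative)
    return result


def encode_interval(string, model):
    """Return the final interval [low, high) selected for the string."""
    low, high = Fraction(0), Fraction(1)
    for symbol in string:
        low, high = segments(low, high, model)[symbol]
    return low, high


def shortest_code(low, high):
    """Return the fewest bits after the binary point whose value,
    read as a binary fraction, lies in [low, high)."""
    bits = 1
    while True:
        scale = 2 ** bits
        numerator = math.ceil(low * scale)  # smallest candidate >= low
        if Fraction(numerator, scale) < high:
            return format(numerator, "0{}b".format(bits))
        bits += 1


low, high = encode_interval("eaii!", MODEL)
code_bits = shortest_code(low, high)
code_value = Fraction(int(code_bits, 2), 2 ** len(code_bits))
print(float(low), float(high))
print(code_bits, float(code_value))
```

For the string "eaii!" the sketch arrives at the interval [0.23354,0.2336) and at a code value of 0.23358154296875, which agrees with the code value 0.233581542 given for FIG. 2.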
The prior art has not provided an efficient way to employ arithmetic coding in moving image encoding where strings containing differing kinds of symbols have different ranges of possible values and different statistics.