The present invention relates to digital compression/decompression algorithms for digital data sequences, and more particularly to an adaptive spatio-temporal compression/decompression algorithm for video image signals using a spatio-temporal quadtree based encoder and a corresponding decoder.
Video images are often made up of large uniform areas with a few detailed areas in which the image is uniform only locally. To represent such an image digitally, the image is sampled into a two-dimensional array by taking an intensity or color sample at regularly spaced intervals; the samples are known as picture elements, or pixels. The value for each pixel is quantized to eight bits, or for color images three values are each quantized to eight bits, one each for red, blue and green, for a total of twenty-four (24) bits. The sampled video color image may also be represented by a luminance component and two chrominance components, each having one-half the bandwidth of the luminance component. In this case the size of each sampled chrominance field is one-half the size of the sampled luminance field because of the reduced bandwidth, and only sixteen bits per pixel are required, eight for the luminance component and eight for the associated one of the chrominance components. The data rate required to send such image data in realtime over transmission lines is on the order of up to 160 Mbits/sec for NTSC video and up to 2 Gbits/sec for HDTV video. For facsimile transmission or teleconferencing where ordinary telephone transmission lines are used, the available data rate is approximately 2-10 kbits/sec. Even where the video image is to be stored, current magnetic disc technology is limited to data rates of approximately 8 Mbits/sec. Therefore some mechanism is required for reducing the amount of data without reducing picture content, both to achieve realtime video rates for storage and playback of video sequences and for facsimile transmission or teleconferencing.
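The bits-per-pixel figures and the order of magnitude of the data rates above can be checked with a short calculation. The following is a minimal sketch only: the frame dimensions and frame rates (640 x 480 at 30 frames/sec for an NTSC-like signal, 1920 x 1080 at 60 frames/sec for an HDTV-like signal) and the function name `raw_bit_rate` are illustrative assumptions that do not appear in the text above.

```python
def raw_bit_rate(width, height, frames_per_sec, bits_per_pixel):
    """Uncompressed data rate in bits per second."""
    return width * height * frames_per_sec * bits_per_pixel


# Bits per pixel as described in the text.
RGB_BITS = 3 * 8   # eight bits each for red, blue and green
YC_BITS = 8 + 8    # eight for luminance, eight for the associated chrominance

# Assumed (hypothetical) frame formats, chosen only for illustration.
ntsc_like = raw_bit_rate(640, 480, 30, YC_BITS)
hdtv_like = raw_bit_rate(1920, 1080, 60, YC_BITS)

# Reduction factor needed to fit the assumed NTSC-like signal into the
# approximately 8 Mbits/sec disc rate quoted in the text.
DISC_BITS_PER_SEC = 8_000_000
disc_reduction = ntsc_like / DISC_BITS_PER_SEC

print(f"NTSC-like: {ntsc_like / 1e6:.1f} Mbits/sec")
print(f"HDTV-like: {hdtv_like / 1e9:.2f} Gbits/sec")
print(f"Reduction needed for disc storage: about {disc_reduction:.1f}x")
```

Under these assumed formats the raw rates come out near 147 Mbits/sec and 2 Gbits/sec, which is consistent with the orders of magnitude quoted above; other frame formats would give different figures.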
Since a video image is largely static from frame to frame, a technique is required for representing the large uniform areas in a compressed form, with only the differences from frame to frame being transmitted after the initial frame has been developed. Such a technique is described by Strobach et al. of Siemens A.G., München, Germany in an article entitled "Space-variant Regular Decomposition Quadtrees in Adaptive Interframe Coding", Proceedings of the International Conference on Acoustics, Speech and Signal Processing, ICASSP-88, Vol. II, pp. 1096-1099. Strobach et al. describe a scene adaptive coder that is based on a quadtree mean decomposition of the motion compensated frame-to-frame difference signal, followed by a scalar quantization and variable wordlength encoding of the local sample. The displacement vector is chosen such that the resulting quadtree decomposition requires a minimum number of bits for encoding. This is referred to as quadtree structured difference pulse code modulation (QSDPCM).
The quadtree is a hierarchical data structure for describing two-dimensional regions, and is often used to store binary pictures. Strobach et al. use the bottom-up realization, in which four adjacent subblocks are tested to see if they are homogeneous with respect to the property of interest. If the test is positive, the four subblocks are merged into a new subblock which has four times the size of its immediate predecessor. The procedure is repeated recursively until the largest possible block size is reached. The merge test is conducted by comparing the sample mean of each adjacent subblock with the sample mean of the merged (larger) subblock, and performing the merge if the difference between the sample mean for the merged subblock and that of each adjacent subblock is less than a predetermined quality threshold. This technique provides for direct motion compensated prediction (MCP) error coding with smaller computational complexity than conventional motion compensated transform coders.
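The bottom-up merge described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the Strobach et al. implementation: the function name `merge_quadtree`, the `(row, col, size, mean)` leaf layout, the restriction to square arrays whose side is a power of two, and the strict "less than" comparison against the threshold are all assumptions made for the example. In the coder described above the input would be the motion compensated frame-to-frame difference signal; here any two-dimensional array of sample values serves.

```python
def merge_quadtree(image, threshold):
    """Bottom-up quadtree mean merge.

    image: square list of lists whose side is a power of two (assumption).
    Returns a list of leaves (row, col, size, mean).
    """
    def build(row, col, size):
        if size == 1:
            # A single sample is the smallest possible subblock.
            return [(row, col, 1, float(image[row][col]))]
        half = size // 2
        # Process the four adjacent subblocks first (bottom-up order).
        children = [build(row + dr, col + dc, half)
                    for dr in (0, half) for dc in (0, half)]
        # A merge is only possible if each subblock is itself one block.
        if all(len(child) == 1 for child in children):
            means = [child[0][3] for child in children]
            # Equal-sized subblocks: the merged mean is the mean of means.
            merged_mean = sum(means) / 4.0
            if all(abs(m - merged_mean) < threshold for m in means):
                return [(row, col, size, merged_mean)]
        # Merge test failed: keep the subblocks' leaves as they are.
        return [leaf for child in children for leaf in child]

    return build(0, 0, len(image))


# Three uniform quadrants and one detailed quadrant (top right).
sample = [
    [10, 10, 50, 90],
    [10, 10, 50, 90],
    [12, 12, 11, 11],
    [12, 12, 11, 11],
]
leaves = merge_quadtree(sample, threshold=4)
for leaf in leaves:
    print(leaf)
```

In this example the three uniform quadrants each merge into a single 2 x 2 block, while the detailed quadrant stays as four single-sample leaves, so the large uniform areas are represented by far fewer values than the detailed one.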
The Strobach et al. interframe approach, however, outputs only averaged frame difference sample values, i.e., it provides only spatial difference values. The Strobach et al. decoder therefore requires an additional adder to combine the decoded difference values with the prior frame.
What is desired is a compression/decompression scheme for video image signals that is based upon an adaptive spatio-temporal algorithm to produce a simple decoder.