1. Technical Field
The invention relates to the decompression for playback of video segments on a data processing system. More particularly, the invention relates to a system and a method of decompressing video data while enabling scaling of frame resolution and color depth for playback of a video segment by the playback platform.
2. Description of the Related Art
A video signal comprises a sequence of frames, which when displayed at a given minimum frame rate (e.g., 15 to 30 frames per second in a personal computer), simulate the appearance of motion to a human observer. In a personal computer system, each frame of the video image comprises a matrix of picture elements or "pixels." A common image matrix has 320 columns by 240 rows of pixels. A pixel is the minimum unit of the picture which may be assigned a luminance intensity, and in color video, a color. Depending upon the data format used, as many as three bytes of data can be used to define visual information for a pixel. A pixel by pixel color description of all pixels for an entire frame can require over two hundred thousand bytes of data. Spatial resolution of an image is increased by increases in the number of pixels.
To display a video segment, if such full frames were replaced at a frame rate of 30 frames per second, a computer could be required to recover from storage and write to video memory nearly 7 million bytes of data each second, and as many as 27 million bytes each second for a frame of 640 columns by 480 rows. Few contemporary mass data storage devices have both the bandwidth required to pass such quantities of data and the storage capacity to hold more than a few minutes' worth of directly stored digital video information. As used here, bandwidth means the volume of data per unit time which can be recovered from an auxiliary storage device. Data compression is used to accommodate auxiliary storage devices in the storage and recovery of video segments for playback in real time and to reduce traffic on the system bus.
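The raw data rate follows directly from frame size, bytes per pixel, and frame rate. The following sketch assumes three bytes per pixel and 30 frames per second; the function name is illustrative and not part of any described system. Under those assumptions a 320 by 240 frame yields about 6.9 million bytes per second, and a figure of 27 million bytes per second corresponds to a 640 by 480 frame.

```python
def raw_video_bytes_per_second(columns, rows, bytes_per_pixel, frames_per_second):
    """Uncompressed data rate when every frame is stored and written in full."""
    bytes_per_frame = columns * rows * bytes_per_pixel
    return bytes_per_frame * frames_per_second


# Assumed parameters: three bytes per pixel, 30 frames per second.
frame_bytes_320 = 320 * 240 * 3
rate_320 = raw_video_bytes_per_second(320, 240, 3, 30)
rate_640 = raw_video_bytes_per_second(640, 480, 3, 30)

print(frame_bytes_320)  # bytes in one 320 by 240 frame
print(rate_320)         # bytes per second at 320 by 240
print(rate_640)         # bytes per second at 640 by 480
```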
Data compression allows an image or video segment to be transmitted and stored in substantially fewer bytes of data than required for full frame reproduction. Data compression can be based on eliminating redundant information from frame to frame in a digitized video segment (temporal compression), or by eliminating redundant information from pixel to pixel in individual frames (spatial compression). In addition, compression may exploit superior human perception of luminance intensity detail over color detail by averaging color over a block of pixels while preserving luminance detail.
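The last point can be illustrated with a short sketch. It assumes each pixel is held as a (luminance, color U, color V) triple and that color is averaged over a block of four pixels; the representation, the block size, and the function name are assumptions made for illustration only.

```python
def average_chroma(block):
    """Keep every luminance value but store one averaged color pair per block.

    block: list of (luma, chroma_u, chroma_v) tuples for the pixels of one block.
    """
    lumas = [luma for luma, _, _ in block]
    mean_u = sum(u for _, u, _ in block) // len(block)
    mean_v = sum(v for _, _, v in block) // len(block)
    return lumas, (mean_u, mean_v)


# Hypothetical 2 by 2 block: luminance varies sharply, color varies little.
block = [(90, 100, 120), (95, 104, 124), (200, 96, 116), (210, 100, 120)]
lumas, shared_color = average_chroma(block)

values_before = len(block) * 3                 # three values per pixel
values_after = len(lumas) + len(shared_color)  # four lumas plus one color pair
```

In this example the block shrinks from twelve stored values to six while every luminance value is preserved.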
Frame differencing compression methods exploit the temporal redundancy that exists between digital video frames from the same scene recorded moments apart in time. This reduces the data needed to encode each frame. Two successive frames from a sequence of digital motion video frames are compared region by region. The comparison process determines whether two corresponding regions are the same or different. The size and location of each region, and the nature of the comparison, are outside the scope of this invention.
Before temporal redundancy can exist, one frame necessarily represents a point in time after another frame. If the field of view of the frames is unchanged, then the regions from a frame at period N do not need to be encoded and stored if the regions in a frame at period N-1 are already known. When change has occurred, the changed regions of the later frame must be encoded and stored. When each region of the two frames has been compared, and the changed regions of the later period have been encoded and stored, the process moves to the next pair of frames. During playback, the decompression process adds the stored information for each period to the current state of the display memory using a process that is the logical reverse of the encoding process. This is called conditional replenishment.
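Conditional replenishment can be sketched as follows. The sketch assumes fixed square regions of 2 by 2 pixels and an exact equality comparison; both are illustrative choices, and the function names are hypothetical.

```python
def encode_changes(previous, current, block=2):
    """Return (top, left, pixels) for each region that differs between frames."""
    changes = []
    rows, cols = len(current), len(current[0])
    for top in range(0, rows, block):
        for left in range(0, cols, block):
            old = [row[left:left + block] for row in previous[top:top + block]]
            new = [row[left:left + block] for row in current[top:top + block]]
            if old != new:
                changes.append((top, left, new))
    return changes


def apply_changes(display, changes):
    """Overwrite only the changed regions of display memory, in place."""
    for top, left, pixels in changes:
        for dy, row in enumerate(pixels):
            display[top + dy][left:left + len(row)] = row
    return display


# Frame at period N-1, and frame at period N with a single changed pixel.
frame_a = [[0] * 4 for _ in range(4)]
frame_b = [row[:] for row in frame_a]
frame_b[3][3] = 9

changes = encode_changes(frame_a, frame_b)

# Playback: display memory holds frame N-1 and receives only the changes.
display = [row[:] for row in frame_a]
apply_changes(display, changes)
```

Only one of the four regions is stored for the second frame, and applying it to display memory reproduces that frame exactly.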
When there is very little temporal redundancy in a digital motion video, the method fails. However, in a motion video sequence of a flower growing, shot at 30 frames per second, frames contain a great deal of temporal redundancy and compress well using frame differencing. By contrast, a sequence recorded through a moving camera will contain little redundancy and will not compress well, assuming motion compensation algorithms are not employed.
While compression makes it possible to store and reproduce video segments on personal computers, the quantities of data involved and the computational load imposed on the system central processor still tax the capacity of many contemporary personal computers, particularly low end machines based on the Intel 8086/88 family of microprocessors. Large capacity machines designed for multitasking of applications and having advanced video adaptors have an easier time handling video segments, unless two or more video segments are required to be simultaneously reproduced.
A way of providing portability of decompressed data between machines of different capacity is to introduce resolution and color depth scalability. Resolution scalability allows the playback platform to change the number of pixels in an output image. A common display resolution is a matrix of 320 × 240 pixels. Other resolutions include 640 × 480 pixels and 160 × 120 pixels. Color depth scaling is used to reduce (or increase) the number of shades of color displayed. Use of such scaling would be enhanced were portable processes available to recognize and decode compressed video segments set up to support such scaling by the playback platform itself.
Three methods have been employed to support resolution and color depth scaling. All three techniques are targeted for use in transmission channel applications, where compressed video information can be transmitted and reconstructed progressively.
One technique involves use of image hierarchies. Each frame of a video segment is compressed at a plurality of spatial resolutions. Each level represents a different level of compression obtained by subsampling of 2 by 2 pixel regions of the next higher level of resolution. The resolution of the base level is the same as the raw data frame. Frame resolution scaling is obtained by selecting a particular level of resolution at the decompression platform. Time differential compression between frames is still done, but by selecting low resolution levels, low powered computers can decompress the stream.
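An image hierarchy of this kind can be sketched as follows. The sketch assumes that each 2 by 2 region is reduced to its integer average; simply taking one pixel from each region would serve equally well for illustration. The function names are hypothetical, and the compression of each level is omitted.

```python
def halve(image):
    """Replace each 2 by 2 pixel region with its integer average."""
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) // 4
            for x in range(0, len(image[0]) - 1, 2)
        ]
        for y in range(0, len(image) - 1, 2)
    ]


def build_hierarchy(image, levels):
    """Level 0 is the raw frame; each later level has half the resolution."""
    hierarchy = [image]
    for _ in range(levels - 1):
        hierarchy.append(halve(hierarchy[-1]))
    return hierarchy


# A 4 by 4 base frame with pixel values 0 through 15.
base = [[y * 4 + x for x in range(4)] for y in range(4)]
hierarchy = build_hierarchy(base, levels=3)

# A low powered playback platform would select a coarser level, e.g. level 1.
selected = hierarchy[1]
```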
A second technique known in the art is called bit-plane scalability. Here each frame in the video segment is compressed by encoding the bit-planes of the color information independently. Each bit-plane has the same spatial resolution as the original image frame. Each compressed video frame is organized from the most-significant bit (MSB) planes to the least-significant bit (LSB) planes. Color scalability is obtained by only decompressing and displaying the compressed higher order bit-planes of each frame.
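Bit-plane scalability can be sketched as follows for 8-bit color values. The compression of each plane is omitted, and the function names and sample values are assumptions made for illustration.

```python
def to_bit_planes(pixels, depth=8):
    """Split pixel values into bit-planes, ordered from MSB to LSB."""
    return [[(p >> bit) & 1 for p in pixels] for bit in range(depth - 1, -1, -1)]


def from_bit_planes(planes, depth=8):
    """Rebuild pixel values from the leading (most significant) planes only.

    Planes that are not supplied are treated as zero.
    """
    values = [0] * len(planes[0])
    for index, plane in enumerate(planes):
        shift = depth - 1 - index
        values = [v | (b << shift) for v, b in zip(values, plane)]
    return values


pixels = [200, 37, 129, 255]
planes = to_bit_planes(pixels)

full = from_bit_planes(planes)        # all eight planes decoded
coarse = from_bit_planes(planes[:3])  # only the three highest order planes
```

Decoding all eight planes restores the original values, while decoding three planes yields [192, 32, 128, 224], a coarser color depth at the same spatial resolution.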
A third technique known in the art is called subband coding. In subband coding, an image is decomposed into different frequency bands, the spatial resolution of each resulting band is downsampled, and each frequency subband is compressed independently with a suitable compression technique (e.g., vector quantization). Scalability is obtained by decompressing, upsampling and displaying only the compressed low-pass frequency bands in lower end machines, and by decoding progressively higher frequency bands in progressively more powerful machines.
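A one-level, one-dimensional subband split can be sketched as follows. The averaging and differencing filter pair (a Haar-style split) is an assumption chosen for brevity and is not named in the description above; the per-band compression step is omitted, and the function names are hypothetical.

```python
def analyze(signal):
    """Split a row of pixels into downsampled low-pass and high-pass bands."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    low = [(a + b) / 2 for a, b in pairs]   # local averages
    high = [(a - b) / 2 for a, b in pairs]  # local differences
    return low, high


def synthesize(low, high=None):
    """Upsample and recombine the bands; a missing high band is taken as zero."""
    if high is None:
        high = [0.0] * len(low)
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


row = [10, 12, 20, 24, 7, 7, 0, 8]
low, high = analyze(row)

low_end = synthesize(low)         # lower end machine: low-pass band only
high_end = synthesize(low, high)  # more powerful machine: both bands
```

The low-pass band alone yields a smoothed row at half the information, while decoding both bands reproduces the original row.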