1. Technical Field
The invention relates to the compression for storage and decompression for playback of video segments in a data processing system. More particularly, the invention relates to a system and a method of compressing video information in such a way that the playback platform can select a frame rate for playback in real time, thus allowing portability of video segments between machines of differing processing capacity. Still more particularly, the invention relates to a system and method for providing random access to the compressed video signal while allowing frame rate scalability.
2. Description of the Related Art
A video signal comprises a sequence of frames, which when displayed at a given minimum frame rate (e.g., 15 to 30 frames-per-second in a personal computer), simulate the appearance of motion to a human observer. In a personal computer system, each frame of the video image comprises a matrix of picture elements or "pixels." A typical matrix may have 320 columns by 240 rows of pixels. A pixel is the minimum unit of the picture which may be assigned a luminance intensity, and in color video, a color. Depending upon the data format used, as many as three bytes of data can be used to define visual information for a pixel. A complete color description of all pixels for an entire frame can require over two hundred thousand bytes of data.
For a video segment, if full frames were replaced at a frame rate of 30 frames per second, a computer could be required to recover from storage and write to video memory as much as 27 million bytes of data each second. Few contemporary mass data storage devices have both the bandwidth required to pass such quantities of data and the storage capacity to hold more than a few minutes' worth of directly stored digital video information. As used here, bandwidth means the volume of data per unit time which can be recovered from an auxiliary storage device. Data compression is used to accommodate auxiliary storage devices in the storage and recovery of video segments for playback in real time and to reduce traffic on the system bus.
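The arithmetic behind these figures can be checked with a short sketch. The function names are hypothetical, and the 640 by 480 frame size is an assumption: the text does not state which frame size underlies the 27 million byte figure, but that size at three bytes per pixel is consistent with it, whereas a 320 by 240 frame yields roughly 6.9 million bytes per second.

```python
def frame_bytes(columns, rows, bytes_per_pixel=3):
    """Uncompressed size of one frame in bytes."""
    return columns * rows * bytes_per_pixel


def data_rate(columns, rows, fps, bytes_per_pixel=3):
    """Bytes per second needed to replace full frames at the given rate."""
    return frame_bytes(columns, rows, bytes_per_pixel) * fps


# 320 x 240 frame at three bytes per pixel (sizes taken from the text).
small_frame = frame_bytes(320, 240)       # 230,400 bytes
small_rate = data_rate(320, 240, 30)      # 6,912,000 bytes per second

# 640 x 480 frame (assumed size, not stated in the text).
large_rate = data_rate(640, 480, 30)      # 27,648,000 bytes per second
```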
Data compression allows an image or video segment to be transmitted and stored in substantially fewer bytes of data than required for full frame reproduction. Data compression can be based on eliminating redundant information from frame to frame in a digitized video segment (temporal compression), or on eliminating redundant information from pixel to pixel in individual frames (spatial compression). These techniques can be implemented in a lossless or lossy manner. In addition, compression may exploit superior human perception of luminance intensity detail over color detail by averaging color over a block of pixels while preserving luminance detail. This is an example of a lossy compression technique.
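The color averaging described above might be sketched as follows. This is an illustration only, assuming color is held in a separate plane from luminance and averaged over 2 by 2 blocks; the function name, the block size and the sample values are hypothetical and are not taken from any particular compression format.

```python
def subsample_chroma(chroma, block=2):
    """Keep one averaged color value per block x block tile.

    The luminance plane is left untouched elsewhere; only the color
    plane is reduced, which is why the technique is lossy.
    """
    rows, cols = len(chroma), len(chroma[0])
    reduced = []
    for r0 in range(0, rows, block):
        reduced_row = []
        for c0 in range(0, cols, block):
            tile = [chroma[r][c]
                    for r in range(r0, min(r0 + block, rows))
                    for c in range(c0, min(c0 + block, cols))]
            reduced_row.append(sum(tile) // len(tile))
        reduced.append(reduced_row)
    return reduced


# A 2 x 4 color plane becomes a 1 x 2 plane: a quarter of the values.
chroma = [
    [100, 102, 50, 54],
    [98, 100, 52, 52],
]
reduced = subsample_chroma(chroma)
```

The original color values cannot be recovered from the reduced plane, while every luminance value would still be stored at full resolution.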
Frame differencing compression methods exploit the temporal redundancy that exists between digital video frames from the same scene recorded moments apart in time. This reduces the data needed to encode each frame. Two successive frames from a sequence of digital motion video frames are compared region by region. The comparison process determines whether two corresponding regions are the same or different. The size and location of each region, the exact nature of the comparison and the definition of "same" and "different" in terms of the threshold supplied are outside the scope of this invention.
Necessarily, one frame represents a point in time after another frame. If two regions being compared are the same, then the pixels in the region from frame N do not need to be encoded and stored if the pixels in frame N-1 are already known. When two regions are different, the pixels in the later frame must be encoded and stored. When each region of the two frames has been compared and, where different, encoded and stored, the process moves to the next pair of frames. During playback, the decompression process adds the stored information for each period to the current state of the display memory using a process that is the logical reverse of the encoding process. This is called conditional replenishment.
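A minimal sketch of conditional replenishment follows. Because the region size, the comparison and the threshold are left open above, the choices here are assumptions made only for illustration: fixed square regions, a per-pixel absolute difference test, and frames held as lists of rows of intensity values. The names encode_delta and apply_delta are hypothetical.

```python
def encode_delta(prev, curr, block=2, threshold=0):
    """Return the regions of curr that differ from prev.

    Each entry is (row, column, pixels) for one changed region.
    """
    rows, cols = len(curr), len(curr[0])
    changed = []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            r1, c1 = min(r0 + block, rows), min(c0 + block, cols)
            differs = any(
                abs(curr[r][c] - prev[r][c]) > threshold
                for r in range(r0, r1)
                for c in range(c0, c1)
            )
            if differs:
                pixels = [curr[r][c0:c1] for r in range(r0, r1)]
                changed.append((r0, c0, pixels))
    return changed


def apply_delta(display, changed):
    """Write the stored regions into display memory in place."""
    for r0, c0, pixels in changed:
        for dr, row in enumerate(pixels):
            display[r0 + dr][c0:c0 + len(row)] = row


frame0 = [[10, 10, 10, 10],
          [10, 10, 10, 10]]
frame1 = [[10, 10, 10, 99],
          [10, 10, 10, 10]]

delta = encode_delta(frame0, frame1)      # only the right-hand region
display = [row[:] for row in frame0]      # display currently shows frame0
apply_delta(display, delta)               # display now matches frame1
```

Because each stored delta is applied to the current state of the display memory, a decoder of this kind has to process every frame in order, which is the sequential dependency that limits portability between machines of differing capacity.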
When there is very little temporal redundancy in a digital motion video, the method fails. A motion video sequence of a flower growing, shot at 30 frames per second, will contain a great deal of redundancy and will compress well using conditional replenishment. By contrast, a sequence recorded through a moving camera will contain little redundancy and will not compress well, assuming prior art motion compensation algorithms are not employed.
While compression makes it possible to store and reproduce video segments on personal computers, the quantities of data involved and the computational load imposed on the system central processor still tax the capacity of many contemporary personal computers, particularly low-end machines based on the Intel 8086/88 family of microprocessors. Large capacity machines designed for multitasking of applications and having advanced video adaptors have an easier time handling video segments until several tasks are applied to them, including demands that two or more video segments be simultaneously reproduced. It is critical to the presentation of video segments that such presentation be in real time. In the past, frame rates for a video segment have been selected with a particular playback platform in mind. To reproduce video on an 8086/88-based machine running at 10 MHz, a frame rate of 5 or 6 fps might be the fastest rate supported. Higher capacity machines would be given versions of the video segment for reproduction at 30 fps. The two compressed sequences have not been portable between the machines because of the requirement that each frame be decompressed in sequence. Playback of the 30 fps sequence on an 8086/88 machine results in a slow motion display. While a 5 fps sequence can be reproduced on a higher capacity machine in real time, the reproduced images show no improvement in smoothness over the lower capacity machines.
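The slow motion effect can be quantified with a short calculation. The sketch assumes a machine that must decompress every frame in sequence and can sustain only a fixed decode rate; the function names and the ten second segment are hypothetical.

```python
def playback_duration(frame_count, decode_fps):
    """Seconds needed to show every frame at the sustained decode rate."""
    return frame_count / decode_fps


def slowdown(recorded_fps, decode_fps):
    """Factor by which playback is stretched; 1.0 means real time."""
    return max(1.0, recorded_fps / decode_fps)


# A 10 second segment recorded at 30 fps holds 300 frames.
frames = 10 * 30
duration = playback_duration(frames, 5)   # 60.0 seconds on a 5 fps machine
factor = slowdown(30, 5)                  # 6.0 times slower than real time
fast_machine = slowdown(5, 30)            # 1.0: real time, but no smoother
```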