Motion video signals typically contain a significant amount of intraframe or spatial redundancy as well as interframe or temporal redundancy. Video compression techniques take advantage of this spatial and temporal redundancy to significantly reduce the amount of data bandwidth required to process, transmit and store video signals. MPEG-2 is a well-known video compression standard developed by the International Organization for Standardization (ISO) Moving Picture Experts Group (MPEG) and documented in "Information Technology Generic Coding of Moving Pictures and Associated Audio Information: Video," ISO/IEC DIS 13818-2 (Video), which is incorporated herein by reference. MPEG-2 video compression involves both spatial and temporal compression of video frames or fields.
An exemplary MPEG-2 video encoder receives a sequence of video frames or fields from a video source such as a video camera or a telecine machine. The sequence of frames may be progressive or interlaced. A progressive sequence may have a frame rate of 30 frames per second. An interlaced sequence generally includes two fields for each frame, with a top field corresponding to even-numbered lines and a bottom field corresponding to odd-numbered lines. An interlaced sequence at a frame rate of 30 frames per second will therefore have a field rate of 60 fields per second. The frames in the video sequence may be converted to SIF or CCIR-601 resolution images made up of a plurality of adjacent macroblocks, with each macroblock including, for example, four 8×8 blocks of luminance pixels and two 8×8 blocks of chrominance pixels.
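By way of illustration only, the field structure and macroblock tiling described above may be sketched as follows. The function names are hypothetical, lines are assumed to be numbered from zero, and the 16×16 luminance macroblock size and the 352×240 SIF image size are assumptions not stated in the text above (the former follows from four 8×8 luminance blocks per macroblock).

```python
def split_fields(frame):
    """Split an interlaced frame (a list of rows) into its two fields."""
    top = frame[0::2]      # even-numbered lines: 0, 2, 4, ...
    bottom = frame[1::2]   # odd-numbered lines: 1, 3, 5, ...
    return top, bottom


def macroblock_grid(width, height, mb_size=16):
    """Return (columns, rows) of macroblocks needed to cover the image.

    mb_size=16 is an assumption: four 8x8 luminance blocks tile a 16x16 area.
    Partial macroblocks at the edges are counted by rounding up.
    """
    cols = (width + mb_size - 1) // mb_size
    rows = (height + mb_size - 1) // mb_size
    return cols, rows


# A tiny 6-line "frame" whose pixel values encode the line number.
frame = [[line] * 4 for line in range(6)]
top, bottom = split_fields(frame)

# Assumed SIF size of 352x240 luminance pixels.
sif_grid = macroblock_grid(352, 240)
```

At 30 frames per second, each frame contributes one top and one bottom field, which gives the 60 fields per second noted above.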
Spatial compression is applied to each of the macroblocks using the techniques of transform encoding, quantization, scanning, run-amplitude encoding and variable length coding. Transform encoding involves applying a discrete cosine transform (DCT) to each 8×8 block of pixels in a given macroblock to thereby generate an 8×8 block of DCT coefficients. The DCT coefficients are then quantized by dividing each coefficient by a quantizer step size which is the product of a weighting matrix element and a quantization scale factor selected for the given macroblock. The human eye is generally more sensitive to the lower frequency coefficients than the higher frequency coefficients. As such, the quantization step size is varied depending on the frequency of the coefficient that is quantized so that the low frequency coefficients can be mapped to a larger selection of values than the high frequency coefficients. The resulting quantized coefficients are scanned using a zig-zag scanning process which tends to aggregate zero-amplitude quantized coefficients. The resulting sequence can then be divided into subsequences each including a run of zero quantized coefficients followed by a single non-zero quantized coefficient. The subsequences are then run-amplitude encoded to produce a pair of numbers corresponding to the number of zero coefficients in the run and the amplitude of the single non-zero coefficient following the run. The run-amplitude pairs thus formed are then variable length encoded using a predetermined table which assigns a codeword to each anticipated run-amplitude pair.
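The quantization, zig-zag scanning and run-amplitude steps described above may be sketched, in simplified form, as follows. The sketch starts from an already-computed block of DCT coefficients and omits the final variable length coding. The function names, the weighting matrix (which simply grows with frequency) and the sample coefficient values are hypothetical, and the rounding used here is a simplification rather than the exact arithmetic of the MPEG-2 standard.

```python
def quantize(coeffs, weights, scale):
    """Divide each coefficient by its step size (weight * scale) and round."""
    n = len(coeffs)
    return [[int(round(coeffs[r][c] / (weights[r][c] * scale)))
             for c in range(n)] for r in range(n)]


def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Odd diagonals are walked with the row increasing, even ones reversed.
        order.extend(diag if s % 2 else reversed(diag))
    return order


def run_amplitude(sequence):
    """Convert a scanned sequence into (zero_run, amplitude) pairs.

    Trailing zeros produce no pair; a real bitstream would signal them
    with an end-of-block code instead.
    """
    pairs, run = [], 0
    for value in sequence:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


# Hypothetical weighting matrix: the step size grows with frequency (row + col).
weights = [[8 + 2 * (r + c) for c in range(8)] for r in range(8)]

# Hypothetical DCT coefficients: a few low-frequency terms, the rest zero.
coeffs = [[0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[1][0] = 80, -40, 22
coeffs[1][1], coeffs[2][1] = 10, 30

quantized = quantize(coeffs, weights, scale=2)
scanned = [quantized[r][c] for r, c in zigzag_order(8)]
pairs = run_amplitude(scanned)
print(pairs)  # [(0, 5), (0, -2), (0, 1), (5, 1)]
```

In this example the coefficient at position (1, 1) quantizes to zero and joins the run of zeros preceding the last non-zero coefficient, so the 64 coefficients of the block reduce to four run-amplitude pairs.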
Temporal compression is applied to a given macroblock using the techniques of motion estimation and motion compensation. The macroblock to be encoded is also referred to as a target macroblock or simply a target block, while the frame or field containing the target block is referred to as a target frame or target field, respectively. The motion estimation process makes use of a search window in a reference frame. The search window generally specifies the portion of the reference frame which will be searched in order to locate a macroblock which best matches a given target macroblock. A block matching algorithm is used to identify the reference frame macroblock within the specified search window which best matches the target macroblock. The identified reference frame macroblock is referred to as a predictive macroblock. A motion vector is then generated to indicate a translation between the pixel coordinates of the predictive macroblock and the target macroblock. Motion compensation involves generating a prediction error macroblock as the difference between the predictive macroblock and the target macroblock. The prediction error macroblock may then be spatially encoded as described above. The motion vector may be variable length encoded and output with the spatially encoded prediction error macroblock. For bidirectionally-predictive (B) frames, a bidirectionally-predictive macroblock is generated by interpolating between a predictive macroblock from a previous reference frame and a predictive macroblock from a subsequent reference frame. Two motion vectors are generated to indicate translations between the pixel coordinates of the previous and subsequent predictive macroblocks and the target macroblock. A prediction error macroblock is generated as the difference between the bidirectionally-predictive macroblock and the target macroblock. The prediction error macroblock and motion vectors are then encoded as in the general case previously described.
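One possible form of the block matching described above is an exhaustive search that minimizes the sum of absolute differences (SAD) between the target block and each candidate block in the search window. The following sketch is illustrative only: the text above does not name a particular matching criterion, so the SAD measure, the function names, the sign convention of the prediction error (target minus prediction), the rounded average used for interpolation, and the small 2×2 block size are all assumptions.

```python
def sad(target, ref, top, left):
    """Sum of absolute differences between target and the ref block at (top, left)."""
    return sum(abs(target[r][c] - ref[top + r][left + c])
               for r in range(len(target)) for c in range(len(target[0])))


def full_search(target, ref, block_top, block_left, search_range):
    """Search ref within +/- search_range pixels of the target block position.

    Returns ((dy, dx), best_sad), where (dy, dx) is the motion vector from the
    target block position to the best-matching (predictive) block.
    """
    rows, cols = len(target), len(target[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            top, left = block_top + dy, block_left + dx
            # Skip candidates that fall outside the reference frame.
            if top < 0 or left < 0 or top + rows > len(ref) or left + cols > len(ref[0]):
                continue
            cost = sad(target, ref, top, left)
            if best is None or cost < best[1]:
                best = ((dy, dx), cost)
    return best


def prediction_error(target, ref, top, left):
    """Target block minus the predictive block at (top, left) in ref."""
    return [[target[r][c] - ref[top + r][left + c]
             for c in range(len(target[0]))] for r in range(len(target))]


def interpolate(previous_block, subsequent_block):
    """Rounded average of two predictive blocks, as a simple B-frame predictor."""
    return [[(a + b + 1) // 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(previous_block, subsequent_block)]


# Reference frame with distinct pixel values; the target block sits at (2, 2)
# in the target frame but its content is found at (3, 4) in the reference.
ref = [[r * 8 + c for c in range(8)] for r in range(8)]
target = [row[4:6] for row in ref[3:5]]

mv, cost = full_search(target, ref, block_top=2, block_left=2, search_range=2)
error = prediction_error(target, ref, 2 + mv[0], 2 + mv[1])
print(mv, cost)  # (1, 2) 0
```

Because the match is exact in this example, the prediction error block is all zeros and only the motion vector carries information. An exhaustive search is the simplest choice but also the most costly, since its work grows with the square of the search range.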
Video preprocessing techniques are applied prior to performing the MPEG-2 spatial and temporal compression operations described above. An exemplary video preprocessor processes the video signal so that it may be more efficiently compressed by subsequent video compression circuitry. For example, the preprocessor may alter the format of each frame in terms of the number of horizontal or vertical pixels in order to meet parameters specified by the video compression circuitry. In addition, a preprocessor can detect scene changes or other image variations which increase compression difficulty. A scene change generally increases the number of bits required because predictive encoding cannot initially be used. If the preprocessor detects a scene change, this information may be communicated by the preprocessor to the video compression circuitry. A fade, representing a continuous decrease or increase in luminance level to or from black over several frames, can also cause difficulties for the video compression circuitry because it can cause a failure in motion compensated prediction. The preprocessor can detect and inform the video compression circuitry of a fade so that the compression circuitry can take appropriate precautions.
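The text above does not specify how a preprocessor detects scene changes or fades. Purely as an illustrative assumption, a scene change might be flagged when the mean absolute luminance difference between consecutive frames exceeds a threshold, and a fade when the mean luminance rises or falls steadily over several frames. The function names and threshold values below are hypothetical.

```python
def mean_luma(frame):
    """Average luminance of a frame given as a list of rows."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)


def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two frames of equal size."""
    total = sum(abs(a - b)
                for row_a, row_b in zip(frame_a, frame_b)
                for a, b in zip(row_a, row_b))
    return total / (len(frame_a) * len(frame_a[0]))


def is_scene_change(previous, current, threshold=40.0):
    """Flag a scene change when consecutive frames differ strongly (assumed threshold)."""
    return mean_abs_diff(previous, current) > threshold


def is_fade(frames, min_step=1.0):
    """Flag a fade when mean luminance moves steadily in one direction.

    Requires at least three frames; min_step is an assumed minimum change
    in mean luminance per frame.
    """
    means = [mean_luma(f) for f in frames]
    steps = [b - a for a, b in zip(means, means[1:])]
    if len(steps) < 2:
        return False
    return all(s <= -min_step for s in steps) or all(s >= min_step for s in steps)


def flat(value, size=4):
    """Helper for the example: a uniform size x size frame."""
    return [[value] * size for _ in range(size)]


cut_detected = is_scene_change(flat(30), flat(200))
fade_detected = is_fade([flat(v) for v in (120, 90, 60, 30, 0)])
```

A practical detector would need to distinguish these cases from ordinary motion and from each other, for example by examining more than frame-level averages; the sketch only shows the kind of signal a preprocessor could pass to the compression circuitry.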
Conventional video preprocessing techniques have generally been concerned with detecting and correcting obvious problematic situations such as the above-noted format alterations, scene changes and fades. However, these conventional preprocessing techniques have not addressed the possibility of preprocessing a video signal to optimize the quality of a displayed signal after compression and decompression. For example, it has not been heretofore suggested that a preprocessing technique which results in some actual degradation of signal quality prior to compression may in fact produce a higher quality displayed signal after compression and decompression. Conventional preprocessing also fails to provide efficient techniques for selectively filtering portions of a given video image, and for performing motion detection, edge detection and other image analysis operations on interlaced video images.
As is apparent from the above, a need exists for improved video preprocessing techniques which can be performed prior to compression but which optimize the quality of a displayed video signal after compression and decompression. The techniques should provide simple and effective selective filtering of video images, as well as improved motion detection, edge detection and other image analysis operations.