Very few techniques have been proposed to date to address the problem of detecting the size and position of coding blocks in possibly rescaled video images. One prior art method has been disclosed in International Patent WO 2006/010276 A1 entitled Apparatus and Method for Adaptive 3D Artifact Reducing for Encoded Image Signal issued to Chon Tam Le Dihn et al. (“Le Dihn”). The Le Dihn patent discloses a method for block offset detection on coded video images. The Le Dihn algorithm detects the block offset position for the eight by eight (8×8) DCT (Discrete Cosine Transform) block boundaries with (1) a boundary pixel detection method for the identification of possible block boundary pixels and (2) a histogram analysis for the calculation of offset position.
The Le Dihn patent addresses the problem of block offset detection on non-rescaled sequences. The mask and test are designed for non-rescaled boundaries and are thus less effective on rescaled boundaries. In addition, the Le Dihn method applies only to luminance data and does not consider factors such as frame/field macroblock coding type and interlace handling, which can cause the detected result to be inaccurate under certain circumstances. Furthermore, the Le Dihn method does not address the fundamental problem of block size detection.
U.S. Pat. No. 6,636,645 entitled Image Processing Method for Reducing Noise and Blocking Artifact in a Digital Image issued to Qing Yu et al. (“Yu”) introduces a similar algorithm for JPEG (Joint Photographic Experts Group) block offset detection. Instead of a second order derivative, the Yu algorithm uses a first order derivative, namely the gradient along the horizontal and vertical directions, to detect the boundary offsets. The gradient values of the columns that are separated by one block size are summed, and the position of the maximum sum is taken as the horizontal offset. The Yu algorithm uses a dynamic threshold obtained from the average gradient value of the image to filter out image edges so that they are not mistaken for block boundaries.
The Yu algorithm also uses the detection result as an indicator of image blockiness. If the offsets from different channels of the same JPEG image do not agree with each other, the Yu algorithm concludes that the blockiness is not obvious and thus no filtering is needed. The Yu algorithm assumes a fixed block size and only addresses offset detection for non-rescaled images.
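By way of illustration only, the column-gradient accumulation attributed to Yu above can be sketched as follows. This is a minimal sketch under stated assumptions and not the implementation disclosed in the Yu patent: the function name `detect_horizontal_offset`, the parameter `edge_factor`, and the rule that gradients above `edge_factor` times the mean gradient are treated as image edges are all hypothetical choices made for the example.

```python
def detect_horizontal_offset(image, block_size=8, edge_factor=4.0):
    """Return the column offset in [0, block_size) with the largest summed gradient.

    image is a list of equal-length rows of luminance values.
    edge_factor is a hypothetical tuning constant, not a value taken from Yu.
    """
    width = len(image[0])
    # First-order horizontal gradient magnitude between columns x-1 and x.
    grads = [[abs(row[x] - row[x - 1]) for x in range(1, width)] for row in image]
    count = sum(len(g_row) for g_row in grads)
    mean_grad = sum(sum(g_row) for g_row in grads) / count
    # Dynamic threshold derived from the average gradient of the image
    # (assumed form: a fixed multiple of the mean).
    threshold = edge_factor * mean_grad
    sums = [0.0] * block_size
    for g_row in grads:
        for i, g in enumerate(g_row):
            x = i + 1  # the gradient at index i lies between columns x-1 and x
            if g <= threshold:  # larger gradients are assumed to be real edges
                sums[x % block_size] += g
    # Columns separated by one block size share a bin; the largest bin wins.
    return max(range(block_size), key=lambda k: sums[k])
```

Because the bins are indexed modulo a fixed `block_size`, the sketch shares the limitation noted above: it can report an offset only when the block size is known in advance and is an integer.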
Block size detection is essentially a problem of repeated pattern detection, so techniques from quite different applications may be relevant. An example of such an application is described in a paper entitled “A Parallel-Line Detection Algorithm Based on HMM Decoding” by Yefeng Zheng, Huiping Li and David Doermann published in IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 27, Number 5, May 2005. Zheng, Li and Doermann proposed a complete algorithm to extract form lines from scanned form images, including a text filter, a Hidden Markov Model based line detection block, and a gap estimation block. While the line extraction and the text filter parts are not relevant to the context of block boundary detection, the line gap estimation block assumes that the distance between the form lines is fixed and uses an autocorrelation based method to estimate the distance, which is similar to the application of block size detection. However, with standard integer autocorrelation, the problem of non-integer block size is not addressed. There is also no means to determine the detection accuracy.
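The following is a minimal sketch of a standard integer-lag autocorrelation period estimate of the general kind referred to above. It is not the gap estimation block of Zheng, Li and Doermann; the function names and the choice of an unnormalized, mean-removed autocorrelation are assumptions made for illustration.

```python
def autocorrelation(signal, max_lag):
    """Mean-removed, unnormalized autocorrelation for lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(signal, min_lag, max_lag):
    """Return the integer lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
```

For a boundary indicator signal with impulses every 8 samples, `estimate_period(signal, 2, 32)` returns 8. When the impulses are instead placed at the integer parts of multiples of 32/3 (a spacing of about 10.67 samples) and the search range is 2 to 16, the same function returns 11, because only integer lags are evaluated and no measure of detection accuracy is produced.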
Another similar application is found in the audio domain in a paper entitled “An Autocorrelation Pitch Detector and Voicing Decision with Confidence Measures Developed for Noise-Corrupted Speech” by David A. Krubsack and Russell J. Niederjohn published in IEEE Transactions on Signal Processing, Volume 39, Number 2, pp. 319-329, February 1991. In the Krubsack-Niederjohn (“KN”) algorithm the speech signal is divided into segments of fifty-one and two-tenths milliseconds (51.2 msec) and the autocorrelation function is used to pick the pitch. The KN algorithm starts by choosing the maximum peak in the human pitch range of fifty Hertz (50 Hz) to three hundred thirty-three Hertz (333 Hz). The KN algorithm then checks pitch values at the submultiples of the dominant peak to reduce wrong period errors.
In addition, the KN algorithm provides a confidence measure for the detected result based on parameters extracted from the autocorrelation process. The KN algorithm does not address the problem of non-integer block sizes. Moreover, due to the different nature of an audio signal with respect to a video signal, the parameters that are extracted and used to determine pitch detection confidence are not suitable for defining the confidence measure in block boundary detection.
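The peak picking and submultiple check described for the KN algorithm can be sketched as follows. This is a sketch under stated assumptions only and is not the KN algorithm itself: the acceptance rule (`accept_ratio`), the order in which submultiples are tried, and the function names are hypothetical, and the confidence measure of the KN algorithm is not reproduced.

```python
def autocorr_at(signal, lag):
    """Unnormalized autocorrelation of signal at a single integer lag."""
    return sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))


def detect_pitch(segment, sample_rate, f_min=50.0, f_max=333.0, accept_ratio=0.8):
    """Return a pitch estimate in Hz; segment must be longer than sample_rate / f_min.

    accept_ratio is a hypothetical threshold, not a value from the KN paper.
    """
    min_lag = int(sample_rate / f_max)  # shortest period in the pitch range
    max_lag = int(sample_rate / f_min)  # longest period in the pitch range
    r = {lag: autocorr_at(segment, lag) for lag in range(min_lag, max_lag + 1)}
    best = max(r, key=r.get)  # dominant peak within the pitch range
    # Check submultiples of the dominant lag, shortest candidate first (assumed order).
    for divisor in range(best // min_lag, 1, -1):
        candidate = round(best / divisor)
        if candidate >= min_lag and r[candidate] >= accept_ratio * r[best]:
            best = candidate  # a comparable peak at a shorter period is preferred
            break
    return sample_rate / best
```

With an assumed sample rate of 10 kHz, a 51.2 msec segment holds 512 samples and the search covers lags 30 to 200. For a pulse train with a 50-sample period whose pulses alternate between amplitudes 1.0 and 0.6, the dominant peak falls at lag 100, and the submultiple check moves the estimate to lag 50, that is, 200 Hz.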
The quantization in a video compression process produces video artifacts on the decoded video images. Typical coding noises include blocking noise and ringing noise. Blocking noise is caused by different quantization levels across DCT (Discrete Cosine Transform) blocks and is characterized by 8×8 block patterns formed by sharp horizontal and vertical edges. Ringing noise has several causes and manifests as sinusoids near strong edges. Like blocking noise, ringing noise is related to quantization levels, which differ from one DCT block to another. This explains why many noise reduction algorithms perform local analysis and filtering on an 8×8 block basis.
While prior art noise reduction algorithms work effectively on non-rescaled images, they will not work for rescaled sequences. For example, a deblocking algorithm that is designed to work on 8×8 blocks will pick the wrong boundary positions to apply the filtering process, not only causing the deblocking to be ineffective, but also introducing potential artifacts. Considering that many SD (Standard Definition) video inputs are rescaled to HD (High Definition) before playback on today's HDTV (High Definition Television) sets, a block boundary detector which reports the size and position of the coding blocks becomes necessary for effective noise reduction.
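The effect of rescaling on the block grid can be illustrated with a short calculation. The sketch below assumes a purely horizontal rescale from 720 to 1920 samples per line (illustrative SD and HD widths, not values stated above), a block grid starting at column 0, and a scaler that introduces no phase shift; the function names are hypothetical.

```python
from fractions import Fraction


def rescaled_boundaries(src_width, dst_width, block_size=8, offset=0):
    """Return the rescaled block size and the exact rescaled boundary positions."""
    scale = Fraction(dst_width, src_width)
    size = block_size * scale
    positions = [
        (offset + k * block_size) * scale for k in range(src_width // block_size)
    ]
    return size, positions


def fraction_on_fixed_grid(positions, grid=8):
    """Fraction of rescaled boundaries that coincide with a fixed integer grid."""
    hits = sum(1 for p in positions if p.denominator == 1 and p % grid == 0)
    return Fraction(hits, len(positions))
```

Under these assumptions the rescaled block size is 64/3, or about 21.33 pixels, and the boundaries fall at 0, 21.33, 42.67, 64 and so on. Only one boundary in three coincides with the 8-pixel grid assumed by a deblocking filter designed for non-rescaled input, and that filter would additionally operate on many positions where no boundary exists.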
Knowledge of the boundary position is not only critical for deblocking algorithms, whose very purpose is to remove the boundaries, but is also useful for generic algorithms designed to detect or remove noises caused by block-based encoding. Knowledge of the boundary position is also helpful in determining the image quality and the rescaling factor.
There is presently no solution that addresses the specific problem of coding block boundary detection. Although some solutions exist for the detection of boundary offset in non-rescaled cases, these solutions will not work on rescaled sequences.
The challenges of block boundary detection for arbitrary input lie not only in the size detection methodology, but also in block boundary pixel identification. While original block boundaries are sharp one-pixel edges that can be easily picked out with generic edge detection techniques, rescaled boundaries are much more challenging to detect. First, the boundaries are blurred by the interpolation process. Depending on the rescaling method and the rescaling factor, the one-pixel block boundaries may spread to two to four pixels. Second, the rescaling process may also be coupled with other processing such as deinterlacing, interlacing, and 4:2:0 to 4:2:2 conversion, all of which further blur the boundaries. As a result, the originally sharp boundaries in a PAL (Phase Alternate Line) sequence may be difficult to identify and detect after the sequence is rescaled to 1080i.
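The spreading of a one-pixel boundary can be illustrated in one dimension. The sketch below assumes simple linear interpolation with a hypothetical sample mapping; practical scalers typically use longer filters, and the additional processing steps mentioned above are not modeled.

```python
def upscale_linear(row, dst_len):
    """Resample row to dst_len samples using linear interpolation."""
    n = len(row)
    out = []
    for j in range(dst_len):
        pos = j * n / dst_len  # assumed mapping from output to source coordinates
        i0 = int(pos)
        i1 = min(i0 + 1, n - 1)
        frac = pos - i0
        out.append(row[i0] * (1.0 - frac) + row[i1] * frac)
    return out


def edge_width(row, eps=1e-9):
    """Count adjacent sample pairs whose values differ, i.e. the width of the transition."""
    return sum(1 for a, b in zip(row, row[1:]) if abs(b - a) > eps)
```

For a 24-sample row containing a single step from 0 to 100, `edge_width` returns 1. After `upscale_linear(row, 64)`, a rescaling factor of 8/3, the same step is spread over 4 adjacent sample pairs, so a detector tuned to one-pixel edges sees a weaker, wider transition.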
Another major challenge is non-integer block size detection. Since the standard autocorrelation process only detects integer sizes, previously there has been no means to detect a non-integer block size. One possible method for converting non-integer block size detection to integer block size detection is to interpolate the sequence by a factor of 100 (for a precision of 0.01) and then apply an integer autocorrelation process. This method, however, is neither efficient nor effective.
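One way to see the efficiency concern is to count the multiply-add operations of a direct autocorrelation before and after the interpolation. The sketch below assumes a direct (non-FFT) autocorrelation and a lag range that grows with the interpolation factor; the function names and the example line length and lag range are illustrative assumptions.

```python
def direct_autocorr_ops(n_samples, max_lag):
    """Multiply-add count for a direct autocorrelation over lags 1..max_lag."""
    return sum(n_samples - lag for lag in range(1, max_lag + 1))


def upsampled_cost_ratio(n_samples, max_lag, factor=100):
    """Ratio of the operation count after interpolation to the count before it."""
    base = direct_autocorr_ops(n_samples, max_lag)
    upsampled = direct_autocorr_ops(n_samples * factor, max_lag * factor)
    return upsampled / base
```

For a 1920-sample line searched over lags 1 to 32, the direct computation needs 60912 multiply-adds. After interpolation by a factor of 100, both the number of samples and the number of lags grow by that factor, so the count rises by roughly the square of the factor, on the order of ten thousand times, before the cost of the interpolation itself is considered.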
Another challenge with block size detection is temporal stability. For certain noise reduction algorithms, block size will have a great impact on the filtering strength. An abrupt change in filtering strength might cause noticeable inconsistency in the output video due to different filtering applied to neighboring frames. Therefore, the block boundary detector must ensure the stability of the output under all scenarios, even if the input is a clean sequence.
To remedy the deficiencies of the above-identified prior art methods, there is a need in the art for an improved method for detecting block boundaries in video image processing.