Conventionally, in an international standard video encoding method such as MPEG or ITU-T H.26x, an inputted video frame is partitioned into macroblocks, each of which is a block of 16×16 pixels, and a motion-compensated prediction is carried out on each of the macroblocks. Information compression is then carried out on the inputted video frame by carrying out orthogonal transformation and quantization on a prediction error signal on a per block basis. A problem is, however, that as the compression ratio becomes high, the compression efficiency is reduced because of degradation in the quality of the prediction reference image used when carrying out the motion-compensated prediction. To solve this problem, an encoding method such as MPEG-4 AVC/H.264 (refer to nonpatent reference 1) carries out an in-loop deblocking filtering process to eliminate a block distortion which occurs in a prediction reference image and which is caused by quantization of orthogonal transform coefficients.
FIG. 21 is a block diagram showing a video encoding device disclosed in nonpatent reference 1. In this video encoding device, when receiving an image signal which is a target to be encoded, a block partitioning unit 101 partitions the image signal into macroblocks and outputs an image signal of each of the macroblocks to a prediction unit 102 as a partitioned image signal. When receiving the partitioned image signal from the block partitioning unit 101, the prediction unit 102 carries out an intra-frame or inter-frame prediction on the image signal of each color component in each of the macroblocks to determine a prediction error signal.
Particularly when carrying out a motion-compensated prediction between frames, a search for a motion vector is performed on each macroblock itself or on each of the subblocks into which each macroblock is further partitioned. Then, a motion-compensated prediction image is generated by carrying out a motion-compensated prediction on a reference image signal stored in a memory 107 by using the motion vector, and a prediction error signal is calculated by determining the difference between a prediction signal showing the motion-compensated prediction image and the partitioned image signal. Further, the prediction unit 102 outputs, to a variable length encoding unit 108, parameters for prediction signal generation which the prediction unit determines when acquiring the prediction signal. For example, the parameters for prediction signal generation include an intra prediction mode indicating how a spatial prediction is carried out within a frame, and a motion vector indicating an amount of motion between frames.
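The motion vector search and the calculation of the prediction error signal described above can be illustrated by the following minimal sketch. It assumes an exhaustive search with a sum of absolute differences (SAD) cost, integer-pixel motion vectors, and frames held as lists of rows; the function names `sad` and `motion_search` and the parameters `n` and `search_range` are hypothetical and do not appear in nonpatent reference 1.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def motion_search(cur, ref, bx, by, n, search_range):
    """Exhaustive block matching (an assumed search strategy).

    Returns (motion_vector, prediction_error_block), where the motion
    vector is (dx, dy) in whole pixels.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= by + dy and by + dy + n <= height
                    and 0 <= bx + dx and bx + dx + n <= width):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    dx, dy = best_mv
    # Prediction error = current block minus motion-compensated prediction.
    residual = [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
                 for x in range(n)] for y in range(n)]
    return best_mv, residual


# Reference frame with a non-repeating pattern; the current frame is the
# reference frame shifted one pixel to the right.
ref = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]
mv, residual = motion_search(cur, ref, bx=2, by=2, n=4, search_range=2)
```

In this toy case the search finds the displacement that points back to the matching area of the reference frame, and the prediction error block is all zeros. A practical encoder would typically also search subblock partitions and fractional-pixel positions.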
When receiving the prediction error signal from the prediction unit 102, a compressing unit 103 removes a signal correlation by carrying out a DCT (discrete cosine transform) process on the prediction error signal, and then quantizes the resulting transform coefficients to acquire compressed data. When receiving the compressed data from the compressing unit 103, a local decoding unit 104 calculates a prediction error signal corresponding to the prediction error signal outputted from the prediction unit 102 by inverse-quantizing the compressed data and then carrying out an inverse DCT process on the inverse-quantized data.
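The round trip through the compressing unit 103 and the local decoding unit 104 can be sketched as follows. This is an illustration only: it assumes an orthonormal floating-point DCT-II and a single uniform quantization step, whereas a standard such as H.264 uses an integer transform and more elaborate quantization. The names `compress`, `local_decode`, and `step` are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def compress(residual, step):
    """DCT followed by uniform quantization (compressing unit 103)."""
    c = dct_matrix(len(residual))
    coeffs = matmul(matmul(c, residual), transpose(c))
    # Python's round() rounds exact halves to the nearest even integer.
    return [[round(v / step) for v in row] for row in coeffs]


def local_decode(levels, step):
    """Inverse quantization followed by inverse DCT (local decoding unit 104)."""
    c = dct_matrix(len(levels))
    coeffs = [[v * step for v in row] for row in levels]
    return matmul(matmul(transpose(c), coeffs), c)


# A flat 4 x 4 prediction error block has only a DC coefficient.
flat = [[8] * 4 for _ in range(4)]
flat_levels = compress(flat, step=4)
flat_decoded = local_decode(flat_levels, step=4)

# A textured block is reconstructed only approximately after quantization.
textured = [[5, -3, 0, 2], [1, 4, -2, 0], [0, 0, 3, -1], [-4, 2, 1, 0]]
textured_decoded = local_decode(compress(textured, step=2), step=2)
```

Because the quantization discards information, the locally decoded prediction error signal generally differs from the one outputted from the prediction unit 102; the encoder uses the decoded version so that its reference images match those a decoder can reconstruct.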
When receiving the prediction error signal from the local decoding unit 104, an adding unit 105 adds the prediction error signal and the prediction signal outputted from the prediction unit 102 to generate a local decoded image. A loop filter 106 eliminates a block distortion superimposed on a local decoded image signal showing the local decoded image generated by the adding unit 105, and stores the local decoded image signal from which the distortion is eliminated in the memory 107 as a reference image signal.
When receiving the compressed data from the compressing unit 103, the variable length encoding unit 108 entropy-encodes the compressed data and outputs a bitstream which is the encoded result. When outputting the bitstream, the variable length encoding unit 108 multiplexes the parameters for prediction signal generation outputted from the prediction unit 102 into the bitstream.
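As an illustration of variable length encoding, the following sketch encodes integers with an order-0 Exp-Golomb code, which is one of the variable length codes used for syntax elements in MPEG-4 AVC/H.264. The passage above does not state which code the variable length encoding unit 108 uses, so this choice is an assumption, and the function names `ue` and `se` are hypothetical.

```python
def ue(code_num):
    """Unsigned order-0 Exp-Golomb code word for code_num >= 0, as a bit string.

    The binary form of (code_num + 1) is preceded by one zero bit for every
    bit after its leading one, so small values get short code words.
    """
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    binary = bin(code_num + 1)[2:]
    return "0" * (len(binary) - 1) + binary


def se(value):
    """Signed Exp-Golomb code word: positive v maps to 2v - 1 and
    non-positive v maps to -2v before unsigned coding."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return ue(code_num)


# A motion vector component pair (assumed example values) written as bits.
bits = se(1) + se(-2)
```

Short code words are assigned to values near zero, which suits parameters such as motion vector components and quantized coefficients whose small magnitudes tend to occur most often.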
In accordance with the method disclosed by nonpatent reference 1, the loop filter 106 determines a smoothing intensity for the pixels neighboring a DCT block boundary on the basis of information including the granularity of the quantization, the coding mode, and the degree of variation in the motion vector, thereby reducing distortions occurring at block boundaries. As a result, the quality of the reference image signal can be improved and the efficiency of the motion-compensated prediction in subsequent encoding processes can be improved.
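The idea of choosing a smoothing intensity from the coding mode, the presence of quantized coefficients, and the motion vector difference can be sketched as follows. This is a simplified illustration and not the filter defined in nonpatent reference 1: the strength levels, the divisor of 8, and the single `threshold` parameter (which stands in for a value derived from the granularity of the quantization) are all assumptions, as are the function names.

```python
def boundary_strength(p_intra, q_intra, p_has_coeffs, q_has_coeffs, mv_p, mv_q):
    """Pick a smoothing intensity for the edge between blocks P and Q.

    Assumed rule: intra-coded blocks get the strongest filtering, blocks
    with non-zero quantized coefficients a medium one, and blocks whose
    motion vectors differ noticeably a weak one.
    """
    if p_intra or q_intra:
        return 3
    if p_has_coeffs or q_has_coeffs:
        return 2
    if abs(mv_p[0] - mv_q[0]) >= 4 or abs(mv_p[1] - mv_q[1]) >= 4:
        return 1
    return 0


def filter_edge(row, edge, strength, threshold):
    """Smooth the two samples adjacent to a vertical block edge in one row.

    row[edge - 1] is the last sample of block P and row[edge] is the first
    sample of block Q. A step of `threshold` or more is treated as a real
    image edge and is left untouched.
    """
    out = list(row)
    p0, q0 = row[edge - 1], row[edge]
    if strength == 0 or abs(q0 - p0) >= threshold:
        return out
    # Move both samples toward each other; truncation is toward zero.
    delta = int((q0 - p0) * strength / 8)
    out[edge - 1] = p0 + delta
    out[edge] = q0 - delta
    return out


row = [10, 10, 10, 10, 18, 18, 18, 18]
bs = boundary_strength(False, False, True, False, (0, 0), (0, 0))
smoothed = filter_edge(row, edge=4, strength=bs, threshold=16)
```

The threshold test reflects the general design goal of such filters: small steps at a block boundary are likely to be quantization artifacts and are smoothed, while large steps are likely to be genuine image content and are preserved.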
However, a problem with the method disclosed by nonpatent reference 1 is that the amount of high frequency components lost from the signal increases as the compression ratio increases, which results in excessive smoothing over the entire screen, and hence the video image becomes blurred. In order to solve this problem, nonpatent reference 2 proposes, as the loop filter 106, an adaptive offset process (pixel adaptive offset process) of partitioning a screen into a plurality of blocks, carrying out a class classification on each pixel within each of the blocks into which the screen is partitioned, and adding, for each class, an offset value which minimizes a squared error distortion between the original image signal which is the target to be encoded and the reference image signal corresponding to that image signal.
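The pixel adaptive offset process can be sketched for a single block as follows. For a given class, the offset that minimizes the squared error between the original and the decoded samples is the mean of their differences, which is what the sketch computes. The classification by intensity band, the 8-bit sample range, and the names `classify`, `derive_offsets`, and `apply_offsets` are assumptions made for illustration; nonpatent reference 2 may classify pixels differently.

```python
def classify(value, num_classes, max_value=255):
    """Assign a decoded sample to an intensity band (assumed classification)."""
    return min(value * num_classes // (max_value + 1), num_classes - 1)


def derive_offsets(original, decoded, num_classes):
    """Encoder side: per-class offset minimizing the squared error.

    The minimizer is the mean of (original - decoded) over the class,
    rounded here to an integer so that it can be signaled compactly.
    """
    sums = [0] * num_classes
    counts = [0] * num_classes
    for o_row, d_row in zip(original, decoded):
        for o, d in zip(o_row, d_row):
            # Classify on the decoded sample so a decoder can repeat it.
            c = classify(d, num_classes)
            sums[c] += o - d
            counts[c] += 1
    return [round(s / n) if n else 0 for s, n in zip(sums, counts)]


def apply_offsets(decoded, offsets):
    """Add the class offset to every sample, clipping to the 8-bit range."""
    n = len(offsets)
    return [[min(255, max(0, d + offsets[classify(d, n)])) for d in row]
            for row in decoded]


original = [[12, 14, 202, 204]]
decoded = [[10, 12, 205, 207]]
offsets = derive_offsets(original, decoded, num_classes=2)
corrected = apply_offsets(decoded, offsets)
```

In this toy block the dark samples were decoded too low and the bright samples too high, so the two classes receive offsets of opposite sign and the corrected block matches the original. In practice the offsets would need to be transmitted to the decoder, which repeats the same classification on its own decoded samples.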