Typically, after digitizing a video signal input from outside, a video encoding device performs an encoding process conforming to a predetermined video encoding scheme, to create encoded data, i.e. a bitstream.
As the predetermined video encoding scheme, ISO/IEC 14496-10 Advanced Video Coding (AVC) described in Non Patent Literature (NPL) 1 is available. As a reference model of an AVC encoder, a Joint Model scheme is known (hereafter referred to as a typical video encoding device).
A structure and an operation of the typical video encoding device which receives each frame of digitized video as input and outputs a bitstream are described below, with reference to FIG. 20.
As shown in FIG. 20, the typical video encoding device includes an MB buffer 101, a frequency transform unit 102, a quantization unit 103, an entropy encoder 104, an inverse quantization unit 105, an inverse frequency transform unit 106, a picture buffer 107, a distortion removal filter unit 108a, a decode picture buffer 109, an intra prediction unit 110, an inter-frame prediction unit 111, an encoding control unit 112, and a switch 100.
The typical video encoding device divides each frame into blocks of 16×16 pixels in size called macroblocks (MBs), and further divides each MB into blocks of 4×4 pixels in size, where each 4×4 block obtained as a result of the division is a minimum unit of encoding.
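As a rough illustration of this division, the following sketch counts the MBs and 4×4 blocks in one luminance frame. The function name `count_blocks` is hypothetical, and the 176×144 resolution used in the example is the standard QCIF luminance size, which is assumed here rather than stated above.

```python
MB_SIZE = 16     # macroblock edge length in pixels
BLOCK_SIZE = 4   # edge length of the minimum unit of encoding


def count_blocks(width, height):
    """Return (number of MBs, number of 4x4 blocks) for one luminance frame.

    Assumption: width and height are multiples of 16, so no padding is needed.
    """
    if width % MB_SIZE or height % MB_SIZE:
        raise ValueError("frame dimensions must be multiples of 16")
    mbs = (width // MB_SIZE) * (height // MB_SIZE)
    blocks_per_mb = (MB_SIZE // BLOCK_SIZE) ** 2  # 16 blocks of 4x4 per MB
    return mbs, mbs * blocks_per_mb


# Example: a 176x144 frame is 11 MBs wide and 9 MBs high.
qcif_mbs, qcif_blocks = count_blocks(176, 144)
```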
FIG. 21 is an explanatory diagram showing an example of block division in the case where each frame has a spatial resolution of QCIF (Quarter Common Intermediate Format). The following describes an operation of each unit shown in FIG. 20 by focusing only on pixel values of luminance, for simplicity's sake.
The MB buffer 101 stores pixel values of an MB to be encoded in an input image frame. The MB to be encoded is hereafter referred to as an input MB.
A prediction signal supplied from the intra prediction unit 110 or the inter-frame prediction unit 111 via the switch 100 is subtracted from the input MB supplied from the MB buffer 101. The input MB from which the prediction signal has been subtracted is hereafter referred to as a prediction error image block.
The intra prediction unit 110 creates an intra prediction signal, using a reconstructed image that is stored in the picture buffer 107 and has the same display time as the current frame. An MB encoded using the intra prediction signal is hereafter referred to as an intra MB.
The inter-frame prediction unit 111 creates an inter-frame prediction signal, using a reference image that is stored in the decode picture buffer 109 and has a different display time from the current frame. An MB encoded using the inter-frame prediction signal is hereafter referred to as an inter MB.
A frame encoded using only intra MBs is called an I frame. A frame encoded using inter MBs as well as intra MBs is called a P frame. A frame that includes inter MBs for which two reference images, rather than one, are used simultaneously to create the inter-frame prediction signal is called a B frame.
The encoding control unit 112 compares each of the intra prediction signal and the inter-frame prediction signal with the input MB stored in the MB buffer 101, selects a prediction signal corresponding to smaller energy of the prediction error image block, and controls the switch 100 accordingly. Information about the selected prediction signal (intra prediction mode, intra prediction direction, and inter-frame prediction-related information) is supplied to the entropy encoder 104.
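The selection performed by the encoding control unit 112 might be sketched as follows. This is a minimal illustration, not the Joint Model implementation: the helper names are hypothetical, "energy" is assumed to be the sum of squared prediction errors, and ties are assumed to resolve to intra prediction.

```python
def prediction_error(input_mb, prediction):
    """Subtract the prediction signal from the input MB, pixel by pixel."""
    return [[x - p for x, p in zip(row_x, row_p)]
            for row_x, row_p in zip(input_mb, prediction)]


def energy(block):
    """Sum of squared values (assumed definition of 'energy')."""
    return sum(v * v for row in block for v in row)


def select_prediction(input_mb, intra_pred, inter_pred):
    """Return the chosen mode and its prediction error image block."""
    intra_err = prediction_error(input_mb, intra_pred)
    inter_err = prediction_error(input_mb, inter_pred)
    # Assumed tie-break: prefer intra prediction when the energies are equal.
    if energy(intra_err) <= energy(inter_err):
        return "intra", intra_err
    return "inter", inter_err
```

The returned mode corresponds to the position of the switch 100, and the returned block is what is passed on to the frequency transform unit 102.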
The encoding control unit 112 also selects a basis block size of the integer DCT (Discrete Cosine Transform) suitable for the frequency transform of the prediction error image block, based on the input MB or the prediction error image block. In the typical video encoding device, the integer DCT means a frequency transform whose basis is obtained by approximating the DCT basis with integer values. The basis block size is selectable from three block sizes: 16×16, 8×8, and 4×4. A larger basis block size is selected when the input MB or the prediction error image block has flatter pixel values. Information about the selected integer DCT basis size is supplied to the frequency transform unit 102 and the entropy encoder 104. The information about the selected prediction signal, the information about the selected integer DCT basis size and the like, and a quantization parameter described later are hereafter referred to as auxiliary information.
The encoding control unit 112 further monitors the number of bits of a bitstream output from the entropy encoder 104, in order to encode the frame with not more than a target number of bits. The encoding control unit 112 outputs a quantization parameter for increasing a quantization step size if the number of bits of the output bitstream is more than the target number of bits, and outputs a quantization parameter for decreasing the quantization step size if the number of bits of the output bitstream is less than the target number of bits. Encoding is thus performed so that the output bitstream approaches the target number of bits.
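The feedback described above can be sketched as a simple update rule. The function name, the fixed adjustment of one per update, and the clamping range of 0 to 51 (the quantization parameter range of 8-bit AVC) are assumptions for illustration; the actual rate control of the typical video encoding device is not specified here.

```python
QP_MIN, QP_MAX = 0, 51  # assumed quantization parameter range (8-bit AVC)


def update_qp(qp, output_bits, target_bits, delta=1):
    """Return the next quantization parameter.

    A larger quantization parameter corresponds to a larger quantization
    step size, and therefore to fewer output bits.
    """
    if output_bits > target_bits:
        qp += delta   # too many bits: increase the quantization step size
    elif output_bits < target_bits:
        qp -= delta   # too few bits: decrease the quantization step size
    return max(QP_MIN, min(QP_MAX, qp))
```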
The frequency transform unit 102 frequency-transforms the prediction error image block with the selected integer DCT basis size, from a spatial domain to a frequency domain. The prediction error transformed to the frequency domain is referred to as a transform coefficient.
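For the 4×4 basis size, the transform can be illustrated with the integer core transform matrix of AVC. The sketch below is a plain matrix product under the assumption that the normalization factors are folded into the quantization stage, as in AVC; the function names are hypothetical.

```python
# 4x4 integer approximation of the DCT basis used by AVC (core transform).
C4 = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_transform_4x4(block):
    """Return the unnormalized transform coefficients C4 * block * C4^T."""
    return matmul(matmul(C4, block), transpose(C4))
```

For a flat prediction error image block, only the top-left (DC) transform coefficient is non-zero, which is why flat content is represented compactly in the frequency domain.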
The quantization unit 103 quantizes the transform coefficient with the quantization step size corresponding to the quantization parameter supplied from the encoding control unit 112. A quantization index of the quantized transform coefficient is also called a level.
The entropy encoder 104 entropy-encodes the auxiliary information and the quantization index, and outputs the resulting sequence of bits, i.e. the bitstream.
For subsequent encoding, the inverse quantization unit 105 inverse-quantizes the quantization index supplied from the quantization unit 103 to obtain a quantization representative value, and the inverse frequency transform unit 106 inverse-frequency-transforms the quantization representative value to return it to the original spatial domain. The prediction error image block returned to the original spatial domain is hereafter referred to as a reconstructed prediction error image block.
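The relation between a transform coefficient, its quantization index (level), and the quantization representative value can be sketched as below. Rounding to the nearest index is an assumption made for simplicity, and the function names are hypothetical.

```python
import math


def quantize(coefficient, step):
    """Return the quantization index (level) of a transform coefficient.

    Assumption: round-to-nearest on the magnitude, with the sign restored.
    """
    sign = -1 if coefficient < 0 else 1
    return sign * math.floor(abs(coefficient) / step + 0.5)


def dequantize(level, step):
    """Return the quantization representative value for a level."""
    return level * step


# With a step size of 10, the coefficient 48 becomes level 5, and the
# decoder side recovers 50: the difference of 2 is quantization error.
level = quantize(48, 10)
representative = dequantize(level, 10)
```

A larger step size maps more coefficients to level 0, which reduces the number of bits at the cost of a larger reconstruction error.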
The picture buffer 107 stores a reconstructed image block obtained by adding the prediction signal to the reconstructed prediction error image block, until all MBs included in the current frame are encoded. A picture composed of a reconstructed image in the picture buffer 107 is hereafter referred to as a reconstructed image picture.
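The addition that produces a reconstructed image block might look like the following sketch. Clipping the result to the 8-bit range 0 to 255 is an assumption, as is the function name.

```python
def reconstruct_block(prediction, reconstructed_error, max_value=255):
    """Add the prediction signal to the reconstructed prediction error.

    Assumption: pixel values are clipped to the range [0, max_value].
    """
    return [[max(0, min(max_value, p + e)) for p, e in zip(row_p, row_e)]
            for row_p, row_e in zip(prediction, reconstructed_error)]
```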
The distortion removal filter unit 108a applies filtering to boundaries of each MB of the reconstructed image and internal blocks of the MB, thereby performing a process of removing distortions (block distortions and banding distortions) for the reconstructed image stored in the picture buffer 107.
FIGS. 22 and 23 are each an explanatory diagram for describing the operation of the distortion removal filter unit 108a. 
The distortion removal filter unit 108a applies filtering to horizontal block boundaries of the MB and internal blocks of the MB, as shown in FIG. 22. The distortion removal filter unit 108a also applies filtering to vertical block boundaries of the MB and internal blocks of the MB, as shown in FIG. 23. The horizontal block boundaries are left block boundaries of 4×4 blocks 0, 4, 8, and 12, left block boundaries of 4×4 blocks 1, 5, 9, and 13, left block boundaries of 4×4 blocks 2, 6, 10, and 14, and left block boundaries of 4×4 blocks 3, 7, 11, and 15. The vertical block boundaries are upper block boundaries of 4×4 blocks 0, 1, 2, and 3, upper block boundaries of 4×4 blocks 4, 5, 6, and 7, upper block boundaries of 4×4 blocks 8, 9, 10, and 11, and upper block boundaries of 4×4 blocks 12, 13, 14, and 15.
Note that, in the case where the integer DCT of 8×8 block size is used for the MB, only the left block boundaries of the 4×4 blocks 0, 4, 8, and 12, the left block boundaries of the 4×4 blocks 2, 6, 10, and 14, the upper block boundaries of the 4×4 blocks 0, 1, 2, and 3, and the upper block boundaries of the 4×4 blocks 8, 9, 10, and 11 are block boundaries subjected to distortion removal. In the case where the basis of the integer DCT of 16×16 block size is a basis obtained by approximating the basis of the DCT of 16×16 block size by an integer and the integer DCT of 16×16 block size is used for the MB, only the left block boundaries of the 4×4 blocks 0, 4, 8, and 12 and the upper block boundaries of the 4×4 blocks 0, 1, 2, and 3 are block boundaries subjected to distortion removal.
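The dependence of the filtered boundaries on the integer DCT block size can be summarized in a short sketch. It assumes the 4×4 blocks of an MB are numbered 0 to 15 in raster order, which is consistent with the block numbers listed above; the function name is hypothetical.

```python
def filtered_boundaries(transform_size):
    """Return (left, upper) lists of 4x4 block groups whose boundaries
    are subjected to distortion removal, for a given integer DCT size.

    Assumption: 4x4 blocks are numbered 0..15 in raster order within the MB.
    """
    if transform_size not in (4, 8, 16):
        raise ValueError("transform size must be 4, 8, or 16")
    stride = transform_size // 4  # boundary spacing in units of 4x4 blocks
    left = [[4 * row + col for row in range(4)]
            for col in range(0, 4, stride)]
    upper = [[4 * row + col for col in range(4)]
             for row in range(0, 4, stride)]
    return left, upper
```

For example, `filtered_boundaries(8)` yields the left boundaries of blocks 0, 4, 8, 12 and 2, 6, 10, 14, and the upper boundaries of blocks 0, 1, 2, 3 and 8, 9, 10, 11.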
Regarding the filtering process for each horizontal block boundary, pre-filtering pixels on the left side of the block boundary are denoted by p3, p2, p1, and p0, post-filtering pixels on the left side of the block boundary by P3, P2, P1, and P0, pre-filtering pixels on the right side of the block boundary by q0, q1, q2, and q3, and post-filtering pixels on the right side of the block boundary by Q0, Q1, Q2, and Q3.
Regarding the filtering process for each vertical block boundary, pre-filtering pixels on the upper side of the block boundary are denoted by p3, p2, p1, and p0, post-filtering pixels on the upper side of the block boundary by P3, P2, P1, and P0, pre-filtering pixels on the lower side of the block boundary by q0, q1, q2, and q3, and post-filtering pixels on the lower side of the block boundary by Q0, Q1, Q2, and Q3.
It is assumed that P3, P2, P1, P0, Q0, Q1, Q2, and Q3 are initialized respectively to p3, p2, p1, p0, q0, q1, q2, and q3.
The filtering process for the block boundary is the same between the horizontal direction and the vertical direction. Accordingly, the following description of the filtering process for the block boundary is made without particularly distinguishing between the horizontal direction and the vertical direction. FIG. 24 shows an internal structure of the distortion removal filter unit 108a. 
In the distortion removal filter unit 108a shown in FIG. 24, first a block boundary strength determination unit 1081 determines a block boundary strength bS (0≦bS≦4) based on auxiliary information of an adjacent block, with reference to 8.7 Deblocking filter process in NPL 1. FIG. 25 is a flowchart showing a process of determining bS.
In the case where either the pixel p0 or the pixel q0 at the block boundary is a pixel of an intra MB (step S101), the block boundary strength determination unit 1081 determines whether or not the pixel p0 and the pixel q0 are pixels on both sides of an MB boundary (step S102). In the case where the pixel p0 and the pixel q0 are the pixels on both sides of the MB boundary, the block boundary strength determination unit 1081 determines bS as 4. In the case where the pixel p0 and the pixel q0 are not the pixels on both sides of the MB boundary, the block boundary strength determination unit 1081 determines bS as 3.
In the case where neither the pixel p0 nor the pixel q0 is a pixel of an intra MB, the block boundary strength determination unit 1081 determines whether or not a quantization index is present in either of the blocks to which the pixel p0 and the pixel q0 respectively belong (step S103). In the case where a quantization index is present in either of these blocks, the block boundary strength determination unit 1081 determines bS as 2. In the case where no quantization index is present in either of these blocks, the block boundary strength determination unit 1081 determines whether or not inter-frame prediction is discontinuous between the pixel p0 and the pixel q0 (step S104). In the case where the inter-frame prediction is discontinuous, the block boundary strength determination unit 1081 determines bS as 1. In the case where the inter-frame prediction is not discontinuous, the block boundary strength determination unit 1081 determines bS as 0.
The process of determining bS is described in more detail in 8.7.2 Filtering process for a set of samples across a horizontal or vertical block edge in NPL 1.
When bS is larger, it is determined that the block boundary has a larger amount of change, and stronger filtering is applied. No filtering is applied when bS=0.
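The determination of bS in steps S101 to S104 can be sketched as a single function. The boolean parameters are hypothetical names for the conditions tested in each step; how they are derived from the auxiliary information is not shown.

```python
def boundary_strength(p_is_intra, q_is_intra, on_mb_boundary,
                      p_has_index, q_has_index, prediction_discontinuous):
    """Return the block boundary strength bS (0 to 4)."""
    # Step S101: is p0 or q0 a pixel of an intra MB?
    if p_is_intra or q_is_intra:
        # Step S102: are p0 and q0 on both sides of an MB boundary?
        return 4 if on_mb_boundary else 3
    # Step S103: is a quantization index present in either block?
    if p_has_index or q_has_index:
        return 2
    # Step S104: is inter-frame prediction discontinuous between p0 and q0?
    return 1 if prediction_discontinuous else 0
```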
The following describes the filtering process of NPL 2, which is based on NPL 1 and uses pseudo random noise, for block boundaries with bS>0, separately for the case where bS=4 and the case where bS<4.
In the case where bS=4, for each edge of pos (0≦pos≦16) of a row (in horizontal filtering) or a column (in vertical filtering) of the block boundary to be processed, an edge determination unit 1082 determines an edge where |p0−q0|<α/4 and |p1−p0|<β, as an edge to be filtered. A filter unit 1083 calculates P0, P1, and P2 by the following equations that use pseudo random noise ditherP[pos] (1≦ditherP[pos]≦7) corresponding to pos.
P0=(p2+2*p1+2*p0+2*q0+q1+ditherP[pos])/8  (1)
P1=(p3+2*p2+2*p1+2*p0+q0+ditherP[pos])/8  (2)
P2=(2*p3+3*p2+p1+p0+q0+ditherP[pos])/8  (3)
Here, α and β are each a parameter that is larger when a quantization parameter Q is larger, and pos is a position corresponding to coordinates of the block position to be processed.
Likewise, in the case where bS=4, for each edge of pos (0≦pos≦16) of a row (in horizontal filtering) or a column (in vertical filtering) of the block boundary to be processed, the edge determination unit 1082 determines an edge where |p0−q0|<α/4 and |q1−q0|<β, as an edge to be filtered. The filter unit 1083 calculates Q0, Q1, and Q2 by the following equations that use pseudo random noise ditherQ[pos] (1≦ditherQ[pos]≦7) corresponding to pos.
Q0=(q2+2*q1+2*q0+2*p0+p1+ditherQ[pos])/8  (4)
Q1=(q3+2*q2+2*q1+2*q0+p0+ditherQ[pos])/8  (5)
Q2=(2*q3+3*q2+q1+q0+p0+ditherQ[pos])/8  (6)
By injecting pseudo random noise to the block boundary as shown by Equations (1) to (6), not only block distortions are removed but also banding distortions are made visually unnoticeable.
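Equations (1) to (6) can be sketched for one edge position as follows. The function name and argument layout are hypothetical, the division by 8 is assumed to be an integer division, and the dither values are passed in directly rather than generated, since the noise generator is not described here.

```python
def filter_bs4(p, q, dither_p, dither_q, alpha, beta):
    """Apply Equations (1) to (6) at one edge position.

    p = [p0, p1, p2, p3] and q = [q0, q1, q2, q3] are pre-filtering pixels.
    dither_p and dither_q are the noise values for this position (1 to 7).
    Returns ([P0, P1, P2, P3], [Q0, Q1, Q2, Q3]).
    Assumption: '/8' in the equations is an integer division.
    """
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    P, Q = list(p), list(q)  # outputs are initialized to the inputs
    if abs(p0 - q0) < alpha / 4 and abs(p1 - p0) < beta:
        P[0] = (p2 + 2 * p1 + 2 * p0 + 2 * q0 + q1 + dither_p) // 8   # (1)
        P[1] = (p3 + 2 * p2 + 2 * p1 + 2 * p0 + q0 + dither_p) // 8   # (2)
        P[2] = (2 * p3 + 3 * p2 + p1 + p0 + q0 + dither_p) // 8       # (3)
    if abs(p0 - q0) < alpha / 4 and abs(q1 - q0) < beta:
        Q[0] = (q2 + 2 * q1 + 2 * q0 + 2 * p0 + p1 + dither_q) // 8   # (4)
        Q[1] = (q3 + 2 * q2 + 2 * q1 + 2 * q0 + p0 + dither_q) // 8   # (5)
        Q[2] = (2 * q3 + 3 * q2 + q1 + q0 + p0 + dither_q) // 8       # (6)
    return P, Q
```

In each equation the pixel weights sum to 8, so the dither term plays the role of a rounding offset that varies with pos instead of being fixed.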
In the case where bS<4, for each edge of pos (0≦pos≦16) of a row (in horizontal filtering) or a column (in vertical filtering) of the block boundary to be processed, the edge determination unit 1082 determines an edge where |p0−p2|<β, as an edge to be filtered. The filter unit 1083 calculates P0 by the following equation.
P0=p0+Clip3{−tc,tc,(2*(q0−p0)+p1−q1+4)/8}  (7)
Here, tc is a parameter that is larger when bS and the quantization parameter Q are larger.
Likewise, in the case where bS<4, for each edge of pos (0≦pos≦16) of a row (in horizontal filtering) or a column (in vertical filtering) of the block boundary to be processed, the edge determination unit 1082 determines an edge where |q0−q2|<β, as an edge to be filtered. The filter unit 1083 calculates Q0 by the following equation.
Q0=q0−Clip3{−tc,tc,(2*(q0−p0)+p1−q1+4)/8}  (8)
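Equations (7) and (8) can be sketched in the same manner. The function names are hypothetical, and the division by 8 is assumed to be an arithmetic right shift by 3, which rounds toward negative infinity for negative values.

```python
def clip3(low, high, value):
    """Clip value to the range [low, high]."""
    return max(low, min(high, value))


def filter_bs_below_4(p, q, tc, beta):
    """Apply Equations (7) and (8) at one edge position.

    p = [p0, p1, p2, p3] and q = [q0, q1, q2, q3] are pre-filtering pixels.
    Returns (P0, Q0).
    Assumption: '/8' is implemented as an arithmetic right shift by 3.
    """
    p0, p1, p2, _ = p
    q0, q1, q2, _ = q
    delta = clip3(-tc, tc, (2 * (q0 - p0) + p1 - q1 + 4) >> 3)
    P0 = p0 + delta if abs(p0 - p2) < beta else p0   # Equation (7)
    Q0 = q0 - delta if abs(q0 - q2) < beta else q0   # Equation (8)
    return P0, Q0
```

Because the correction is clipped to ±tc, a large step across the boundary is only partially smoothed, whereas the bS=4 filter replaces up to three pixels on each side.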
The decode picture buffer 109 stores a distortion-removed reconstructed image picture supplied from the distortion removal filter unit 108a, from which block distortions and banding distortions have been removed, as a reference image picture. An image of the reference image picture is used as a reference image for creating the inter-frame prediction signal.
The video encoding device shown in FIG. 20 creates the bitstream through the processing described above.