The present invention relates to a video signal encoding method and system, and in particular to a video signal encoding method and system with motion compensated prediction.
A high-efficiency encoding system for use in encoding video signals employs a hybrid encoding system combining inter-picture prediction encoding utilizing motion compensation and intra-picture encoding.
FIG. 1 is a block diagram showing an encoding system utilizing a conventional hybrid encoding method described in ISO-IEC/JTC1/SC29/WG11 MPEG 92/N0245 Test Model 2. As illustrated, a digital video signal 101 received at an input terminal 1 is supplied to a first input of a subtractor 10, a first input of a motion compensated prediction circuit 17, and a second input of a quantizer 12. The output of the subtractor 10 is supplied to a DCT (discrete cosine transform) circuit 11, whose output is supplied to a first input of the quantizer 12. The output 102 of the quantizer 12 is supplied to a first input of a variable-length encoder 19 and to an inverse quantizer 13. The output of the inverse quantizer 13 is supplied to an IDCT (inverse discrete cosine transform) circuit 14, whose output is supplied to a first input of an adder 15. The output of the adder 15 is supplied to a memory 16, and data (reference image signal 103) read from the memory 16 is supplied to a second input of the motion compensated prediction circuit 17 and a first input of a selector 18. A first output 104 of the motion compensated prediction circuit 17 is supplied to the memory 16.
A zero signal (data representing a value “0”) is supplied to a second input of the selector 18, and a second output 105 of the motion compensated prediction circuit 17 is supplied to a third input of the selector 18. The output 106 of the selector 18 is supplied to a second input of the subtractor 10 and a second input of the adder 15. A third output 107 of the motion compensated prediction circuit 17 is supplied to a second input of the variable-length encoder 19. The output of the variable-length encoder 19 is input to a transmitting buffer 20, and a first output of the transmitting buffer 20 is output via an output terminal 2. A second output 108 of the transmitting buffer 20 is supplied to a third input of the quantizer 12.
FIG. 2 is a block diagram showing an example of configuration of a conventional motion compensated prediction circuit 17. The digital video signal 101 is supplied to a first input of a motion vector search circuit 3a. A reference image signal 103 input from the memory 16 is supplied to a second input of the motion vector search circuit 3a. The motion vector 109 output from the motion vector search circuit 3a is supplied to a first input of a selector 4a. A zero vector (“0”) is supplied to a second input of the selector 4a.
The prediction image 110 output from the motion vector search circuit 3a is supplied to a first input of a distortion calculator 5a. Applied to a second input of the distortion calculator 5a is the video signal 101 from the input terminal 1. A distortion output 111 from the distortion calculator 5a is supplied to a first input of a comparing and selecting circuit 7a. 
The video signal 101 is also supplied to a first input of a distortion calculator 5b. The reference image signal 103 is also supplied to a second input of the distortion calculator 5b. A distortion output 112 from the distortion calculator 5b is supplied to a second input of the comparing and selecting circuit 7a. A selection mode 113 output from the comparing and selecting circuit 7a is supplied to a first input of a comparing and selecting circuit 7b. A distortion output 114 from the comparing and selecting circuit 7a is supplied to a second input of the comparing and selecting circuit 7b. 
The selection mode output 113 from the comparing and selecting circuit 7a is also supplied to a third input of the selector 4a. A motion vector 107 output from the selector 4a is supplied to the variable-length encoder 19.
The prediction image 110 output from the motion vector search circuit 3a is supplied to a first input of a selector 4b. The reference image signal 103 from the input terminal 1a is also supplied to a second input of the selector 4b. The selection mode 113 from the comparing and selecting circuit 7a is supplied to a third input of the selector 4b. 
The prediction image 104 from the selector 4b is supplied to the memory 16. The video signal 101 from the input terminal 1 is also input to a variance calculator 9. An output 115 of the variance calculator 9 is supplied to a third input of the comparing and selecting circuit 7b. The selection mode 105 from the comparing and selecting circuit 7b is supplied to the selector 18.
The operation is described next. The digital input signal 101 is supplied to the subtractor 10, where a difference between the input picture (frame or field) and the picture from the motion compensated prediction circuit 17 is taken to reduce the temporal redundancy (redundancy in the direction of the time axis), and DCT is performed in the directions of the spatial axes. The coefficients obtained are quantized and variable-length encoded, and then transmitted via the transmitting buffer 20.
Motion compensated prediction is schematically illustrated in FIG. 3. The picture that is to be encoded is divided into matching blocks each consisting of 16 pixels by 16 lines. For each matching block, examination is made as to which part of the reference picture, if used as a prediction image, minimizes the distortion. For instance, in the case of a still picture, if the 16 pixels by 16 lines at the same position as the matching block are used as the prediction image, the distortion will be zero. In the case of a motion picture, it may be that the block shifted leftward by 8 pixels and downward by 17 lines for instance yields the minimum distortion. Then, this block at the shifted position is regarded as a block corresponding to the matching block in question, and used as the prediction image, and (−8, 17) is transmitted as the motion vector.
The motion compensated prediction is further explained with reference to FIG. 2. First, in the motion vector search circuit 3a, the motion vector is determined on the basis of the input image 101 and the reference image 103. This is effected by finding a block in the reference picture which minimizes the distortion for each matching block, as explained in connection with FIG. 3. The block thus found to give the minimum distortion is used as the prediction image, and its position relative to the matching block is used as the motion vector. The distortion may be defined in terms of the sum of the absolute values of the differences.
In the distortion calculator 5a, the distortion defined as the sum of the squares of the differences between the input image 101 and the prediction image 110 output from the motion vector search circuit 3a is calculated for each matching block. The distortion 111 is also denoted by SEmc. In the distortion calculator 5b, the distortion defined as the sum of the squares of the differences between the input image 101 and the reference image 103 (of the same position) is calculated for each matching block. This distortion 112 is also denoted by SEnomc. The SEnomc is a particular value of the distortion SEmc where the vector representing the relative position between the input image 101 and the prediction image is zero.
For the purpose of the following explanation, it is assumed that the whole picture consists of I pixels by J lines, and the input picture is represented by F(i,j) where i represents the pixel number in the horizontal direction and 0≤i<I, and j represents the pixel number in the vertical direction and 0≤j<J. The matching blocks are so defined as not to overlap each other. Then, each matching block is represented by F(n*16+i, m*16+j) where 0≤i≤15, and 0≤j≤15, and (n, m) represent the position of the matching block ((n*16, m*16) represents the left, upper corner of the matching block). The (n, m)-th matching block is denoted by:
M(i,j)=F(n*16+i,m*16+j) (0≤i≤15, 0≤j≤15)  (F1)
If the reference image is represented by G(i,j) (0≤i<I, 0≤j<J), and the vector between the input image and the reference image is represented by (H,V), the prediction image PH,V(i,j) is given by:
PH,V(i,j)=G(n*16+i+H,m*16+j+V)  (F2)
The distortion S is evaluated using the following evaluation function:
$$S = \sum_{i=0}^{15} \sum_{j=0}^{15} \left| M(i,j) - P_{H,V}(i,j) \right| \qquad \text{(F3)}$$
The motion vector search circuit 3a finds a vector (H,V) which minimizes the distortion S given by the above evaluation function (F3), regards this vector (H,V) as the motion vector, and outputs this motion vector (H,V) and the prediction image PH,V(i,j).
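By way of illustration only, the search defined by (F1) to (F3) may be sketched in Python as follows. The function names `sad` and `full_search`, the search window of ±4 pixels and lines, the 48 by 48 synthetic pictures, and the skipping of candidates that fall outside the reference picture are assumptions made for this sketch and are not taken from the cited Test Model.

```python
import random


def sad(block, ref, x0, y0, h, v, size=16):
    """Distortion S of (F3) for the displacement (h, v).

    block[j][i] is M(i, j), ref[y][x] is G(x, y), (x0, y0) = (n*16, m*16).
    """
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(block[j][i] - ref[y0 + j + v][x0 + i + h])
    return total


def full_search(cur, ref, n, m, search=4, size=16):
    """Return ((H, V), S) minimizing S over an assumed +/-search window."""
    height, width = len(ref), len(ref[0])
    x0, y0 = n * size, m * size
    block = [row[x0:x0 + size] for row in cur[y0:y0 + size]]
    best = None
    for v in range(-search, search + 1):
        for h in range(-search, search + 1):
            # Skip candidates that would read outside the reference picture.
            if x0 + h < 0 or x0 + h + size > width:
                continue
            if y0 + v < 0 or y0 + v + size > height:
                continue
            s = sad(block, ref, x0, y0, h, v, size)
            if best is None or s < best[1]:
                best = ((h, v), s)
    return best


# Synthetic test pictures: the current picture equals the reference picture
# displaced by (H, V) = (2, 1), so that block (1, 1) matches exactly there.
rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(48)] for _ in range(48)]
cur = [[ref[y + 1][x + 2] if y + 1 < 48 and x + 2 < 48 else 0
        for x in range(48)] for y in range(48)]
vector, distortion = full_search(cur, ref, 1, 1)
```

The exhaustive loop makes the cost of a wide search range visible: the number of candidate vectors grows with the product of the horizontal and vertical ranges.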
When SEmc < SEnomc, the comparing and selecting circuit 7a outputs a signal 113 indicating motion compensation (MC) mode and the distortion SEmc (111). When SEmc ≥ SEnomc, the comparing and selecting circuit 7a outputs a signal 113 indicating no motion compensation (NoMC) mode and the distortion SEnomc (112). When the mode selected by the comparing and selecting circuit 7a is the MC mode, the selector 4a outputs the motion vector 109 selected by the motion vector search circuit 3a, and the selector 4b selects the prediction image 110 selected by the motion vector search circuit 3a.
When the mode selected by the comparing and selecting circuit 7a is the NoMC mode, the selector 4a outputs the zero vector, and the selector 4b selects the reference image 103.
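The selection between the MC mode and the NoMC mode described above may be sketched as follows. The function names `squared_error` and `select_mc_mode`, and the representation of blocks as nested lists, are assumptions made for this illustration only.

```python
def squared_error(block, pred):
    """Sum of the squares of the differences between two equally sized blocks."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(block, pred)
               for a, b in zip(row_a, row_b))


def select_mc_mode(block, mc_pred, co_located, motion_vector):
    """Mimic the comparing and selecting circuit 7a and the selectors 4a, 4b.

    mc_pred is the prediction image found by the vector search and
    co_located is the reference block at the same position (zero vector).
    Returns (mode, vector, prediction, distortion).
    """
    se_mc = squared_error(block, mc_pred)        # SEmc
    se_nomc = squared_error(block, co_located)   # SEnomc
    if se_mc < se_nomc:
        return "MC", motion_vector, mc_pred, se_mc
    # SEmc >= SEnomc: the zero vector and the co-located block are used.
    return "NoMC", (0, 0), co_located, se_nomc
```

Note that a tie is resolved in favor of the NoMC mode, in line with the condition SEmc ≥ SEnomc stated above.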
The variance calculator 9 calculates the variance of each matching block of the input image signal 101. The comparing and selecting circuit 7b compares the distortion 114 from the comparing and selecting circuit 7a and the variance 115 from the variance calculator 9, and selects the intra mode for intra-picture encoding, or a selection mode output from the comparing and selecting circuit 7a. 
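The role of the variance calculator 9 and the comparing and selecting circuit 7b may be sketched as follows. The text does not state the comparison rule, so the sketch assumes, purely for illustration, that the intra mode is chosen when the sum of squared deviations of the block from its mean is smaller than the distortion 114; the actual rule of the Test Model may differ. The names `block_activity` and `select_final_mode` are likewise hypothetical.

```python
def block_activity(block):
    """Sum of squared deviations of the block pixels from their mean.

    This is the variance multiplied by the number of pixels, which keeps it
    on the same scale as a sum of squared prediction errors.
    """
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels)


def select_final_mode(block, mode_7a, distortion_7a):
    """Choose between the intra mode and the mode chosen by circuit 7a.

    ASSUMED rule (not stated in the text): intra coding is selected when the
    block activity is smaller than the inter-picture prediction distortion.
    """
    if block_activity(block) < distortion_7a:
        return "INTRA"
    return mode_7a
```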
The motion vector output from the motion compensated prediction circuit 17 is encoded at the variable-length encoder 19, an example of which is shown in FIG. 4. Referring to FIG. 4, the motion vector 107 output from the motion compensated prediction circuit 17 is supplied to a first input of a subtractor 30. An output of the subtractor 30 is input to the variable-length code selector 31, and supplied via a memory 32 to a first input of a selector 33. Applied to a second input of the selector 33 is a zero vector. The output 102 of the quantizer 12 is variable-length-encoded at an encoder 34. An output of the variable-length code selector 31 and an output of the encoder 34 are multiplexed at a multiplexer 35, and supplied to the transmitting buffer 20.
As shown in FIG. 4, a difference between the motion vector for each matching block and the motion vector for the preceding matching block is determined at the subtractor 30, and the variable-length code for the difference vector is output. When the current matching block is in the intra mode or the NoMC mode, the motion vector is not encoded. When the preceding matching block is in the intra mode or the NoMC mode, or in the initial state of the encoding, the zero vector is used in place of the preceding motion vector. The variable-length code representing the difference vector is assigned a shorter code when it is closer to the zero vector.
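The differential treatment of the motion vectors described above may be sketched as follows. The function name `difference_vectors` and the representation of each matching block as a (mode, vector) pair are assumptions made for this illustration; the variable-length code table itself is not reproduced.

```python
def difference_vectors(blocks):
    """Compute the difference vectors handed to the variable-length coder.

    blocks is a list of (mode, vector) pairs in encoding order, with mode
    one of "MC", "NoMC" or "INTRA". The result holds one entry per block:
    a difference vector for MC blocks, or None when no vector is encoded.
    """
    previous = (0, 0)  # initial state of the encoding: zero vector
    result = []
    for mode, vector in blocks:
        if mode == "MC":
            result.append((vector[0] - previous[0], vector[1] - previous[1]))
            previous = vector
        else:
            # Intra or NoMC block: no vector is encoded, and the zero vector
            # takes the place of the preceding vector for the next block.
            result.append(None)
            previous = (0, 0)
    return result
```

For example, two successive MC blocks with vectors (3, 1) and (4, 1) yield the differences (3, 1) and (1, 0), the second of which is close to the zero vector and therefore receives a short code.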
In the conventional motion compensated prediction for the image signal encoding, transfer efficiency of the motion vector is low. Moreover, the motion vector is selected depending on the magnitude of the predicted distortion, so that when similar patterns are present over a wide area of the picture, or where the picture is featureless and flat, the difference in the predicted distortion may be small, and a block different from a truly corresponding block may erroneously be found as a corresponding block. If a block farther away from the truly corresponding block is found as a corresponding block, an unnecessarily large motion vector is transmitted, and the picture is distorted.
Another problem associated with the conventional system is that the motion vectors for adjacent blocks sometimes differ greatly, causing picture quality degradation. Moreover, the selection of the vector depends on the magnitude of the distortion, and the efficiency of transmission of the motion vectors is low.
A further problem associated with the conventional system is that if the range of motion vector search is expanded, the amount of information of the codes of the vectors is increased. If, on the other hand, the range of the motion vector search is narrowed, rapid motion cannot be traced.
FIG. 5 is another way of presenting the conventional image signal encoding system shown in the previously mentioned publication, ISO-IEC/JTC1/SC29/WG11 MPEG 92/N0245 Test Model 2. Reference numerals identical to those in FIG. 1 denote identical or corresponding elements. The memory 16 and the selector 18 in FIG. 1 are not shown, but instead a memory 21 is added. The digital video signal 101a received at the input terminal 1 is input to and stored in the memory 21, and the video signal 101b read out of the memory 21 is supplied to the first input of the subtractor 10 and to the motion compensated prediction circuit 17. The output of the motion compensated prediction circuit 17 is supplied to the second input of the subtractor 10, and to the second input of the adder 15. The rest of the configuration is similar to that of FIG. 1.
FIG. 6 is a schematic diagram showing the concept of motion compensated prediction in the prior art image signal encoding system. FIG. 7 is a schematic diagram showing the operation of the memory 21.
FIG. 8 shows an example of the motion compensated prediction circuit 17 used in the system of FIG. 5. The output 103 of the adder 15 (FIG. 5) is supplied via an input terminal 21a to a switching circuit 23. A first output of the switching circuit 23 is supplied to a first frame memory 24a. A second output of the switching circuit 23 is supplied to a second frame memory 24b. Reference images stored in and read out from the frame memories 24a and 24b are respectively supplied to first inputs of motion vector detectors 25a and 25b. The reference image from the memory 21 is supplied via a second input terminal 21b to second inputs of the motion vector detectors 25a and 25b. Outputs of the motion vector detectors 25a and 25b are supplied to first and second inputs of a prediction mode selector 26. The reference image 101b from the memory 21 is supplied to a third input of the prediction mode selector 26. A first output of the prediction mode selector 26 is input to a first input of a selector 27, a zero vector (“0”) is supplied to a second input of the selector 27, and a second output of the prediction mode selector 26 is supplied to a third input of the selector 27. An output of the selector 27 is output via the output terminal 106.
Referring now to FIG. 6, the pictures are classified into an intra-picture encoded picture (called an I-picture), a one-way predictive-encoded picture (called a P-picture), and a bi-directionally predictive-encoded picture (called a B-picture). For instance, let us assume that it is desired that one out of every N pictures is an I-picture, and one out of every M pictures is a P-picture or an I-picture. If n and m are integers, and 1≤m≤N/M, then (N*n+M)-th pictures are made to be I-pictures, (N*n+M*m)-th pictures (m≠1) are made to be P-pictures, and (N*n+M*m+1)-th to (N*n+M*m+M−1)-th pictures are made to be B-pictures. An assembly of the (N*n+1)-th to (N*n+N)-th pictures is called a group of pictures or a GOP. FIG. 6 shows the case where N=15, and M=3.
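The assignment of picture types stated above may be expressed as a small function. The name `picture_type` is hypothetical, and the sketch assumes that N is a multiple of M, as in the case N=15, M=3 of FIG. 6.

```python
def picture_type(k, N=15, M=3):
    """Return "I", "P" or "B" for the k-th picture.

    (N*n + M)-th pictures are I-pictures, (N*n + M*m)-th pictures with
    m != 1 are P-pictures, and all remaining pictures are B-pictures.
    Assumes that N is a multiple of M.
    """
    if N % M != 0:
        raise ValueError("this sketch assumes that N is a multiple of M")
    r = k % N
    if r == M % N:
        return "I"
    if r % M == 0:
        return "P"
    return "B"
```

With the default values, the pictures numbered 0 to 18 are classified as P, B, B, I, B, B, P, B, B, P, B, B, P, B, B, P, B, B, I, so that the third picture is an I-picture and the sixth and ninth pictures are P-pictures.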
With respect to the I-pictures, intra-picture encoding, without inter-picture prediction, is conducted. With respect to P-pictures, prediction from an immediately preceding I- or P-picture is conducted. For instance, the sixth picture in FIG. 6 is a P-picture, and is predicted from the third, I-picture. The ninth, P-picture is predicted from the sixth, P-picture. With respect to the B-pictures, prediction from both the preceding and succeeding I- and P-pictures is conducted. For instance, the fourth and fifth, B-pictures are predicted from the third, I-picture and the sixth, P-picture. Accordingly, the fourth and fifth pictures are encoded after the sixth picture is encoded.
Next, the operation of the encoding system, shown in FIG. 5, using the hybrid encoding method will be described.
The input digital image signal input via the input terminal 1 is input to the memory 21, rearranged into the order of the encoding, and output, as shown in FIG. 7, in which “OI” indicates the order of input, while “OE” indicates the order of encoding. The order of the image signals is changed from that shown at the top of FIG. 7 into that shown at the bottom of FIG. 7. This is because the first, B-picture in FIG. 6, for instance, cannot be encoded until after the third, I-picture is encoded, as described above.
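The rearrangement performed in the memory 21 may be sketched as follows. FIG. 7 is not reproduced here, so the sketch assumes the simplest rule consistent with the description: each B-picture is held back until the next I- or P-picture has been output. The function name `encoding_order` is hypothetical.

```python
def encoding_order(types):
    """Reorder pictures from the order of input into the order of encoding.

    types[k] is "I", "P" or "B" for the picture at input position k + 1.
    Returns (ordered, waiting): the input positions in encoding order, and
    any trailing B-pictures still waiting for a later I- or P-picture.
    """
    ordered, waiting = [], []
    for position, ptype in enumerate(types, start=1):
        if ptype == "B":
            # A B-picture needs the succeeding I- or P-picture first.
            waiting.append(position)
        else:
            ordered.append(position)
            ordered.extend(waiting)
            waiting = []
    return ordered, waiting
```

For the first nine pictures of FIG. 6 (B, B, I, B, B, P, B, B, P), this yields the encoding order 3, 1, 2, 6, 4, 5, 9, 7, 8.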
The image signals 101b output from the memory 21 are supplied to the subtractor 10, where the difference between each image signal 101b and the prediction picture 106 from the motion compensated prediction circuit 17 is obtained, and the difference is subjected to DCT (discrete cosine transform) at the DCT circuit 11 in the directions of the spatial axes. The coefficients obtained by the DCT are quantized at the quantizer 12, and are then variable-length-encoded at the variable-length encoder 19, and output via the transmitting buffer 20.
The quantized transform coefficients are inverse-quantized at the inverse-quantizer 13, and are subjected to IDCT (inverse DCT) at the IDCT circuit 14, and are then added at the adder 15 to the prediction image 106 to produce a decoded image 103. The decoded image 103 is input to the motion compensated prediction circuit 17, for the purpose of encoding the next image.
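The path from the subtractor 10 through the DCT circuit 11, the quantizer 12, the inverse quantizer 13, the IDCT circuit 14 and the adder 15 may be sketched as follows. The 8 by 8 transform block size, the uniform quantizer with a single step size `step`, and all function names are assumptions made for this illustration; the control of the quantizer by the transmitting buffer 20 is omitted.

```python
import math

SIZE = 8  # assumed transform block size; the text does not state it

# Orthonormal DCT-II basis matrix: C[k][x].
C = [[(math.sqrt(1 / SIZE) if k == 0 else math.sqrt(2 / SIZE))
      * math.cos((2 * x + 1) * k * math.pi / (2 * SIZE))
      for x in range(SIZE)] for k in range(SIZE)]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[r][t] * b[t][c] for t in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dct2(block):
    """Two-dimensional DCT: C * block * C^T."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coef):
    """Two-dimensional inverse DCT: C^T * coef * C."""
    return matmul(matmul(transpose(C), coef), C)


def encode_block(block, pred, step):
    """Return (levels, decoded) for one block.

    levels are the quantized coefficients passed to the variable-length
    encoder; decoded is the locally decoded block produced by the adder.
    """
    residual = [[b - p for b, p in zip(rb, rp)]
                for rb, rp in zip(block, pred)]               # subtractor 10
    levels = [[round(c / step) for c in row]
              for row in dct2(residual)]                      # DCT 11, quantizer 12
    dequantized = [[lv * step for lv in row] for row in levels]  # inverse quantizer 13
    restored = idct2(dequantized)                             # IDCT 14
    decoded = [[p + r for p, r in zip(rp, rr)]
               for rp, rr in zip(pred, restored)]             # adder 15
    return levels, decoded
```

Because the decoded image is built from the quantized coefficients rather than from the input picture, the encoder forms its next prediction from the same data that a decoder would have available.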
The operation of the motion compensated prediction circuit 17 will next be described with reference to FIG. 8. The motion compensated prediction circuit 17 uses two reference images stored in the frame memories 24a and 24b, to perform motion compensated prediction using the image signal 101b, to produce the prediction image 106.
First, where the decoded image 103 is an I- or P-picture, the image 103 is written in the frame memory 24a or 24b for the encoding of the next picture. Whichever of the frame memories 24a and 24b was updated earlier is selected by the switching circuit 23 for the writing of the newly input image 103. This means the frame memories 24a and 24b are selected alternately when a newly input image 103 is to be written. With such alternate selection, when the first and second, B-pictures in FIG. 6 are to be encoded, the zero-th, P-picture and the third, I-picture are stored in the frame memories 24a and 24b, respectively. When the sixth, P-picture is encoded and decoded, the frame memory 24a is updated with the decoded sixth, P-picture. Accordingly, when the fourth and fifth, B-pictures are to be encoded, the sixth, P-picture and third, I-picture are stored in the frame memories 24a and 24b, respectively. When the ninth, P-picture is encoded and decoded, the frame memory 24b is updated with the decoded ninth, P-picture. Accordingly, when the seventh and eighth, B-pictures are to be encoded, the sixth and ninth, P-pictures are stored in the frame memories 24a and 24b, respectively.
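The alternate updating of the frame memories 24a and 24b may be traced with the following sketch. The function name `simulate_frame_memories` is hypothetical, and the sketch assumes that the frame memory 24a is the first to be written at start-up.

```python
def simulate_frame_memories(encoding_sequence):
    """Trace the contents of the frame memories 24a and 24b.

    encoding_sequence is a list of (picture_number, picture_type) pairs in
    the order of encoding. Returns, for each B-picture, a tuple
    (picture_number, content_of_24a, content_of_24b).
    """
    contents = {"24a": None, "24b": None}
    # ASSUMED start-up state: 24a counts as the memory updated earlier.
    written_at = {"24a": -2, "24b": -1}
    log = []
    for t, (number, ptype) in enumerate(encoding_sequence):
        if ptype == "B":
            # B-pictures are not stored; record what they are predicted from.
            log.append((number, contents["24a"], contents["24b"]))
        else:
            # Overwrite whichever memory was updated earlier.
            target = min(written_at, key=written_at.get)
            contents[target] = number
            written_at[target] = t
    return log


sequence = [(0, "P"), (3, "I"), (1, "B"), (2, "B"), (6, "P"),
            (4, "B"), (5, "B"), (9, "P"), (7, "B"), (8, "B")]
trace = simulate_frame_memories(sequence)
```

For the sequence shown, the trace reproduces the states described above: pictures 1 and 2 see (0, 3), pictures 4 and 5 see (6, 3), and pictures 7 and 8 see (6, 9).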
When the image signal 101b output from the memory 21 is input to the motion compensated prediction circuit 17, the two motion vector detectors 25a and 25b detect the motion vectors using the reference pictures stored in the frame memories 24a and 24b, and output the motion compensated prediction pictures.
That is, the image signal 101b for one picture is divided into a plurality of blocks, and for each block, the reference block which minimizes the prediction distortion is selected, the relative position of the selected block is output as the motion vector, and the selected block is output as the motion compensated prediction image. The prediction mode selector 26 selects, from among the two motion compensated prediction images from the motion vector detectors 25a and 25b and the average image thereof, the one which gives the minimum prediction distortion, and outputs the selected image as the prediction image. If the image signal 101b is an I-picture or a P-picture, the motion compensated prediction image within the reference picture input earlier is selected and output. That is, where the image signal 101b is an I-picture or a P-picture, and the reference image stored in the frame memory 24b is earlier than the reference image stored in the frame memory 24a, the motion compensated prediction image from the motion vector detector 25b is selected and output. If the reference image stored in the frame memory 24a is earlier than the reference image stored in the frame memory 24b, the motion compensated prediction image from the motion vector detector 25a is selected and output.
The prediction mode selector 26 also selects whichever of the intra-picture encoding (which does not use prediction) and the inter-picture prediction encoding using the selected prediction image yields a higher encoding efficiency. If the image signal 101b is an I-picture, the intra-picture encoding is always selected. When the intra-picture encoding is selected, a signal indicating the intra-picture encoding is output as the prediction mode signal. When the inter-picture encoding is selected, a signal indicating the selected prediction image is output as the prediction mode signal. When the prediction mode output from the prediction mode selector 26 is an intra-picture encoding mode, the selector 27 outputs a zero signal (“0”). Otherwise, the selector 27 outputs the prediction image from the prediction mode selector 26.
Thus, it will be understood that when the image signal 101b output from the memory 21 is an I-picture, the motion compensated prediction circuit 17 outputs a zero signal as the prediction image 106, so that no inter-picture prediction is performed for the I-picture and intra-picture conversion encoding is conducted. When the image signal 101b output from the memory 21 is the sixth, P-picture, in FIG. 6, the motion compensated prediction circuit 17 performs motion compensated prediction from the third, I-picture in FIG. 6 to produce the prediction image 106. When the image signal 101b output from the memory 21 is the fourth, B-picture in FIG. 6, the motion compensated prediction circuit 17 performs motion compensated prediction from the third, I-picture and the sixth, P-picture in FIG. 6, to produce the prediction image 106.
Since the conventional image signal encoding system is configured as described above, even if the motion is 30 pixels per frame, with a P-picture interval M of three the motion vector spans 90 pixels, and the motion vector search range must be wide. That is, the temporal distance between the pictures, in particular for the P-picture prediction, is long, so that the motion vector range must be wide, the hardware size is therefore large, and the amount of information of the motion vector codes is large. If the motion vector search range is narrow, the correct motion vector cannot be found and the prediction efficiency is low, so that the amount of information of the codes is enlarged or the picture quality is degraded.
Moreover, the conventional image signal encoding system configured as described above does not take account of scene changes. If a scene change occurs at a P-picture or a B-picture, the motion compensated prediction has no effect, so that the amount of information of the codes is enlarged or the picture quality is degraded.
Further problems of the prior art system will next be described. Let the input image signal 101b be represented by F(i,j), with i representing the pixel number in the horizontal direction and j representing the pixel number in the vertical direction, and let the reference picture stored in the frame memory 24a be represented by G(i,j). The whole picture is divided into blocks Bn,m(i,j), each including 16 pixels in the horizontal direction by 16 lines in the vertical direction, with n=0, 1, 2, . . . indicating the position of the block in the horizontal direction, and m=0, 1, 2, . . . indicating the position of the block in the vertical direction, and 0≤i≤15, and 0≤j≤15. The block is represented by:
Bn,m(i,j)=F(n*16+i, m*16+j)
For each block, one of the reference blocks which minimizes the prediction distortion is selected by means of block matching, and the relative position of the selected reference block is output as representing the motion vector, and the block is output as the motion compensated prediction image.
When the input image signal 101 is an interlace signal, and each frame is treated as one picture, the block matching is conducted for each frame and for each field, and the result of the block matching which yields a smaller prediction distortion is selected. When the block matching is conducted for each frame, the prediction distortion E0(Vh,Vv) for the vector (Vh,Vv) is calculated by:
$$E0(Vh,Vv) = \sum_{i=0}^{15} \sum_{j=0}^{15} \left| B_{n,m}(i,j) - G(n*16+i+Vh,\; m*16+j+Vv) \right| \qquad \text{(F4)}$$
If the motion vector search range is ±Mh pixels in the horizontal direction and ±Mv lines in the vertical direction, the vector (Vh,Vv)=(Vh0,Vv0) within −Mh≤Vh≤+Mh and −Mv≤Vv≤+Mv that gives the minimum E0(Vh,Vv) is determined, and e0 is defined as E0(Vh0,Vv0).
If the block matching is made for each field, the block Bn,m(i,j) is divided into first and second fields. For the first field of the block Bn,m(i,j), the prediction distortion E1(Vh,Vv,f) (f=0,1) for the vector (Vh,Vv) is calculated by:
$$E1(Vh,Vv,0) = \sum_{i=0}^{15} \sum_{j=0}^{7} \left| B_{n,m}(i,2*j) - G(n*16+i+Vh,\; m*16+2*j+Vv) \right| \qquad \text{(F5)}$$
$$E1(Vh,Vv,1) = \sum_{i=0}^{15} \sum_{j=0}^{7} \left| B_{n,m}(i,2*j) - G(n*16+i+Vh,\; m*16+2*j+1+Vv) \right| \qquad \text{(F6)}$$
If the motion vector search range is ±Nh pixels in the horizontal direction and ±Nv lines in the vertical direction, the vector (Vh,Vv)=(Vh1,Vv1) within −Nh≤Vh≤+Nh and −Nv≤Vv≤+Nv, and f=f1, which in combination give the minimum E1(Vh,Vv,f), are determined, and e1 is defined as E1(Vh1,Vv1,f1). f indicates whether the reference image is of a first field or of a second field.
For the second field of the block Bn,m(i,j), the prediction distortion E2(Vh,Vv,f) (f=0,1) for the vector (Vh,Vv) is calculated by:
$$E2(Vh,Vv,0)=\sum_{i=0}^{15}\sum_{j=0}^{7}\left|B_{n,m}(i,2*j+1)-G(n*16+i+Vh,\;m*16+2*j+Vv)\right| \tag{F7}$$
$$E2(Vh,Vv,1)=\sum_{i=0}^{15}\sum_{j=0}^{7}\left|B_{n,m}(i,2*j+1)-G(n*16+i+Vh,\;m*16+2*j+1+Vv)\right| \tag{F8}$$
The vector (Vh,Vv)=(Vh2,Vv2) and f=f2 giving the minimum E2(Vh,Vv,f) are determined, and e2 is defined as E2(Vh2,Vv2,f2).
Finally, e0 and e1+e2 are compared with each other. If e0 is larger, the two vectors (Vh1,Vv1), (Vh2,Vv2) and f1, f2 indicating the fields, and the corresponding motion compensated prediction images B′n,m(i,j):
B′n,m(i,2*j)=G(n*16+i+Vh1,m*16+2*j+f1+Vv1)
B′n,m(i,2*j+1)=G(n*16+i+Vh2,m*16+2*j+f2+Vv2)
are output.
If e0≤e1+e2, the vector (Vh0,Vv0) and the motion compensated prediction image B′n,m(i,j):
B′n,m(i,j)=G(n*16+i+Vh0,m*16+j+Vv0)
are output.
The operation of the motion vector detector 25b is identical to that of the motion vector detection circuit 25a, except that the reference images used are those stored in the frame memory 24b. 
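The field-based search and the final comparison of e0 with e1+e2 described above may be sketched as follows. This is only an illustrative sketch: the function names, the array layout (block[line][column] for Bn,m and ref[line][column] for G) and the synthetic test picture are assumptions made for the example and do not appear in the conventional system itself. The frame-based distortion e0 is taken as an input because its equation is given separately.

```python
def field_distortion(block, ref, n, m, vh, vv, parity, f):
    """Sum of absolute differences for one field of the 16x16 block B(n,m).

    parity selects the field of the block (0: lines 2*j, 1: lines 2*j+1);
    f selects the field of the reference image, as in (F5) to (F8).
    """
    total = 0
    for i in range(16):          # horizontal position within the block
        for j in range(8):       # line index within one field
            cur = block[2 * j + parity][i]
            pred = ref[m * 16 + 2 * j + f + vv][n * 16 + i + vh]
            total += abs(cur - pred)
    return total


def search_field(block, ref, n, m, parity, nh, nv):
    """Return (minimum distortion, (vh, vv), f) over the +/-nh, +/-nv range."""
    best = None
    for f in (0, 1):
        for vv in range(-nv, nv + 1):
            for vh in range(-nh, nh + 1):
                e = field_distortion(block, ref, n, m, vh, vv, parity, f)
                if best is None or e < best[0]:
                    best = (e, (vh, vv), f)
    return best


def use_field_prediction(e0, e1, e2):
    """Field-based vectors are output only when e0 is larger than e1 + e2."""
    return e0 > e1 + e2


# Synthetic reference picture and a block displaced by (Vh, Vv) = (1, 2).
ref = [[(7 * x + 13 * y) % 251 for x in range(48)] for y in range(48)]
block = [[ref[16 + y + 2][16 + i + 1] for i in range(16)] for y in range(16)]

e1, v1, f1 = search_field(block, ref, 1, 1, 0, 2, 2)   # first field
e2, v2, f2 = search_field(block, ref, 1, 1, 1, 2, 2)   # second field
```

The exhaustive triple loop makes the cost of widening the search range visible: the number of distortion evaluations grows with (2*nh+1)*(2*nv+1) for each field and each reference field.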
Because the conventional image signal encoding system is required to conduct the calculations of the equations (F4) to (F8), when the motion vector search range is widened to cope with quickly moving pictures, the amount of calculation increases, and as a result the size of the hardware has to be increased.
An object of the invention is to solve the above problems.
Another object of the invention is to reduce fluctuation in the motion vector between adjacent blocks.
A further object of the invention is to provide a system in which the range of motion vector search can be easily varied depending on the content of the picture.
Another object of the invention is to provide an image signal encoding system which can provide an adequate motion vector search range for a sequence of pictures with quick motion, without increasing the amount of information of motion vector codes, and with which the efficiency of coding is high.
Another object of the invention is to restrain an increase of the amount of codes, and to perform encoding with a high efficiency, even when a scene change occurs.
Another object of the invention is to enable encoding of quickly moving pictures without increasing the size of the hardware and without degradation of the picture quality.
According to a first aspect of the invention, there is provided an image signal encoding method for encoding an image signal using motion compensation, comprising the steps of:
finding a motion vector by means of a block matching method;
detecting a first distortion SEmc of motion compensated prediction associated with the motion vector;
detecting a second distortion SEnomc of prediction without motion compensation;
using the motion vector for inter-picture prediction encoding when SEnomc>SEmc+K, with K being a constant greater than 0; and
using a vector having a value zero, in place of the motion vector, for inter-picture prediction encoding when SEnomc≤SEmc+K.
The step of finding the motion vector may comprise the steps of:
receiving an input image signal for a first one of pictures in a series of pictures;
providing a reference image signal of a second one of said pictures, said second one of said pictures preceding said first one of said pictures;
dividing said input image signal into matching blocks, each of said matching blocks consisting of signals for pixels adjacent to each other on a display screen;
detecting, for each of said matching blocks in said input image signal, a block in said reference image signal which yields a minimum value of an evaluation function for evaluating the blocks in said reference image signals; and
detecting the motion vector representing a position of said detected block relative to said each of the matching blocks in said input image signal.
When the difference between the distortion of the motion compensated prediction using the motion vector found by the block matching method and the distortion of the prediction obtained without using the motion compensation is small, the motion vector is replaced by a zero vector, and the inter-picture prediction encoding without motion compensation is conducted. As a result, the motion vector need not be transmitted, and the efficiency of transmission of the motion vectors is improved.
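The selection between the detected motion vector and the zero vector may be sketched as follows. The function and parameter names are assumptions introduced for illustration; only the comparison of SEnomc with SEmc+K is taken from the method described above.

```python
def select_vector(mv, se_mc, se_nomc, k):
    """Return the vector to use for inter-picture prediction encoding.

    mv      -- motion vector (h, v) found by block matching
    se_mc   -- distortion SEmc of the motion compensated prediction
    se_nomc -- distortion SEnomc of the prediction without motion compensation
    k       -- the constant K
    """
    if se_nomc > se_mc + k:
        return mv          # motion compensation gives a clear gain
    return (0, 0)          # gain is small: use the zero vector instead


chosen = select_vector((3, -2), se_mc=100, se_nomc=115, k=20)
```

With these illustrative numbers the gain from motion compensation (15) does not exceed K (20), so the zero vector is used and no vector needs to be transmitted for the block.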
According to another aspect of the invention, there is provided an image signal encoding method for encoding an image signal using motion compensation, comprising the steps of:
finding a motion vector by means of a block matching method;
detecting a first distortion SEmc of motion compensated prediction associated with the motion vector;
detecting a second distortion SEnomc of prediction without motion compensation;
using the motion vector for inter-picture prediction encoding when SEnomc>SEmc+K, with K being a constant not smaller than 0;
using a vector having a value zero, in place of the motion vector, for inter-picture prediction encoding when SEnomc≤SEmc+K; and
varying the value of the constant K according to the content of the image signal.
It may be so arranged that when the difference in the distortion in the block matching is small, the value of the constant K is reduced.
When the difference between the distortion of the motion compensated prediction using the motion vector found by the block matching method and the distortion of the prediction obtained without using the motion compensation is small, the motion vector is replaced by a zero vector, and the inter-picture prediction encoding without motion compensation is conducted. As a result, the motion vector need not be transmitted, and the efficiency of transmission of the motion vectors is improved.
In addition, for image signals with which the difference in the distortion in the block matching is small, such as where a picture of a low contrast is panned, the value of K is decreased, so that the zero vector is less readily selected. As a result, degradation of picture quality is avoided.
The step of finding the motion vector may comprise the steps of:
receiving an input image signal for a first one of pictures in a series of pictures;
providing a reference image signal of a second one of said pictures, said second one of said pictures preceding said first one of said pictures;
dividing said input image signal into matching blocks, each of said matching blocks consisting of signals for pixels adjacent to each other on a display screen;
detecting, for each of the matching blocks in said input image signal, a block in said reference image signal which yields a minimum value of an evaluation function for evaluating the blocks in said reference image signals; and
detecting the motion vector representing a position of said detected block relative to said each of the matching blocks in said input image signal.
The step of finding the motion vector may selectively use at least two evaluating functions for determining the motion vector;
at least a first one of the evaluating functions contains, as its factor, the magnitude of a vector representing the position of the block in said reference image signal relative to said each of the matching blocks in said input image signal; and
the evaluating function is altered in accordance with the content of the picture.
It may be so arranged that at least a second one of the evaluation functions does not contain the magnitude of the motion vector, as its factor; and when the content of the image signal is such that the difference in the distortion in the block matching is small, said second one of the evaluating functions is used.
It may be so arranged that the evaluation function contains, as its factor, values representing differences between signals for pixels in said each of the matching blocks in said input image signal, and signals for pixels in the block in said reference image signal which correspond to said signals for pixels in said each of the matching blocks in said input image signal.
For image signals with which the difference in the distortion in the block matching is small, such as where a picture of a low contrast is panned, the evaluation function which does not contain, as its factor, the magnitude of the motion vector is used, so that the zero vector is less readily selected, and degradation in picture quality is avoided.
According to another aspect of the invention, there is provided an image signal encoding method for encoding an image signal using motion compensation, comprising the steps of:
receiving an input image signal for a first one of pictures in a series of pictures;
providing a reference image signal of a second one of said pictures, said second one of said pictures preceding said first one of said pictures;
dividing said input image signal into matching blocks, each of said matching blocks consisting of signals for pixels adjacent to each other on a display screen;
detecting, for each of the matching blocks in said input image signal, a block in said reference image signal which yields a minimum value of an evaluation function for evaluating the blocks in said reference image signals;
outputting the detected block as a prediction block; and
detecting a motion vector representing a position of said detected block relative to said each of the matching blocks in said input image signal;
wherein said evaluation function contains, as its factor, the magnitude of a vector representing the position of the block in the reference image relative to said each of the matching blocks in said input image.
It may be so arranged that the evaluation function also contains, as its factor, values representing differences between signals for pixels in said each of the matching blocks in said input image signal, and signals for pixels in the block in said reference image signal which correspond to said signals for pixels in said each of the matching blocks in said input image signal.
Because the evaluation function contains, as its factor, the magnitude of the vector, and because the vector giving the smallest value of the evaluation function is taken as the motion vector, blocks in the reference image nearer to the block of the input image are more readily selected as the prediction image. That is, if other conditions are identical, the block in the reference image nearer to the block in the input image is selected. The value of the motion vector tends to be smaller, and as a result, the efficiency of transmission of the motion vectors is improved, and degradation in the picture quality is also prevented.
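An evaluation function of this kind may be sketched as follows. The particular form used here, a sum of absolute differences plus a weight times |h|+|v|, is an assumption chosen for illustration, as are the names sad, find_vector and weight; the method above only requires that the magnitude of the vector be a factor of the evaluation function.

```python
def sad(block, ref, x0, y0):
    """Sum of absolute differences between block and the ref block at (x0, y0)."""
    return sum(abs(block[j][i] - ref[y0 + j][x0 + i])
               for j in range(len(block)) for i in range(len(block[0])))


def find_vector(block, ref, x0, y0, search, weight):
    """Search +/-search pixels; cost = distortion + weight * (|h| + |v|)."""
    best = None
    for v in range(-search, search + 1):
        for h in range(-search, search + 1):
            cost = sad(block, ref, x0 + h, y0 + v) + weight * (abs(h) + abs(v))
            if best is None or cost < best[0]:
                best = (cost, (h, v))
    return best[1]


# A featureless picture: every candidate block has the same distortion.
flat_ref = [[128] * 32 for _ in range(32)]
flat_block = [[128] * 8 for _ in range(8)]

with_magnitude = find_vector(flat_block, flat_ref, 12, 12, 4, weight=1)
without_magnitude = find_vector(flat_block, flat_ref, 12, 12, 4, weight=0)
```

On the flat picture the distortion alone cannot distinguish the candidates, so without the magnitude term the result depends only on the scan order, whereas with the magnitude term the nearest block, that is the zero vector, is selected.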
According to another aspect of the invention, there is provided an image signal encoding method for encoding an image signal using motion compensation, comprising the steps of:
receiving an input image signal for a first one of pictures in a series of pictures;
providing a reference image signal of a second one of said pictures, said second one of said pictures preceding said first one of said pictures;
dividing said input image signal into matching blocks, each of the matching blocks consisting of signals for pixels adjacent to each other on a display screen;
detecting, for each of the matching blocks in said input image signal, a block in said reference image signal which yields a minimum value of an evaluation function for evaluating blocks in said reference image signals;
detecting a motion vector representing a position of said detected block relative to said each of the matching blocks in said input image signal;
outputting the motion vector (Hp,Vp) for use in the motion compensation for said each of the matching blocks when
S2≤S1+K
where S1 represents a prediction distortion for the detected motion vector (H,V) for a first one of said matching blocks,
S2 represents a prediction distortion for the motion vector (Hp,Vp) for a second one of the matching blocks being situated in the neighborhood, on the display screen or along the time axis, of said first one of the matching blocks, and output for use in the motion compensation previously, and
K represents a constant not smaller than 0; and
outputting the motion vector (H,V) for use in the motion compensation for said each of the matching blocks when the above inequality is not satisfied.
The above-recited method may further comprise the step of:
outputting, as a prediction block, the block corresponding to the motion vector (H,V) or (Hp,Vp) output for use in the motion compensation.
It may be so arranged that the evaluation function also contains, as its factor, values representing differences between signals for pixels in said each of the matching blocks in said input image signal, and signals for pixels in the block in said reference image signal which correspond to said signals for pixels in said each of the matching blocks in said input image signal.
With such an arrangement, when the second motion vector (Hp,Vp) is output, it is sufficient to transmit a signal or code indicating that the motion vector is identical to the one previously used for encoding. As a result, the amount of information to be transmitted is reduced, and the motion vector transmission efficiency is improved, using a simple configuration. Moreover, as the second motion vector (Hp,Vp) is used more often, fluctuations in the motion vector between neighboring matching blocks are reduced, so that the picture quality is improved.
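The choice between the newly detected vector and the previously output vector may be sketched as follows, with (H,V) denoting the detected vector having distortion S1 and (Hp,Vp) denoting the vector previously output for a neighboring block, having distortion S2 when applied to the current block. The function name and the returned flag are assumptions for the example; how the flag would actually be coded is not specified here.

```python
def choose_output_vector(detected, s1, previous, s2, k):
    """Return (vector, same_as_previous) for one matching block.

    detected -- (H, V), the vector found by block matching, distortion s1
    previous -- (Hp, Vp), the vector already output for a neighboring block,
                giving distortion s2 when applied to the current block
    k        -- the constant K (not smaller than 0)
    """
    if s2 <= s1 + k:
        return previous, True    # only a "same as before" indication is needed
    return detected, False       # the new vector itself has to be transmitted


vector, reused = choose_output_vector((5, 1), 90, (4, 1), 100, 16)
```

Here the previous vector is only 10 worse than the detected one, which is within K=16, so the previous vector is reused.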
According to another aspect of the invention, there is provided an image signal encoding method for encoding an image signal using motion compensation, comprising the steps of:
receiving an input image signal for a first one of pictures in a series of pictures;
providing a reference image signal of a second one of said pictures, said second one of said pictures preceding said first one of said pictures;
dividing said input image signal into matching blocks, each of the matching blocks consisting of signals for pixels adjacent to each other on a display screen;
detecting, for each of the matching blocks in said input image signal, a block in said reference image signal which yields a minimum value of an evaluation function for evaluating blocks in said reference image signals;
detecting a motion vector representing a position of said detected block relative to said each of the matching blocks in said input image signal;
outputting the motion vector (Hp,Vp) for use in the motion compensation for said each of the matching blocks when
S2≤S1+K
where S1 represents a prediction distortion for the detected motion vector (H,V) for a first one of said matching blocks,
S2 represents a prediction distortion for the motion vector (Hp,Vp) for a second one of the matching blocks being situated in the neighborhood, on the display screen or along the time axis, of said first one of the matching blocks, and output for use in the motion compensation previously, and
K represents a constant, which is varied depending on the content of the input image signals; and
outputting the motion vector (H,V) for use in the motion compensation for said each of the matching blocks when the above inequality is not satisfied.
The above recited method may further comprise the step of:
outputting, as a prediction block, the block corresponding to the motion vector (H,V) or (Hp,Vp) output for use in the motion compensation.
It may be so arranged that the evaluation function also contains, as its factor, values representing differences between signals for pixels in said each of the matching blocks in said input image signal, and signals for pixels in the block in said reference image signal which correspond to said signals for pixels in said each of the matching blocks in said input image signal.
With such an arrangement, when the second motion vector (Hp,Vp) is output, it is sufficient to transmit a signal or code indicating that the motion vector is identical to the one previously used for encoding. As a result, the amount of information to be transmitted is reduced, and the motion vector transmission efficiency is improved, using a simple configuration. Moreover, as the second motion vector (Hp,Vp) is used more often, fluctuations in the motion vector between neighboring matching blocks are reduced, so that the picture quality is improved.
In addition, where the motion is different from one block to another, the value of K can be made small, so as to restrain use of the second motion vector, and as a result, degradation in the picture quality can be prevented.
According to another aspect of the invention, there is provided an image signal encoding method for encoding an image signal using motion compensation, comprising the steps of:
receiving an input image signal for a first one of pictures in a series of pictures;
providing a reference image signal of a second one of said pictures, said second one of said pictures preceding said first one of said pictures;
dividing said input image signal into matching blocks, each of said matching blocks consisting of signals for pixels adjacent to each other on a display screen;
detecting, for each of the matching blocks in said input image signal, a block in said reference image signal which yields a minimum value of an evaluation function for evaluating the blocks in said reference image signals; and
detecting a motion vector representing a position of said detected block relative to said each of the matching blocks in said input image signal;
wherein said evaluation function contains, as its factor, the distance between a vector representing the position of the block in the reference image relative to said each of the matching blocks in said input image, and a motion vector determined for another one of said matching blocks being situated in the neighborhood, on the display screen or along the time axis, of said each of the matching blocks, and output for use in the motion compensation previously.
It may be so arranged that the evaluation function also contains, as its factor, values representing differences between signals for pixels in said each of the matching blocks in said input image signal, and signals for pixels in the block in said reference image signal which correspond to said signals for pixels in said each of the matching blocks in said input image signal.
The fluctuation in the motion vector between the adjacent matching blocks can be restrained, so that the picture quality can be improved. The transmission efficiency of the motion vector can also be improved.
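An evaluation function containing the distance to a neighboring block's vector may be sketched as follows. The additive form, the city-block distance and the names costs, prev and weight are assumptions made for this illustration only.

```python
def find_vector_near(costs, prev, weight):
    """Pick the candidate minimizing distortion + weight * distance to prev.

    costs  -- dict mapping each candidate vector (h, v) to its distortion
    prev   -- (hp, vp), vector previously output for a neighboring block
    weight -- hypothetical scale applied to the vector distance
    """
    hp, vp = prev
    return min(costs,
               key=lambda hv: costs[hv] + weight * (abs(hv[0] - hp) + abs(hv[1] - vp)))


candidates = {(0, 0): 50, (3, 1): 40, (4, 1): 41}
smooth = find_vector_near(candidates, prev=(4, 1), weight=2)
plain = find_vector_near(candidates, prev=(4, 1), weight=0)
```

With the distance term the candidate equal to the neighbor's vector wins although its distortion is slightly larger, which is how fluctuation between adjacent blocks is restrained; with a zero weight the search reduces to plain minimum-distortion matching.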
The above recited method may further comprise the steps of:
outputting the motion vector (Hp,Vp) for use in the motion compensation for said each of the matching blocks when
S2≤S1+K
where S1 represents a prediction distortion for the detected motion vector (H,V) for a first one of said matching blocks,
S2 represents a prediction distortion for the motion vector (Hp,Vp) for a second one of the matching blocks being situated in the neighborhood, on the display screen or along the time axis, of said first one of the matching blocks, and output for use in the motion compensation previously, and
K represents a constant, which is varied depending on the content of the input image signals; and
outputting the motion vector (H,V) for use in the motion compensation for said each of the matching blocks when the above inequality is not satisfied.
According to another aspect of the invention, there is provided an image signal encoding method for encoding an image signal using motion compensation, comprising the steps of:
receiving an input image signal for a first one of pictures in a series of pictures;
providing a reference image signal of a second one of said pictures, said second one of said pictures preceding said first one of said pictures;
dividing said input image signal into matching blocks, each of said matching blocks consisting of signals for pixels adjacent to each other on a display screen;
detecting, for each of the matching blocks in said input image signal, a block in said reference image signal which yields a minimum value of an evaluation function for evaluating the blocks in said reference image signals; and
detecting a motion vector representing a position of said detected block relative to said each of the matching blocks in said input image signal;
wherein said evaluation function is a sum of a distortion representing differences between signals for pixels in said each of the matching blocks in said input image signal, and signals for pixels in the block in said reference image signal which correspond to said signals for pixels in said each of the matching blocks in said input image signal, and an offset value determined in accordance with the magnitude of a vector representing a position of each of said blocks in said reference image signal relative to said each of the matching blocks in said input image signal.
Where the picture includes a number of identical or similar patterns repeated over a wide area, or is featureless, flat, and the difference in the predicted distortion is small, priority is given to the smaller motion vectors in the selection. As a result, the amount of information of the codes of the motion vectors to be transmitted can be reduced, and the quality of the picture is improved.
It may be so arranged that the offset value for the vector having a magnitude exceeding a predetermined value is set to a value larger than the values which said distortion can assume, so as to place a limit to the magnitude of the motion vector.
The range of the motion vector is varied, or is effectively limited in accordance with the content of the picture. In other words.
It may be so arranged that said predetermined value selectively assumes powers of 2, and the length of the code representing the motion vector is selectively decided depending on the range of the motion vector.
The length of the code is changed according to the range of the motion vector, so that the efficiency of transmission of the motion vector codes is improved.
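The offset value and the range-dependent code length may be sketched as follows. The constant BIG, the linear form of the offset and the fixed-length code assumed in code_bits are illustrative assumptions only; the actual code tables are not specified in this passage.

```python
BIG = 10 ** 9   # assumed to exceed any value the distortion can take


def offset(h, v, limit, weight):
    """Offset added to the distortion for the candidate vector (h, v).

    Vectors whose magnitude exceeds limit receive an offset larger than any
    possible distortion, so they can never be selected.
    """
    if max(abs(h), abs(v)) > limit:
        return BIG
    return weight * (abs(h) + abs(v))


def code_bits(limit):
    """Bits of a hypothetical fixed-length code for one vector component
    taking any of the 2*limit + 1 values in [-limit, +limit]."""
    return (2 * limit).bit_length()


bits_narrow = code_bits(8)    # range limited to +/-8
bits_wide = code_bits(16)     # range limited to +/-16
```

Halving the permitted range from ±16 to ±8 saves one bit per vector component under this assumed code, which illustrates why limiting the range for pictures with little motion reduces the amount of motion vector codes.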
According to another aspect of the invention, there is provided an image signal encoding method for encoding an image signal using motion compensation, comprising the steps of:
receiving an input image signal for each of pictures in a series of pictures;
dividing said input image signal into matching blocks, each of said matching blocks consisting of signals for pixels adjacent to each other on a display screen;
detecting first motion vectors, for said matching blocks, by a block matching method through a search over a fixed search range;
storing the first motion vectors;
detecting second motion vectors, for said matching blocks, by a block matching method through a search over a variable search range within said fixed search range;
outputting the second motion vectors for use in the motion compensation; and
updating the variable search range in accordance with the maximum value of the first motion vectors detected for the pictures encoded in the past.
It may be so arranged that said series of pictures include intra-encoded pictures, one-way predictive encoded pictures and bi-directionally predictive encoded pictures; and the variable search range is updated in accordance with the maximum value of the first motion vectors detected for the immediately preceding one-way predictive encoded picture if the image signal being encoded is one forming a one-way predictive encoded picture, and in accordance with the maximum value of the first motion vectors detected for the immediately succeeding one-way predictive encoded picture if the image signal being encoded is one forming a bi-directionally predictive encoded picture.
The above recited method may further comprise the steps of providing a reference image signal of a second one of said pictures, said second one of said pictures preceding said first one of said pictures; wherein
each of said step of detecting the first motion vector and said step of detecting the second motion vector comprises detecting, for each of said matching blocks in said input image signal, a block in said reference image signal which yields a minimum value of an evaluation function for evaluating the blocks in said reference image signals, and detecting the motion vector representing a position of said detected block relative to said each of the matching blocks in said input image signal.
The range of the motion vector can be adaptively varied according to the content of the picture.
It may be so arranged that said evaluation function is a sum of a distortion representing differences between signals for pixels in said each of the matching blocks in said input image signal, and signals for pixels in the block in said reference image signal which correspond to said signals for pixels in said each of the matching blocks in said input image signal, and an offset value determined in accordance with the magnitude of a vector representing a position of each of said blocks in said reference image signal relative to said each of the matching blocks in said input image signal.
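The updating of the variable search range from the stored first motion vectors may be sketched as follows. The function name and the margin parameter are assumptions for the example; the method above states only that the variable range lies within the fixed range and follows the maximum value of the first motion vectors of pictures encoded in the past.

```python
def update_search_range(first_vectors, fixed_range, margin=1):
    """Variable search range for the next picture.

    first_vectors -- first motion vectors (h, v) stored for a past picture,
                     found by the search over the fixed range
    fixed_range   -- the fixed search range (+/- pixels)
    margin        -- hypothetical allowance added to the largest component
    """
    largest = max((max(abs(h), abs(v)) for h, v in first_vectors), default=0)
    return min(fixed_range, largest + margin)


slow_scene = update_search_range([(1, 0), (-3, 2)], fixed_range=15)
fast_scene = update_search_range([(14, 0)], fixed_range=15)
```

For the slowly moving example the second search is confined to ±4, while for the quickly moving example it is clipped to the fixed range of ±15.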
According to another aspect of the invention, there is provided an image signal encoding method performing motion compensation inter-picture prediction encoding, comprising the steps of:
detecting a speed of motion in a sequence of pictures;
for a part of the sequence of pictures detected to contain a quick motion, performing the prediction encoding using one-way prediction encoding; and
for a part of the sequence of pictures without a quick motion, performing prediction encoding using bi-directional prediction encoding.
The step of detecting the speed of motion may comprise:
detecting a value of an evaluation function representing differences between pixels in a first one of pictures in said series of pictures, and pixels in a second one of pictures in said series of pictures;
detecting a variance of said first one of the pictures;
finding that a quick motion is contained if at least one of the following conditions (a) and (b) is satisfied:
(a) Sa > α0
(b) Sb < β0 and Sa > γ0
where Sa represents the detected value of the evaluation function,
Sb represents the detected variance,
α0, β0 and γ0 are predetermined threshold values, with α0 greater than γ0,
finding that no quick motion is contained if none of the above conditions (a) and (b) is satisfied.
For a sequence of pictures with a quick motion, the bi-directional prediction is not performed, so that the temporal distance between the reference picture and the encoded picture in the motion compensated prediction is short, and the motion vector search range can be made narrow, and the amount of motion vector codes is reduced. For a sequence of pictures without a quick motion, bi-directional prediction is used, so that the prediction encoding is achieved with a high efficiency.
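The decision described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the evaluation function is taken to be a sum of absolute pixel differences between two pictures, the variance is computed over all pixels of the first picture, and the function names and the threshold values used in any call are hypothetical.

```python
def frame_difference(first, second):
    """Evaluation value Sa: sum of absolute pixel differences (assumed form)."""
    return sum(abs(a - b) for row_a, row_b in zip(first, second)
               for a, b in zip(row_a, row_b))


def variance(picture):
    """Variance Sb of the pixel values of one picture."""
    pixels = [p for row in picture for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def has_quick_motion(sa, sb, alpha0, beta0, gamma0):
    """Condition (a): Sa > alpha0; condition (b): Sb < beta0 and Sa > gamma0."""
    return sa > alpha0 or (sb < beta0 and sa > gamma0)


def prediction_mode(first, second, alpha0, beta0, gamma0):
    """Choose one-way prediction for quick motion, bi-directional otherwise."""
    sa = frame_difference(first, second)
    sb = variance(first)
    if has_quick_motion(sa, sb, alpha0, beta0, gamma0):
        return "one-way"
    return "bi-directional"
```

Condition (b) uses the lower threshold for Sa only when the variance is small, that is, when the picture is flat enough that even a moderate difference between pictures suggests real change.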
According to another aspect of the invention, there is provided an image signal encoding method performing motion compensation inter-picture prediction encoding, comprising the steps of:
encoding, as a rule, every N-th picture by intra-picture encoding (N being an integer);
detecting a scene change in a sequence of pictures;
encoding the picture at which a scene change is detected, by intra-picture encoding;
encoding every N-th picture as counted from the picture at which the scene change is detected by intra-picture encoding; and
encoding pictures which succeed said picture at which the scene change is detected, and which are other than the every N-th picture, by means other than intra-picture encoding.
The step of detecting the scene change may comprise:
detecting a value of an evaluation function representing differences between pixels in a first one of pictures in said series of pictures, and pixels in a second one of pictures in said series of pictures;
detecting a variance of said first one of the pictures;
finding that a scene change has occurred if at least one of the following conditions (a) and (b) is satisfied:
(a) Sa > α1
(b) Sb < β1 and Sa > γ1
where Sa represents the detected value of the evaluation function,
Sb represents the detected variance,
α1, β1 and γ1 are predetermined threshold values, with α1 greater than γ1,
finding that no scene change has occurred if none of the above conditions (a) and (b) is satisfied.
By performing the intra-picture encoding for the picture at which the scene change is detected, the degradation in the picture quality can be restrained. An increase in the amount of code can be restrained by not performing the intra-picture encoding again until the N-th picture as counted from the picture at which the scene change is detected.
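The restart of the intra-picture count at a scene change can be illustrated with a short Python sketch. It assumes that picture 0 is intra-encoded and that the regular count starts there; the function name `schedule_intra` and the labels "I" and "inter" are hypothetical.

```python
def schedule_intra(num_pictures, n, scene_changes):
    """Label each picture "I" (intra) or "inter" (any other encoding).

    Every n-th picture is intra-encoded; a picture at which a scene change
    is detected is also intra-encoded, and the count restarts from it.
    """
    types = []
    origin = 0  # picture from which every n-th picture is counted
    for i in range(num_pictures):
        if i in scene_changes:
            origin = i
        types.append("I" if (i - origin) % n == 0 else "inter")
    return types
```

With N = 4 and a scene change at picture 5, pictures 0, 4, 5 and 9 are intra-encoded. Picture 8, which the original schedule would have intra-encoded, is not, which is how the increase in the amount of code is limited.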
The above-recited method may further comprise the steps of:
encoding, as a rule, every M-th picture (M being an integer, and M less than N), by one-way prediction encoding, provided that the picture does not also fall on an N-th picture;
encoding, as a rule, pictures other than every N-th and every M-th pictures by bi-directional prediction; and
encoding one or more pictures preceding the picture at which a scene change is detected, by one-way prediction.
Because one or more pictures before the scene change are encoded by one-way prediction, the encoding efficiency is further improved.
According to another aspect of the invention, there is provided an image signal encoding method performing motion compensation inter-picture prediction encoding, comprising the steps of:
encoding, as a rule, every N-th picture by intra-picture encoding (N being an integer);
encoding, as a rule, every M-th picture (M being an integer, and M less than N), by one-way prediction encoding, provided that the picture does not also fall on an N-th picture;
encoding, as a rule, pictures other than every N-th and every M-th pictures by bi-directional prediction;
detecting a scene change in a sequence of pictures;
encoding the first picture which would be encoded by one-way prediction if the scene change were not detected, by intra-picture encoding;
encoding, after the scene change is detected, every N-th picture as counted from said first picture, by intra-picture encoding; and
encoding, after the scene change is detected, every M-th picture as counted from said first picture, by one-way prediction, provided that the M-th picture does not also fall on an N-th picture.
The step of detecting the scene change may comprise:
detecting a value of an evaluation function representing differences between pixels in a first one of pictures in said series of pictures, and pixels in a second one of pictures in said series of pictures;
detecting a variance of said first one of the pictures;
finding that a scene change has occurred if at least one of the following conditions (a) and (b) is satisfied:
(a) Sa > α1
(b) Sb < β1 and Sa > γ1
where Sa represents the detected value of the evaluation function,
Sb represents the detected variance,
α1, β1 and γ1 are predetermined threshold values, with α1 greater than γ1,
finding that no scene change has occurred if none of the above conditions (a) and (b) is satisfied.
The above-recited method may further comprise the step of:
encoding one or more pictures preceding the picture at which a scene change is detected, by one-way prediction.
By performing the intra-picture encoding for the first picture after the scene change for which intra-picture encoding or one-way prediction encoding was planned, degradation in the picture quality is restrained without changing the intervals between intra-picture encodings or prediction encodings.
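A possible reading of this scheme is sketched below in Python. The sketch assumes that picture 0 is intra-encoded, that a single scene change is given by its picture index, and that a regularly scheduled intra picture following the scene change also serves as the refresh; the name `schedule_pictures` and the labels "I", "P" and "B" are hypothetical.

```python
def schedule_pictures(num_pictures, n, m, scene_change):
    """Assign "I", "P" or "B" to each picture.

    Every n-th picture is "I", every m-th picture that is not an n-th
    picture is "P", and the rest are "B". After a scene change, the first
    picture planned as "P" is encoded as "I" instead, and both counts
    restart from that picture.
    """
    types = []
    origin = 0       # picture from which the n and m counts are taken
    pending = False  # a scene change has been seen but not yet refreshed
    for i in range(num_pictures):
        if i == scene_change:
            pending = True
        offset = i - origin
        if offset % n == 0:
            picture_type = "I"      # regularly scheduled intra picture
            pending = False
        elif offset % m == 0:
            if pending:
                picture_type = "I"  # planned P picture promoted to I
                origin = i          # restart both counts here
                pending = False
            else:
                picture_type = "P"
        else:
            picture_type = "B"
        types.append(picture_type)
    return types
```

For N = 12, M = 3 and a scene change at picture 4, the sequence becomes I B B P B B I B B P B B P B: picture 6, which was planned as a P picture, is intra-encoded, and the spacing between anchor pictures stays at three.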
According to another aspect of the invention, there is provided an image signal encoding method for performing motion compensation inter-picture prediction encoding on an image signal, comprising the steps of:
subsampling the image signal for each field; and
determining a motion vector using the field-subsampled image signal.
The amount of calculation for the motion vector search is reduced, and the size of the hardware can be reduced.
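Field-by-field subsampling can be sketched in Python as follows. This is a simplified illustration: it uses plain decimation with no low-pass prefilter, treats the even lines of the frame as one field and the odd lines as the other, and the names `split_fields`, `decimate` and `field_subsample` are hypothetical.

```python
def split_fields(frame):
    """Split an interlaced frame into its even-line and odd-line fields."""
    return frame[0::2], frame[1::2]


def decimate(picture, k, l):
    """Keep every k-th pixel of every l-th line (plain decimation, no prefilter)."""
    return [row[::k] for row in picture[::l]]


def field_subsample(frame, k, l):
    """Subsample each field separately so that lines of the two fields are never mixed."""
    top, bottom = split_fields(frame)
    return decimate(top, k, l), decimate(bottom, k, l)
```

With decimation by 2 in each direction, each subsampled field holds one quarter of the pixels of the original field, so a block comparison over the same picture area touches correspondingly fewer pixels.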
According to another aspect of the invention, there is provided an image signal encoding method for performing motion compensation inter-picture prediction encoding on an image signal, comprising the steps of:
subsampling the image signal for each frame; and
determining a motion vector using the frame-subsampled image signal.
The amount of calculation for the motion vector search is reduced, and the size of the hardware can be reduced.
According to another aspect of the invention, there is provided an image signal encoding system for performing motion compensation inter-picture prediction encoding, comprising:
means for determining a motion vector using a picture obtained by field-subsampling;
means for determining a motion vector using a picture obtained by frame-subsampling; and
means for making a selection between the motion compensation determined by the motion vector of the field subsamples and the motion compensation determined by the motion vector of the frame subsamples.
The amount of calculation for the motion vector search is reduced, and the picture quality can be improved, because the better one of the motion compensation determined by the motion vector of the field subsamples and the motion compensation determined by the motion vector of the frame subsamples is selected.
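The selection between the two motion compensations might be sketched in Python as below. The criterion is an assumption: the text only says that the better one is selected, and this sketch takes "better" to mean the smaller sum of absolute prediction errors, with ties going to the frame-based prediction. The names `prediction_error` and `select_compensation` are hypothetical.

```python
def prediction_error(block, predicted):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(a - b) for row_a, row_b in zip(block, predicted)
               for a, b in zip(row_a, row_b))


def select_compensation(block, field_pred, frame_pred):
    """Return ("field" or "frame", chosen prediction) by comparing errors."""
    field_err = prediction_error(block, field_pred)
    frame_err = prediction_error(block, frame_pred)
    if frame_err <= field_err:
        return "frame", frame_pred
    return "field", field_pred
```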
According to another aspect of the invention, there is provided an image signal encoding system for performing motion compensation inter-picture prediction encoding, comprising:
means for subsampling field by field, and determining a motion vector using a picture obtained by the field-subsampling;
means for combining two fields of pictures obtained by said field subsampling to form a picture of a frame subsampling, and determining the motion vector using the picture of the frame subsampling; and
means for making a selection between the motion compensation determined by the motion vector of the field subsamples and the motion compensation determined by the motion vector of frame subsamples.
The amount of calculation for the motion vector search is reduced, and the picture quality can be improved, because the better one of the motion compensation determined by the motion vector of the field subsamples and the motion compensation determined by the motion vector of the frame subsamples is selected. Moreover, because the picture of the frame subsampling can be obtained using two fields of field subsampling, both the field subsampling and frame subsampling can be achieved using simple hardware.
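One way to picture the combining step is to interleave the lines of the two subsampled fields, as in the Python sketch below. Line interleaving is an assumption about how the two fields are combined, and the name `interleave_fields` is hypothetical.

```python
def interleave_fields(top_sub, bottom_sub):
    """Form a frame-subsampled picture by alternating the lines of the
    two field-subsampled pictures, first field line first."""
    frame_sub = []
    for top_line, bottom_line in zip(top_sub, bottom_sub):
        frame_sub.append(list(top_line))
        frame_sub.append(list(bottom_line))
    return frame_sub
```

Because the frame-subsampled picture is built from the two field-subsampled pictures, only one subsampling stage is needed to serve both searches.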
It may be so arranged that when a non-interlace signal is input, said selection means selects the motion compensation determined by the motion vector of frame subsamples.
For non-interlace signals, encoding can be achieved using the motion vector of frame subsamples having a better accuracy.
The subsampling may be conducted such that the picture of said field subsamples maintains interlace configuration.
Where both the motion vector of the field and the motion vector of the frame are determined, the motion vectors can be determined accurately, and the picture quality can be improved.
The subsampling may be conducted such that the picture of the field subsamples has a non-interlace configuration.
Where motion compensation between fields of different parities is conducted, the motion vector can be determined accurately, and the picture quality can be improved.
The subsampling may be conducted such that the field subsamples are at positions of the scanning lines of the original picture.
The motion vector between fields of an identical parity, and the motion vector between fields of different parities both have an integer accuracy, so that motion compensation from both fields can be conducted easily, and control over the motion compensation is facilitated.
According to another aspect of the invention, there is provided an image signal encoding system for performing motion compensation inter-picture prediction encoding, comprising:
first motion vector detecting means for determining a first motion vector using a picture obtained by subsampling;
second motion vector detecting means for determining a second motion vector with an accuracy higher than the first motion vector, by conducting a motion vector search with a half-pixel accuracy of the original picture, over a range centered on a point representing said first motion vector;
wherein said second motion vector detecting means performs interpolation over the entire range of search by the second motion vector detecting means, and the search is conducted with said half-pixel accuracy over the entire search range.
The accuracy of the second motion vector is high, and the picture quality can be improved.
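A half-pixel refinement of this kind can be sketched in Python as follows. The sketch assumes bilinear interpolation, a sum of absolute differences as the matching cost, and a search radius given in half-pixel units; the names `interpolate_half_pel` and `refine_half_pel` are hypothetical. The interpolated picture covers the whole reference, so every candidate in the search range, not only the neighbours of the best integer position, is evaluated with half-pixel accuracy.

```python
def interpolate_half_pel(ref):
    """Return a (2h-1) x (2w-1) picture: even positions hold the original
    pixels, odd positions hold bilinear averages of their neighbours."""
    h, w = len(ref), len(ref[0])
    up = [[0.0] * (2 * w - 1) for _ in range(2 * h - 1)]
    for y in range(2 * h - 1):
        for x in range(2 * w - 1):
            y0, x0 = y // 2, x // 2
            y1, x1 = y0 + y % 2, x0 + x % 2
            up[y][x] = (ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1]) / 4.0
    return up


def refine_half_pel(cur, ref, bx, by, size, first_vec, radius):
    """Search around first_vec (integer pixels) and return the best vector
    in half-pixel units. radius is the search radius in half pixels."""
    up = interpolate_half_pel(ref)
    cx, cy = 2 * first_vec[0], 2 * first_vec[1]
    best = None
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            x0, y0 = 2 * bx + dx, 2 * by + dy
            # Skip candidates that leave the interpolated picture.
            if x0 < 0 or y0 < 0:
                continue
            if y0 + 2 * (size - 1) >= len(up) or x0 + 2 * (size - 1) >= len(up[0]):
                continue
            cost = sum(abs(cur[by + y][bx + x] - up[y0 + 2 * y][x0 + 2 * x])
                       for y in range(size) for x in range(size))
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]
```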
According to another aspect of the invention, there is provided an image signal encoding system for performing motion compensation inter-picture prediction encoding, comprising:
first motion vector detecting means for determining a first motion vector using a picture obtained by subsampling, with decimation factors 1/K and 1/L in the horizontal and vertical directions, with K and L being natural numbers; and
second motion vector detecting means for determining a second motion vector with an accuracy higher than the first motion vector, by conducting a motion vector search over a range centered on a point representing said first motion vector;
wherein said second motion vector detecting means performs interpolation over a search range equal to or wider than ±K pixels in the horizontal direction by ±L lines in the vertical direction.
The coarseness of the search conducted by the first motion vector detecting means can be compensated for by the second motion vector detecting means, so that the motion vector can be determined accurately, and the picture quality can be improved.
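The relation between the decimation factors and the second search range can be illustrated with a small Python sketch. It assumes that the first motion vector is expressed in units of the subsampled picture and is scaled by K and L to obtain the search centre in the original picture; the name `second_search_window` is hypothetical.

```python
def second_search_window(first_vec_sub, k, l):
    """Return ((x_min, x_max), (y_min, y_max)) in original-picture pixels.

    first_vec_sub is the first motion vector measured on the picture
    subsampled by 1/k horizontally and 1/l vertically. One step of that
    vector spans k pixels or l lines, so a window of +/-k by +/-l around
    the scaled vector reaches every position the coarse search skipped.
    """
    cx, cy = first_vec_sub[0] * k, first_vec_sub[1] * l
    return (cx - k, cx + k), (cy - l, cy + l)
```

For example, with K = 2 and L = 4, a first vector of (3, -2) on the subsampled picture gives a search centre of (6, -8) and a window from 4 to 8 pixels horizontally and from -12 to -4 lines vertically.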