1. Field of the Invention
The present invention relates to an image signal encoding method and an image signal encoding apparatus, an image signal decoding method and an image signal decoding apparatus, an image signal transmission method, and a recording medium which are suitable for use in systems for recording a moving image signal on a recording medium such as a magneto-optical disk or a magnetic tape and reproducing the moving image signal from the recording medium thereby displaying the reproduced image on a display device, or systems, such as a video conference system, a video telephone system, broadcasting equipment, a multimedia database retrieving system, for transmitting a moving image signal via a transmission line from a transmitting end to a receiving end so that the transmitted moving image is displayed on a displaying device at the receiving end, and also systems for editing and recording a moving image signal.
2. Description of the Related Art
In the art of moving-image transmission systems such as video conference systems or video telephone systems, it is known to convert an image signal into a compressed code on the basis of line-to-line and/or frame-to-frame correlation of the image signal so as to use a transmission line in a highly efficient fashion.
The encoding technique according to the MPEG (Moving Picture Experts Group) standard can provide a high compression efficiency and is widely used. The MPEG technique is a hybrid technique of motion prediction encoding and DCT (discrete cosine transform) encoding techniques.
In the MPEG standard, several profiles (functions) at various levels (associated with the image size or the like) are defined so that the standard can be applied to a wide variety of applications. Of these, the most basic one is the main profile at main level (MP@ML).
Referring to FIG. 44, an example of an encoder (image signal encoder) according to the MP@ML of the MPEG standard will be described below. An input image signal is supplied to a set of frame memories 1, and stored therein in the predetermined order. The image data to be encoded is applied, in units of macroblocks, to a motion vector extraction circuit (ME) 2. The motion vector extraction circuit 2 processes the image data for each frame as an I-picture, a P-picture, or a B-picture according to a predetermined procedure. In the above procedure, the processing mode is predefined for each frame of the image of the sequence, and each frame is processed as an I-picture, a P-picture, or a B-picture corresponding to the predefined processing mode (for example, frames are processed in the order of I, B, P, B, P, . . . , B, P). Basically, I-pictures are subjected to intraframe encoding, and P-pictures and B-pictures are subjected to interframe prediction encoding, although the encoding mode for P-pictures and B-pictures is varied adaptively macroblock by macroblock in accordance with the prediction mode as will be described later.
The motion vector extraction circuit 2 extracts a motion vector with reference to a predetermined reference frame so as to perform motion compensation (interframe prediction). The motion compensation (interframe prediction) is performed in one of three modes: forward, backward, and forward-and-backward prediction modes. The prediction for a P-picture is performed only in the forward prediction mode, while the prediction for a B-picture is performed in one of the above-described three modes. The motion vector extraction circuit 2 selects a prediction mode which can lead to a minimum prediction error, and generates a predicted vector in the selected prediction mode.
The prediction error is compared for example with the dispersion of the given macroblock to be encoded. If the dispersion of the macroblock is smaller than the prediction error, prediction compensation encoding is not performed on that macroblock but, instead, intraframe encoding is performed. In this case, the prediction mode is referred to as an intraframe encoding mode. The motion vector extracted by the motion vector extraction circuit 2 and the information indicating the prediction mode employed are supplied to a variable-length encoder 6 and a motion compensation circuit (MC) 12.
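The mode decision described above may be sketched as follows. This is a minimal illustration only: the sum of absolute differences is assumed as the prediction error measure, the sum of absolute deviations from the block mean is assumed as the dispersion, the forward-and-backward prediction is assumed to be the average of the two predictions, and all function names are hypothetical. None of these details is specified in the description above.

```python
def sad(block, prediction):
    """Sum of absolute differences, used here as the prediction error (assumed measure)."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block, prediction)
               for a, b in zip(row_a, row_b))


def dispersion(block):
    """Sum of absolute deviations from the block mean (assumed measure)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels)


def choose_prediction_mode(block, forward, backward=None):
    """Return (mode, error). A P-picture passes only `forward`; a B-picture passes both."""
    candidates = {"forward": forward}
    if backward is not None:
        candidates["backward"] = backward
        # Assumption: forward-and-backward prediction averages the two predictions.
        candidates["forward-and-backward"] = [
            [(f + b) / 2 for f, b in zip(row_f, row_b)]
            for row_f, row_b in zip(forward, backward)
        ]
    errors = {mode: sad(block, pred) for mode, pred in candidates.items()}
    best = min(errors, key=errors.get)
    # Intraframe encoding is selected when the dispersion of the macroblock
    # is smaller than the smallest prediction error.
    if dispersion(block) < errors[best]:
        return "intra", None
    return best, errors[best]
```

A flat macroblock with a poor prediction falls back to the intraframe encoding mode, whereas a textured macroblock with a close prediction keeps the interframe mode having the smallest error.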
The motion compensation circuit 12 generates a predicted image on the basis of the motion vector. The result is applied to arithmetic operation circuits 3 and 10. The arithmetic operation circuit 3 calculates the difference between the value of the given macroblock to be encoded and the value of the predicted image. The result is supplied as a difference image signal to a DCT circuit 4. In the case of an intramacroblock, the arithmetic operation circuit 3 directly transfers the value of the given macroblock to be encoded to the DCT circuit 4 without performing any operation.
The DCT circuit 4 performs a DCT (discrete cosine transform) operation on the input signal thereby generating DCT coefficients. The resultant DCT coefficients are supplied to a quantization circuit (Q) 5. The quantization circuit 5 quantizes the DCT coefficients in accordance with a quantization step depending on the amount of data stored in a transmission buffer 7. The quantized data is then supplied to the variable-length encoder 6.
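The transform and quantization stages may be illustrated by the following sketch. It assumes an orthonormal two-dimensional DCT computed directly from its definition and a single uniform quantization step; an actual MPEG encoder additionally applies quantization matrices and its own rounding rules, which are omitted here. The function names are hypothetical.

```python
import math


def basis(u, i, n):
    """Orthonormal DCT-II basis value for frequency index u at sample i."""
    scale = math.sqrt((1 if u == 0 else 2) / n)
    return scale * math.cos(math.pi * (2 * i + 1) * u / (2 * n))


def dct_2d(block):
    """Two-dimensional DCT of a square block (direct, unoptimized form)."""
    n = len(block)
    return [[sum(block[i][j] * basis(u, i, n) * basis(v, j, n)
                 for i in range(n) for j in range(n))
             for v in range(n)]
            for u in range(n)]


def quantize(coefficients, step):
    """Uniform quantization: a larger step yields smaller levels, hence less data."""
    return [[round(c / step) for c in row] for row in coefficients]
```

For a flat block, all of the energy is concentrated in the DC coefficient, so that only one non-zero level remains after quantization.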
The variable-length encoder 6 converts the quantized data supplied from the quantization circuit 5 into a variable-length code using for example the Huffman encoding technique, in accordance with the quantization step (scale) supplied from the quantization circuit 5. The obtained variable-length code is supplied to a transmission buffer 7.
The variable-length encoder 6 also receives the quantization step (scale) from the quantization circuit 5 and the motion vector as well as the information indicating the employed prediction mode (the intraframe prediction mode, the forward prediction mode, the backward prediction mode, or forward-and-backward prediction mode in which the prediction has been performed) from the motion vector extraction circuit 2, and converts these received data into variable-length codes.
The transmission buffer 7 stores the received encoded image data temporarily. A quantization control signal corresponding to the amount of data stored in the transmission buffer 7 is fed back to the quantization circuit 5 from the transmission buffer 7.
If the amount of residual data stored in the transmission buffer 7 reaches an upper allowable limit, the transmission buffer 7 generates a quantization control signal to the quantization circuit 5 so that the following quantization operation is performed using an increased quantization scale thereby decreasing the amount of quantized data. Conversely, if the amount of residual data decreases to a lower allowable limit, the transmission buffer 7 generates a quantization control signal to the quantization circuit 5 so that the following quantization operation is performed using a decreased quantization scale thereby increasing the amount of quantized data. In this way, an overflow or underflow in the transmission buffer 7 is prevented.
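The feedback from the transmission buffer to the quantization circuit may be sketched as follows. The threshold fractions, the unit adjustment of the scale, and the scale bounds of 1 and 31 are assumptions chosen for illustration; the description above states only that the scale is increased at the upper limit and decreased at the lower limit.

```python
def update_quantization_scale(scale, stored, capacity,
                              upper=0.8, lower=0.2,
                              min_scale=1, max_scale=31):
    """Return the quantization scale for the next quantization operation.

    `upper` and `lower` are hypothetical limits expressed as fractions of
    the buffer capacity; `min_scale` and `max_scale` are assumed bounds.
    """
    occupancy = stored / capacity
    if occupancy >= upper:
        # Buffer close to overflow: coarser quantization, less data.
        return min(scale + 1, max_scale)
    if occupancy <= lower:
        # Buffer close to underflow: finer quantization, more data.
        return max(scale - 1, min_scale)
    return scale
```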
The data stored in the transmission buffer 7 is read out at a specified time and output over a transmission line or recorded on a recording medium.
The quantized data output by the quantization circuit 5 is also supplied to an inverse quantization circuit 8. The inverse quantization circuit 8 performs inverse quantization on the received data in accordance with the quantization step given by the quantization circuit 5. The data (DCT coefficients generated by means of the inverse quantization) output by the inverse quantization circuit 8 are supplied to an IDCT (inverse DCT) circuit 9 which in turn performs an inverse DCT operation on the received data. The arithmetic operation circuit 10 adds the predicted image signal to the signal output from the IDCT circuit 9 for each macroblock, and stores the resultant signal into a set of frame memories (FM) 11 so that the stored image signal will be used as the predicted image signal. In the case of an intramacroblock, the arithmetic operation circuit 10 directly transfers the macroblock output by the IDCT circuit 9 to the set of frame memories (FM) 11 without performing any operation.
With reference to FIG. 45, an example of a decoder (image signal decoder) for performing a decoding operation according to the MP@ML standard of the MPEG will be described below. Coded image data transmitted via the transmission line is received by a receiving circuit (not shown) or is reproduced by a reproducing apparatus. The coded image data is stored in a receiving buffer 21 temporarily and then supplied to a variable-length decoder (IVLC) 22. The variable-length decoder 22 performs an inverse variable-length encoding operation on the data supplied from the receiving buffer 21. The variable-length decoder 22 outputs a motion vector and information indicating the associated prediction mode to a motion compensation circuit 27. The variable-length decoder 22 also supplies a quantization step to the inverse quantization circuit 23. Furthermore, the variable-length decoded data is supplied from the variable-length decoder 22 to the inverse quantization circuit 23.
The inverse quantization circuit 23 performs an inverse quantization operation on the quantized data supplied from the variable-length decoder 22 using the quantization step supplied from the variable-length decoder 22, and supplies the resultant signal to an IDCT circuit 24. The IDCT circuit 24 performs an inverse DCT process on the data (DCT coefficients) output by the inverse quantization circuit 23, and supplies the resultant data to an arithmetic operation circuit 25.
When the image signal output by the IDCT circuit 24 is I-picture data, the data is stored via the arithmetic operation circuit 25 in a set of frame memories 26 so that predicted image data can be produced later for use in processing an image signal input to the arithmetic operation circuit 25. The data output by the arithmetic operation circuit 25 is also output as a reproduced image signal to the outside.
In the case where the input bit stream is a P- or B-picture signal, the motion compensation circuit 27 generates a predicted image from the image signal stored in the set of frame memories 26 in accordance with the motion vector and the associated prediction mode supplied from the variable-length decoder 22, and outputs the resultant predicted image to the arithmetic operation circuit 25. The arithmetic operation circuit 25 adds the predicted image signal supplied from the motion compensation circuit 27 to the image signal received from the IDCT circuit 24 thereby creating an output image signal. In the case where the given image signal is a P-picture, the output signal of the arithmetic operation circuit 25 is stored in the set of frame memories 26 so that it can be used as a reference image signal in processing a subsequent image signal to be decoded. In the case of an intramacroblock, the signal is simply output without being subjected to any process via the arithmetic operation circuit 25.
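The reconstruction path of the decoder (inverse quantization, inverse DCT, and addition of the predicted image) may be sketched as follows. The sketch assumes an orthonormal inverse DCT and a single uniform quantization step, and omits the quantization matrices and the clipping of pixel values used by an actual decoder. The function names are hypothetical.

```python
import math


def idct_2d(coefficients):
    """Inverse of an orthonormal two-dimensional DCT (direct, unoptimized form)."""
    n = len(coefficients)

    def basis(u, i):
        scale = math.sqrt((1 if u == 0 else 2) / n)
        return scale * math.cos(math.pi * (2 * i + 1) * u / (2 * n))

    return [[sum(coefficients[u][v] * basis(u, i) * basis(v, j)
                 for u in range(n) for v in range(n))
             for j in range(n)]
            for i in range(n)]


def decode_macroblock(levels, step, prediction=None):
    """Reconstruct one block from quantized levels.

    With `prediction=None` the block is treated as an intramacroblock and the
    inverse DCT output is used as is; otherwise the predicted image is added.
    """
    coefficients = [[level * step for level in row] for row in levels]
    residual = idct_2d(coefficients)
    if prediction is None:
        prediction = [[0] * len(row) for row in residual]
    return [[round(p + r) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]
```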
In the MPEG standard, various profiles at various levels are also defined, and various tools are available. For example, scalability is available as one of these tools.
The scalability of the MPEG encoding technique makes it possible to encode various image signals having different image sizes at various frame rates. For example, in the case of the spatial scalability, when only a base layer bit stream is decoded, an image signal having a small image size may be decoded, while an image signal having a large image size may be decoded if both base layer and enhancement layer bit streams are decoded.
With reference to FIG. 46, an example of an encoder having the spatial scalability will be described below. In the spatial scaling, an image signal having a small image size is given as a base layer signal, while an image signal having a large image size is given as an enhancement layer signal.
The image signal in the base layer is first stored in a set of frame memories 1, and then is encoded in a manner similar to the MP@ML signal described above except that the output signal of an arithmetic operation circuit 10 is supplied not only to a set of frame memories 11 so that it is used as a prediction reference image signal in the base layer, but also to an up sampling circuit 31. The up sampling circuit 31 expands the received image signal supplied from the arithmetic operation circuit 10 up to an image size equal to the image size in the enhancement layer so that it is used as a prediction reference image signal in the enhancement layer.
On the other hand, the image signal in the enhancement layer is first stored in a set of frame memories 51. A motion vector extraction circuit 52 extracts a motion vector and determines a prediction mode, in a manner similar to the operation according to the MP@ML.
A motion compensation circuit 62 generates a predicted image signal using the motion vector in the prediction mode determined by the motion vector extraction circuit 52. The resultant signal is supplied to a weighting circuit (W) 34. The weighting circuit 34 multiplies the predicted image signal by a weighting factor W, and outputs the resultant signal to an arithmetic operation circuit 33.
The signal output from the arithmetic operation circuit 10, as described above, has been supplied to the up sampling circuit 31. The up sampling circuit 31 expands the image signal generated by the arithmetic operation circuit 10 up to a size equal to that of the image in the enhancement layer. The expanded image signal is supplied to a weighting circuit (1-W) 32. The weighting circuit 32 multiplies the image signal output from the up sampling circuit 31 by a weighting factor 1-W, and supplies the resultant signal to the arithmetic operation circuit 33.
The arithmetic operation circuit 33 generates a predicted image signal by adding together the image signals output by the weighting circuits 32 and 34, and outputs the resultant signal to an arithmetic operation circuit 53. The image signal output by the arithmetic operation circuit 33 is also input to an arithmetic operation circuit 60. The arithmetic operation circuit 60 adds together the image signal output by the arithmetic operation circuit 33 and an image signal output by an inverse DCT circuit 59. The resultant signal is stored in a set of frame memories 61 so that it is used as a predicted reference frame for the subsequent image signal to be encoded.
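The weighted combination performed by the weighting circuits 32 and 34 and the arithmetic operation circuit 33 may be sketched as follows. The up sampling filter is not specified above, so simple pixel repetition is assumed purely for illustration; the function names are likewise hypothetical.

```python
def upsample_nearest(image, factor=2):
    """Expand an image by pixel repetition (assumed up sampling filter)."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def blend_prediction(enhancement_prediction, base_upsampled, w):
    """Predicted image = W * (motion-compensated enhancement layer prediction)
    + (1 - W) * (up-sampled base layer image)."""
    return [[w * e + (1 - w) * b for e, b in zip(e_row, b_row)]
            for e_row, b_row in zip(enhancement_prediction, base_upsampled)]
```

With W equal to 1 the prediction relies solely on the enhancement layer, and with W equal to 0 it relies solely on the expanded base layer image.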
The arithmetic operation circuit 53 calculates the difference between the image signal to be encoded and the image signal output from the arithmetic operation circuit 33, and outputs the result as a difference image signal. However, in the case where the macroblock is to be processed in the intraframe encoding mode, the arithmetic operation circuit 53 directly supplies the image signal to be encoded to a DCT circuit 54 without performing any operation.
The DCT circuit 54 performs a DCT (discrete cosine transform) operation on the image signal output by the arithmetic operation circuit 53 thereby generating DCT coefficients. The generated DCT coefficients are supplied to a quantization circuit 55. The quantization circuit 55 quantizes the DCT coefficients, as in the operation for the MP@ML data, using a quantization scale determined in accordance with the amount of data stored in a transmission buffer 57. The resultant quantized data is supplied to a variable-length encoder 56. The variable-length encoder 56 performs a variable-length encoding operation on the quantized data (quantized DCT coefficients), and outputs the resultant data as an enhancement layer bit stream via the transmission buffer 57.
The quantized data from the quantization circuit 55 is also supplied to an inverse quantization circuit 58. The inverse quantization circuit 58 performs an inverse quantization operation on the received data using the same quantization scale as that employed by the quantization circuit 55. The resultant data is supplied to an inverse DCT circuit 59 and is subjected to an inverse DCT process. The result is supplied to the arithmetic operation circuit 60. The arithmetic operation circuit 60 adds together the image signal output from the arithmetic operation circuit 33 and the image signal output from the inverse DCT circuit 59, and stores the resultant signal in the set of frame memories 61.
The variable-length encoder 56 also receives the enhancement layer motion vector extracted by the motion vector extraction circuit 52 and the information indicating the associated prediction mode, the quantization scale employed by the quantization circuit 55, and the weighting factor W used by the weighting circuits 32 and 34. These data are encoded by the variable-length encoder 56, and resultant data is output. Then, an enhancement layer bit stream and a base layer bit stream are multiplexed by a multiplexer (not shown) and output via a transmission line or recorded on a recording medium.
Now referring to FIG. 47, an example of a decoder having the capability of spatial scaling will be described below. The base layer bit stream input to a reception buffer 21 is decoded in a similar manner to the MP@ML signal described above except that the output image signal of an arithmetic operation circuit 25 is not only supplied as a base layer image signal to the outside but also stored in the set of frame memories 26 so that it can be used as a prediction reference image signal in processing a subsequent image signal to be decoded. Furthermore, the output image signal of the arithmetic operation circuit 25 is also supplied to an up sampling circuit 81 so as to expand the image signal to an image size equal to the image size in the enhancement layer so that it is used as a prediction reference image signal in the enhancement layer.
On the other hand, the enhancement layer bit stream is stored in a reception buffer 71, and then supplied to a variable-length decoder 72. The variable-length decoder 72 performs a variable-length decoding operation on the received data thereby generating quantized DCT coefficients, a quantization scale, an enhancement layer motion vector, prediction mode data, and a weighting factor W. The variable-length decoded data output from the variable-length decoder 72 are supplied to an inverse quantization circuit 73. The inverse quantization circuit 73 performs an inverse quantization operation on the received data using the quantization scale. The resultant data is supplied to an inverse DCT circuit 74, and is subjected to an inverse DCT process. The resultant image signal is supplied to an arithmetic operation circuit 75.
The motion compensation circuit 77 generates a predicted image signal according to the decoded motion vector and prediction mode, and supplies the resultant signal to a weighting circuit 84. The weighting circuit 84 multiplies the output signal of the motion compensation circuit 77 by the weighting factor W decoded, and supplies the result to an arithmetic operation circuit 83.
The output image signal of the arithmetic operation circuit 25 is output as a reproduced base layer image signal, and also supplied to the set of frame memories 26. Furthermore, the image signal output from the arithmetic operation circuit 25 is also supplied to the up sampling circuit 81 so as to expand it to an image size equal to the image size in the enhancement layer. The expanded image signal is then supplied to a weighting circuit 82. The weighting circuit 82 multiplies the image signal output from the up sampling circuit 81 by a weighting factor (1-W) decoded, and supplies the resultant signal to the arithmetic operation circuit 83.
The arithmetic operation circuit 83 adds together the output image signals of the weighting circuits 82 and 84, and supplies the result to the arithmetic operation circuit 75. The arithmetic operation circuit 75 adds the image signal output from the inverse DCT circuit 74 and the image signal output from the arithmetic operation circuit 83, thereby generating a reproduced enhancement layer image, which is supplied not only to the outside but also to a set of frame memories 76. The signal stored in the set of frame memories 76 is used as a prediction reference image signal in a later process to decode a subsequent image signal.
Although the above description deals with the operation of processing a luminance signal, the operation associated with a color difference signal is also performed in a similar manner except that the motion vector used for the luminance signal is reduced to half in both vertical and horizontal directions.
In addition to the MPEG standard, there are various standards for converting a moving image signal into a compressed code in a highly efficient manner. For example, the H.261 and H.263 standards established by the ITU-T are employed in encoding process especially for communication. Although there are some differences in the details associated with for example header information, the H.261 and H.263 standards are also based on the combination of motion compensation prediction encoding and DCT encoding, and thus an encoder and a decoder can be implemented in a similar manner to those described above.
It is also known in the art to compose an image by combining a plurality of images using a chromakey. In this technique, an image of an object is taken in front of a background having a particular uniform color such as blue. Areas having colors other than blue are extracted from the image, and the extracted image is combined with another image. In the above process, the signal representing the extracted areas is referred to as a key signal.
FIG. 48 illustrates the method of encoding a composite image signal. In FIG. 48, a background image F1 and a foreground image F2 are combined into a single image. The foreground image F2 is obtained by taking a picture of an object in front of a background having a particular color, and then extracting the areas having colors different from the background color. The extracted areas are represented by a key signal K1. A composite image F3 is obtained by combining the foreground image F2 and the background image F1 using the key signal K1. Then the composite image F3 is encoded according to an appropriate encoding technique such as the MPEG encoding technique. When the composite image is encoded, the information of the key signal is lost. Therefore, when the decoded composite image is edited or recomposed, it is difficult to change only the background image F1 while maintaining the foreground image F2 unchanged.
Instead, as shown in FIG. 49, the background image F1, the foreground image F2, and the key signal K1 may first be encoded separately, and then the respective encoded signals may be multiplexed into a single bit stream of a composite image F3.
FIG. 50 illustrates the technique of decoding the bit stream produced in the manner shown in FIG. 49 into a composite image F3. The bit stream is subjected to a demultiplexing process and is decomposed into separate bit streams of the image F1, the image F2, and the key signal K1, respectively. These bit streams are decoded separately so as to obtain a decoded image F1', a decoded image F2', and a decoded key signal K1. If the decoded image F1' is combined with the decoded image F2' using the decoded key signal K1, then it is possible to obtain a decoded composite image F3'. In this technique, it is possible to easily carry out re-editing or recomposition. For example, it is possible to change only the background image F1 while maintaining the foreground image F2.
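The composition of a background image and a foreground image by means of a key signal may be sketched as follows. A binary key signal is assumed (0 outside the object, 1 inside), pixels are represented by single integers for brevity, and the function names and the backdrop value are hypothetical.

```python
def extract_key(image, backdrop):
    """Key signal: 1 where the pixel differs from the backdrop color, else 0."""
    return [[0 if pixel == backdrop else 1 for pixel in row] for row in image]


def composite(background, foreground, key):
    """Take the foreground pixel where the key is set, else the background pixel."""
    return [[f if k else b for b, f, k in zip(b_row, f_row, k_row)]
            for b_row, f_row, k_row in zip(background, foreground, key)]


BLUE = 255  # hypothetical value standing for the uniform backdrop color
shot = [[BLUE, 7], [BLUE, BLUE]]          # object pixel of value 7 on a blue backdrop
key = extract_key(shot, BLUE)
first = composite([[1, 1], [1, 1]], shot, key)
# Because the key signal is retained, the background alone can be exchanged.
second = composite([[2, 2], [2, 2]], shot, key)
```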
In the following description, a sequence of images such as images F1 and F2 constituting a composite image are referred to as a VO (video object). An image frame of a VO at a certain time is referred to as a VOP (video object plane). Each VOP consists of a luminance signal, a color difference signal, and a key signal.
An image frame refers to one image at a certain time. An image sequence is a set of image frames taken at various times. That is, each VO is a set of VOPs at various times. The size and position of each VO vary with time. That is, even if VOPs are included in the same VO, they can differ in size and position from one another.
FIGS. 51 and 52 illustrate an encoder and decoder, respectively, according to the present technique. An image signal is first input to a VO generator 101. The VO generator 101 decomposes the input signal into a background image signal, an image signal of each object, and an associated key signal. Each VO consists of an image signal and a key signal. The respective VOs of image signals output from the VO generator 101 are input to corresponding VOP generators 102-0 to 102-n. For example, the image signal and the key signal of VO-0 are input to the VOP generator 102-0, and the image signal and the key signal of VO-1 are input to the VOP generator 102-1. Similarly, the image signal and the key signal of VO-n are input to the VOP generator 102-n. When the image signal represents a background, there is no key signal.
In the case of an image signal generated using a chromakey such as that shown in FIG. 49, the image signals VO-0 to VO-n and associated key signals output from the VO generator 101 are directly used as image signals of the respective VOs and associated key signals. When an image has no key signal or the key signal of the image is lost, a key signal is generated by extracting predetermined areas by means of image area division technique thereby generating a VO.
Each VOP generator 102-0 to 102-n extracts a minimum rectangle containing an object in the image from each image frame wherein the size of the rectangle is selected such that the number of pixels in the vertical direction and that in the horizontal direction are integral multiples of 16. The respective VOP generators 102-0 to 102-n then extract an image signal (luminance signal and color difference signal) and a key signal included in the corresponding rectangles, and output the extracted signals. The VOP generators also output a flag indicating the size of the VOPs and the position of the VOPs represented in absolute coordinates.
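The extraction of the rectangle may be sketched as follows. The sketch assumes that the object is identified by non-zero key values and that the rectangle is anchored at the top-left corner of the object and extended rightward and downward to the next multiple of 16. How a rectangle extending beyond the frame boundary is to be handled is not stated above and is left open here. The function names are hypothetical.

```python
def round_up(value, multiple):
    """Smallest integral multiple of `multiple` that is not less than `value`."""
    return -(-value // multiple) * multiple


def vop_rectangle(key, multiple=16):
    """Return (left, top, width, height) of the VOP rectangle, or None if the
    key signal contains no object pixel."""
    rows = [y for y, row in enumerate(key) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(key[0])) if any(row[x] for row in key)]
    left, top = cols[0], rows[0]
    width = round_up(cols[-1] - left + 1, multiple)
    height = round_up(rows[-1] - top + 1, multiple)
    return left, top, width, height
```

The returned position and size correspond to the information carried by the flags indicating the position in absolute coordinates and the size of the VOP.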
The output signals of the respective VOP generators 102-0 to 102-n are input to corresponding VOP encoders 103-0 to 103-n and encoded. The output signals of the VOP encoders 103-0 to 103-n are input to a multiplexer 104 and combined into a single bit stream.
When the bit stream containing multiplexed signals is input to the decoder shown in FIG. 52, the input bit stream is first demultiplexed by a demultiplexer 111 into separate bit streams associated with the respective VOs. The respective VO bit streams are input to corresponding VOP decoders 112-0 to 112-n and decoded. Thus, the image signals, the key signals, the flags indicating the VOP sizes, and the flags indicating the positions of VOPs represented in absolute coordinates of the respective VOPs are reproduced by the respective VOP decoders 112-0 to 112-n. The reproduced signals are input to an image reconstruction circuit 113. The image reconstruction circuit 113 generates a reproduced image using the image signals, key signals, size flags, and absolute coordinate position flags associated with the respective VOPs.
Referring to FIGS. 53 and 54, examples of the constructions of the VOP encoder 103-0 and the VOP decoder 112-0 are described below. In FIG. 53, the image signal and the key signal of each VOP are input to an image signal encoder 121 and a key signal encoder 122, respectively. The image signal encoder 121 encodes the image signal according to for example the MPEG or H.263 standard. The key signal encoder 122 encodes the received key signal by means of for example DPCM. Alternatively, motion compensation associated with the key signal may be performed using the motion vector detected by the image signal encoder 121, and the obtained differential signal may be encoded. The amount of bits generated in the key signal encoding is input to the image signal encoder 121 and is controlled so that the bit rate is maintained at a predetermined value.
The bit stream of the encoded image signal (motion vector and texture information) and the bit stream of the encoded key signal are input to a multiplexer 123 and combined into a single bit stream. The resultant bit stream is output via a transmission buffer 124.
When the bit stream is input to the VOP decoder shown in FIG. 54, the bit stream is first applied to a demultiplexer 131. The demultiplexer 131 demultiplexes the received bit stream into the bit stream of the image signal (motion vector and texture information) and the bit stream of the key signal, which are then decoded by an image signal decoder 132 and a key signal decoder 133, respectively. In the case where the key signal is encoded by means of motion compensation, the motion vector decoded by the image signal decoder 132 is input to the key signal decoder 133 so that the key signal decoder 133 can decode the key signal using the motion vector.
The above-described method of decoding the image VOP by VOP has a problem associated with motion compensation. The VOP varies in size and position with time. That is, VOPs belonging to the same VO differ in size and position from one another. Therefore, when a VOP which is different in time is referred to for example in the motion compensation process, it is required to encode the flag indicating the position and size of the VOP and transmit the encoded flag signal, as will be described in detail below with reference to FIG. 55.
In FIG. 55, an image F11 corresponds to a VOP at a time t of a certain video object VO0, and an image F12 corresponds to a VOP at the same time t of a video object VO1. The images F11 and F12 are different in size from each other. The positions of the images F11 and F12 are represented by absolute coordinates OST0 and OST1, respectively.
If a VOP to be encoded and a VOP to be referred to are placed in an absolute coordinate system, and a reference position in absolute coordinates is transmitted as a motion vector, it becomes possible to realize motion compensation.
In this case, the motion compensation is performed as follows. In the following description, it is assumed that the image has an arbitrary shape. In the case where the VOP has a rectangular shape, the motion compensation can be performed according to the known method such as that defined in the H.263 standard.
FIG. 56 illustrates a current VOP to be encoded. The VOP has a rectangular shape containing an image object wherein the size of the rectangle is an integral multiple of 16 in both horizontal and vertical directions. The size of the rectangle of the VOP is selected such that the resultant rectangle is a minimum one which can contain the object. When the VOP is encoded, encoding and motion compensation are performed from one macroblock to another wherein each macroblock has a size of 16×16 pixels. The size of each macroblock may also be set to 8×8 pixels, and the motion compensation may be performed from one macroblock to another having the same size.
FIG. 57 illustrates a VOP to be referred to. The VOP is stored at a predetermined location of a frame memory in accordance with the flag indicating the position of the VOP in the absolute coordinates and the flag indicating the VOP size. In the case of a VOP having an arbitrary shape, when a motion vector is extracted, a problem occurs due to the fact that the VOP has an area containing an image and an area containing no image.
First, the process performed on the reference VOP will be described below. In the case where the reference VOP has an arbitrary shape, the pixel values in the area containing no image are calculated from the pixel values in the area containing an image as described below.
1. First, the pixel values in the outside of the image object, in which there is no image, are set to 0.
2. The VOP is then scanned in the horizontal direction. Each horizontal line of the VOP is divided into line segments in which all pixel values are 0 and line segments in which all pixels have values which are not equal to 0. Those line segments in which all pixels have values not equal to 0 are not subjected to any process. The other line segments can be divided into line segments whose both ends have non-zero pixel values and line segments whose one end is an end of the VOP and the other end is a non-zero pixel value. Those line segments whose both ends have non-zero pixel values are subjected to replacement such that all pixel values on the line segments are replaced with the average of the pixel values at both ends. In the other case, the pixel values on the line segments are all replaced with the non-zero pixel value at one end.
3. The process step 2 is also performed in the vertical direction.
4. For those pixels which are changed in value in both process steps 2 and 3, the pixel values are replaced by the mean of the values obtained in the two steps.
5. For those pixels which have a pixel value of 0 when the process 4 has been completed, the pixel values are replaced by the value of a non-zero pixel at the nearest location. If there are two nearest non-zero pixels, the mean value of these two pixel values is employed.
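Process steps 1 to 5 above may be sketched as follows. The sketch rests on several assumptions that are not stated above: the key signal, rather than a test for a pixel value of 0, is used to distinguish the area containing no image, so that object pixels which happen to be 0 are not misjudged; steps 2 and 3 are each applied independently to the result of step 1; the distance in step 5 is the city-block distance; and when more than two pixels are equally near, all of them are averaged. The function names are hypothetical.

```python
def fill_line(values, inside):
    """Step 2 on one line: map each outside index to its replacement value."""
    filled, n, i = {}, len(values), 0
    while i < n:
        if inside[i]:
            i += 1
            continue
        start = i
        while i < n and not inside[i]:
            i += 1
        left = values[start - 1] if start > 0 else None   # object pixel before the segment
        right = values[i] if i < n else None              # object pixel after the segment
        if left is None and right is None:
            continue                                      # no object pixel on this line
        if left is not None and right is not None:
            pad = (left + right) / 2                      # both ends non-zero: average
        else:
            pad = left if left is not None else right     # one end is the VOP boundary
        for j in range(start, i):
            filled[j] = pad
    return filled


def pad_reference_vop(image, key):
    h, w = len(image), len(image[0])
    inside = [[bool(k) for k in row] for row in key]
    # Step 1: pixels outside the image object are set to 0.
    out = [[image[y][x] if inside[y][x] else 0 for x in range(w)] for y in range(h)]
    # Step 2 (horizontal scan) and step 3 (vertical scan).
    horiz = [fill_line(out[y], inside[y]) for y in range(h)]
    vert = [fill_line([out[y][x] for y in range(h)],
                      [inside[y][x] for y in range(h)]) for x in range(w)]
    known = [row[:] for row in inside]
    for y in range(h):
        for x in range(w):
            if inside[y][x]:
                continue
            values = [d[k] for d, k in ((horiz[y], x), (vert[x], y)) if k in d]
            if values:
                # Step 4: the mean is used when both scans supplied a value.
                out[y][x] = sum(values) / len(values)
                known[y][x] = True
    # Step 5: remaining pixels take the value of the nearest filled pixel.
    sources = [(y, x) for y in range(h) for x in range(w) if known[y][x]]
    for y in range(h):
        for x in range(w):
            if known[y][x] or not sources:
                continue
            dists = [abs(sy - y) + abs(sx - x) for sy, sx in sources]
            best = min(dists)
            nearest = [out[sy][sx] for (sy, sx), d in zip(sources, dists) if d == best]
            out[y][x] = sum(nearest) / len(nearest)
    return out
```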
When a motion vector is detected, the pixel values in non-image areas of a reference VOP are set to non-zero values according to the above-described method. A prediction error relative to the reference image is calculated for a macroblock to be encoded, and a vector which gives a minimum prediction error is employed as a motion vector. In this calculation process, the VOP to be encoded can be such a VOP having an arbitrary shape, or the macroblock to be encoded can include an area containing no image. When the macroblock includes an area containing no image, those pixels in the area containing no image are neglected in the calculation of the prediction error. That is, the prediction error is calculated using only those pixels corresponding to an image.
Whether each pixel in the VOP corresponds to an image or not can be judged by referring to the corresponding key signal. If the corresponding key signal has a value of 0, the pixel is not in an image. In the other case, the pixel is in an image.
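The detection of a motion vector with the prediction error restricted to pixels in the image may be sketched as follows. The sum of absolute differences is assumed as the error measure, an exhaustive search over a small window is assumed as the search strategy, and the function and parameter names are hypothetical. The reference VOP is assumed to have already been padded as described above.

```python
def masked_sad(block, block_key, reference, ox, oy):
    """Prediction error over only those pixels whose key value is non-zero."""
    total = 0
    for y, (row, key_row) in enumerate(zip(block, block_key)):
        for x, (pixel, k) in enumerate(zip(row, key_row)):
            if k:  # pixels in the area containing no image are neglected
                total += abs(pixel - reference[oy + y][ox + x])
    return total


def search_motion_vector(block, block_key, reference, bx, by, search=2):
    """Exhaustive search around (bx, by); returns (error, dx, dy) or None."""
    n = len(block)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ox, oy = bx + dx, by + dy
            if ox < 0 or oy < 0 or oy + n > len(reference) or ox + n > len(reference[0]):
                continue  # candidate falls outside the stored reference
            error = masked_sad(block, block_key, reference, ox, oy)
            if best is None or error < best[0]:
                best = (error, dx, dy)
    return best
```

Every candidate displacement requires one pass over the macroblock, so the number of operations grows with the square of the search range.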
When the motion vector is detected using the technique described above, a large amount of computation is required. Thus, there is a need for a method of performing the computation in a simpler fashion.
In view of the above, it is an object of the present invention to provide a technique of improving the encoding efficiency thereby reducing the computation cost.