1. Field of the Invention
The present invention relates to an image processing apparatus, and more particularly to an image data coding process and a motion detecting process.
2. Related Background Art
Several schemes of coding (compressing) a large amount of moving image data have been used in practice. A typical one among them is MPEG 2 (Moving Picture Experts Group 2).
MPEG 2 has been realized by a combination of several data compression techniques, namely an orthogonal transform coding technique using DCT (Discrete Cosine Transform), a motion compensation bidirectional prediction coding technique, and a variable length coding technique.
The principle of a coding method by MPEG 2 will be described hereinunder.
In order to realize high efficiency coding by MPEG 2, redundancy in the time axis direction is first reduced by obtaining differences between images, and then redundancy in the space axis direction is reduced by using a discrete cosine transform (DCT) and variable length coding.
Reduction of redundancy in the time axis direction will first be described.
Generally, an image at a certain time among consecutive moving images is very similar to images after and before the certain time. Therefore, for example, if a difference between an image to be encoded now and an image forward in the time axis direction is transmitted as shown in FIG. 1, it becomes possible to reduce redundancy in the time axis direction and the amount of information to be transmitted. An image encoded in this manner is called a predictive-coded picture, P picture, to be described later.
Similarly, if a smaller one of differences between an image to be encoded now and an image forward or backward in the time axis direction or an image formed through interpolation of images forward and backward in the time axis direction, is transmitted, it becomes possible to reduce redundancy in the time axis direction and the amount of information to be transmitted. An image encoded in this manner is called a bidirectionally predictive-coded picture, B picture, to be described later.
In FIG. 1, images indicated by reference character I are intraframe-coded pictures, I pictures, to be described later. Images indicated by reference characters P and B are P and B pictures, respectively.
So-called motion compensation is performed in order to form a prediction image.
In motion compensation, for example, a block of 16×16 pixels (hereinafter called a macro block) constituted by unit blocks of 8×8 pixels is used. An area of the reference image near the position of the macro block is searched for the area having the smallest difference from the macro block to be encoded, and the difference between the image data in the found area and the image data of the macro block is transmitted to reduce the amount of data to be encoded.
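The search described above can be sketched as a full-search block match. This is only an illustration under stated assumptions: the text says "smallest difference" without naming a measure, so the sum of absolute differences (SAD) is assumed here, and the function names, the search range of 4 pixels, and the synthetic test image are all hypothetical.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cx, cy, size=16, search=4):
    """Return ((mvx, mvy), cost) for the macro block at (cx, cy) in cur."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, cx, cy, cx, cy, size)
    for my in range(-search, search + 1):
        for mx in range(-search, search + 1):
            rx, ry = cx + mx, cy + my
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block falls outside the reference image
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if cost < best_cost:
                best_mv, best_cost = (mx, my), cost
    return best_mv, best_cost


def pattern(x, y):
    """Synthetic image content (an assumption made only for this demo)."""
    return (7 * x + 13 * y) % 256


# Reference image, and a current image whose content is the reference
# content displaced by (+2, -1).
ref = [[pattern(x, y) for x in range(32)] for y in range(32)]
cur = [[pattern(x + 2, y - 1) for x in range(32)] for y in range(32)]

mv, cost = full_search(cur, ref, cx=8, cy=8)
print(mv, cost)
```

With this synthetic input the search recovers the displacement exactly, so the residual to be transmitted for the macro block is zero.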
In practice, in the case of a P picture for example, either the image data of the difference from the prediction image after motion compensation or the image data itself before motion compensation, whichever has the smaller amount of data, is selected in units of a macro block of 16×16 pixels and encoded.
However, in this case, it is necessary to send a larger amount of data at an area (image) appearing after the object moved. To avoid this, in the case of a B picture for example, image data having the smallest amount of data is encoded by selecting from four image data sets, including a difference from an already decoded, motion compensated image forward in the time axis direction, a difference from an already decoded, motion compensated image backward in the time axis direction, a difference from an interpolated image of the two forward and backward images, and the image data itself which is to be encoded now.
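The four-way selection for a B picture can be sketched as follows. The cost measures are assumptions, not taken from the text: the "amount of data" is approximated by the sum of absolute residual values, and the intra cost by the sum of absolute deviations from the block mean. All names are hypothetical, and blocks are flattened to short lists for brevity.

```python
def residual_cost(block, prediction):
    """Proxy for the coded size of a difference block (assumed measure)."""
    return sum(abs(b - p) for b, p in zip(block, prediction))


def intra_cost(block):
    """Proxy for the coded size of the block itself (assumed measure)."""
    mean = sum(block) // len(block)
    return sum(abs(b - mean) for b in block)


def select_b_mode(block, fwd, bwd):
    """Pick the cheapest of the four candidates for one macro block.

    fwd and bwd are the motion compensated forward and backward
    predictions; the interpolated prediction is their rounded average.
    """
    interp = [(f + b + 1) // 2 for f, b in zip(fwd, bwd)]
    costs = {
        "forward": residual_cost(block, fwd),
        "backward": residual_cost(block, bwd),
        "interpolated": residual_cost(block, interp),
        "intra": intra_cost(block),
    }
    mode = min(costs, key=costs.get)
    return mode, costs


# A block lying exactly halfway between its two predictions.
block = [10, 20, 30, 40]
fwd = [0, 10, 20, 30]
bwd = [20, 30, 40, 50]
mode, costs = select_b_mode(block, fwd, bwd)
print(mode, costs)
```

In this example the interpolated prediction matches the block exactly, so it is selected over the forward, backward, and intra candidates.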
Next, reduction of redundancy in the space axis direction will be described.
For calculating a difference of image data, image data is subjected to a discrete cosine transform (DCT) with respect to each of a unit block of 8×8 pixels. DCT transforms image data into frequency components. For example, image signals of a natural scene taken with a television camera are smooth in many cases. If DCT is executed for such smooth image signals, the data amount can be efficiently reduced.
Specifically, if a DCT is executed for such smooth image signals of a natural scene, large values are concentrated on a few coefficients. When these coefficients are quantized, most of the coefficients in the 8×8 block become zero and only the large coefficients are left. In transmitting 8×8 coefficient block data, the data is Huffman coded in the order of a so-called zigzag scan so that the data transmission amount can be reduced. The image is reconstructed at the decoder by performing these steps in the reverse order.
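The concentration of energy after the DCT, quantization, and zigzag ordering can be illustrated with a small sketch. Several simplifications are assumed: a single uniform quantizer step of 16 replaces the quantization matrices an actual encoder would use, the variable length (Huffman) coding stage is omitted, and the smooth test block is synthetic. The function names are hypothetical.

```python
import math

N = 8  # unit block size


def dct_2d(block):
    """Orthonormal 8x8 DCT-II computed directly from its definition."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * s
    return out


def quantize(coeffs, step=16):
    """Uniform quantization (a simplification made for this sketch)."""
    return [[round(c / step) for c in row] for row in coeffs]


def zigzag_order():
    """Block positions in zigzag order: (0,0), (0,1), (1,0), (2,0), ..."""
    positions = [(r, c) for r in range(N) for c in range(N)]
    return sorted(positions,
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))


# A smooth block: a gentle horizontal ramp, constant in the vertical direction.
block = [[100 + 4 * x for x in range(N)] for _ in range(N)]
quantized = quantize(dct_2d(block))
scanned = [quantized[r][c] for r, c in zigzag_order()]
nonzero = sum(1 for value in scanned if value != 0)
print(scanned[:6], nonzero)
```

For this ramp only the DC coefficient and the lowest horizontal frequency survive quantization, and the zigzag scan places both at the front of the sequence, leaving a long run of zeros behind them.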
I, P, and B pictures will be described next.
In encoding an I picture, only closed information in a single image is used. Therefore, in decoding it, an image can be reconfigured by using only the information of the I picture itself. In practice, a difference is not calculated but the I picture itself is subjected to a DCT to encode it.
As a prediction image (image used as a reference for calculating a difference) of the P picture, an already decoded I or P picture forward in the time axis direction is used. In practice, a more efficient one of encoding image data of a difference from a motion compensated prediction image and encoding (intra-encoding) image data before motion compensation is selected in the unit of a macro block.
As a prediction image of a B picture, three picture types are used: an already decoded I or P picture forward in the time axis direction, an already decoded I or P picture backward in the time axis direction, and a picture formed through interpolation of these two pictures. The most efficient one of encoding image data of a difference from each of the above three picture types after motion compensation and intra-encoding is selected in the unit of a macro block.
Since a B picture uses bidirectional prediction, it is necessary to use motion vector detector circuits for forward and backward predictions. Since the circuits refer to different image memories, the circuit scale is twice that required for a P picture, which uses only forward prediction.
Coding efficiency is generally improved by bidirectional prediction coding, which reduces prediction errors. However, little improvement in coding efficiency can be expected for images with rapid motion, such as real time sport scenes and images with frequent fast camera panning, and the decoded image quality is inevitably degraded.
In order to suppress this degradation, it is also necessary to reduce prediction errors for fast moving images. In improving the coding efficiency, it is therefore effective to expand a search range of motion compensation prediction and raise the precision of calculating a motion vector.
However, if the search range is expanded, the calculation amount increases correspondingly and the hardware amount of motion vector detector circuits increases, so that the cost of the apparatus becomes high.
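The growth in calculation amount can be estimated with a short sketch. It assumes an exhaustive (full) search in which every candidate position within ±R pixels is compared pixel by pixel against a 16×16 macro block; the function name and the ranges of 8 and 16 pixels are illustrative only.

```python
def abs_diff_ops_per_macroblock(search_range, block=16):
    """Absolute-difference operations for one macro block, full search."""
    candidates = (2 * search_range + 1) ** 2  # positions within +/-R pixels
    return candidates * block * block         # one operation per pixel


narrow = abs_diff_ops_per_macroblock(8)
wide = abs_diff_ops_per_macroblock(16)
print(narrow, wide, round(wide / narrow, 2))  # 73984 278784 3.77
```

Under these assumptions, doubling the search range from ±8 to ±16 pixels nearly quadruples the work per macro block, which is why expanding the range directly raises hardware cost.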
It is an object of the present invention under the above-described background art to provide an image processing apparatus and method capable of suppressing degradation of the decoding quality of special images and improving the image quality, without prolonging the process time and without a large increase in cost.
According to a preferred embodiment of the invention, in order to attain the above objective, there is provided an image processing apparatus (method) capable of coding image data through bidirectional prediction, comprising storage means (step) for storing a reference image in a forward direction, in a first memory area, a reference image in a backward direction, in a second memory area, and a reference image in a forward direction in an expanded range, in a third memory area; first motion vector detecting means (step) for detecting a motion vector by reading image data stored in the first memory area; second motion vector detecting means (step) for detecting a motion vector by reading image data stored in the second or third memory area; and coding means (step) for coding, through motion compensation prediction, input image data by using a motion vector detected by the first or second motion vector detecting means.
It is another object of the present invention to provide an image processing apparatus and method capable of quickly and precisely judging correlation of image data, which correlation is used as parameters for switching between coding process modes.
According to another preferred embodiment of the invention, in order to attain the above objective, there is provided an image processing apparatus (method) comprising absolute difference value calculating means (step) for calculating an absolute difference value of each pixel between image data of first and second fields constituting a frame image; first comparison means (step) for comparing the absolute difference value calculated by the absolute difference value calculating means with a first threshold value; calculation means (step) for calculating a sum total of comparison results of the first comparison means; and judging means (step) for judging a field correlation of the image data between the first and second fields in accordance with the sum total calculated by the calculation means.
According to still another preferred embodiment of the invention, there is also provided an image processing apparatus (method) comprising absolute difference value calculating means (step) for calculating an absolute difference value of each pixel between image data of first and second fields constituting a frame image; first comparison means (step) for comparing the absolute difference value calculated by the absolute difference value calculating means with a first threshold value; count means (step) for counting comparison results of the first comparison means; and judging means (step) for judging a field correlation of the image data between the first and second fields in accordance with a count value of the count means.
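The field correlation judgment described in the two preceding embodiments can be sketched as below. Several details are assumptions: the first field is taken as the even lines and the second field as the odd lines of the frame, each pixel is compared with the pixel directly below it in the other field, and the final judgment compares the count with a second threshold. The text states only that the judgment is made in accordance with the sum total or count value, so the second threshold and all names are hypothetical.

```python
def judge_field_correlation(frame, pixel_threshold, count_threshold):
    """Return (is_high_correlation, exceed_count) for one frame image."""
    first_field = frame[0::2]   # even lines (assumed field assignment)
    second_field = frame[1::2]  # odd lines
    exceed_count = 0
    for row_a, row_b in zip(first_field, second_field):
        for a, b in zip(row_a, row_b):
            # First comparison: per-pixel absolute difference vs. threshold.
            if abs(a - b) > pixel_threshold:
                exceed_count += 1
    # Judgment: few large differences means the two fields are similar.
    return exceed_count <= count_threshold, exceed_count


# A still frame: both fields carry the same content.
still = [[50] * 4 for _ in range(4)]
# A frame whose two fields differ strongly, as with fast motion.
moving = [[50] * 4 if y % 2 == 0 else [200] * 4 for y in range(4)]

still_result = judge_field_correlation(still, pixel_threshold=16, count_threshold=4)
moving_result = judge_field_correlation(moving, pixel_threshold=16, count_threshold=4)
print(still_result, moving_result)
```

Because each pixel contributes only a one-bit comparison result, the accumulation reduces to a counter rather than a wide adder for summed differences, which is one plausible reason such a scheme would be quick to evaluate.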
Other objects, features and advantages of the invention will become apparent from the following detailed description taken in conjunction with the accompanying drawings.