1. Field of the Invention
The present invention relates to a moving picture encoding apparatus and a moving picture encoding method for encoding moving picture data transmitted over a network or by radio communication.
2. Description of the Related Art
A moving picture encoding apparatus according to the present invention is an apparatus for performing frame-by-frame encoding using a moving picture encoding method such as ITU-T (International Telecommunication Union-Telecommunication Standardization Sector) recommendation H.26x or ISO/IEC standard MPEG, that is, a method based on motion compensation and an orthogonal transform (e.g. discrete cosine transform).
In general, in the moving picture encoding method represented by the ITU-T recommendation H.26x or ISO/IEC standard MPEG, input video signals are compressed based on spatial/temporal correlations. The compressed data is subjected to variable length encoding according to a predetermined procedure, thereby producing code sequences (bit streams).
A moving picture encoding method according to MPEG-4 will now be described as an example.
A video signal comprises a plurality of video object planes (VOPs). When a VOP has a rectangular shape, it corresponds to a frame or a field in MPEG-1 or MPEG-2. The video signal is compressed based on spatial/temporal correlations of VOPs.
The VOP comprises a luminance signal and a chrominance signal. The VOP comprises a plurality of macro blocks (MB). The MB for the luminance signal consists of 16 pixels in each of vertical and horizontal axes. Spatial/temporal compression is performed in units of an MB.
DCT (Discrete Cosine Transform) and quantization are employed in the spatial compression. MC (Motion Compensation) is used in the temporal compression.
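The spatial compression step can be illustrated with a minimal sketch. The 8x8 block size, the orthonormal DCT-II scaling, and the simple "divide by 2 x quantization scale and round" rule are assumptions made for illustration; they are not the exact quantizer of any particular standard, and the function names are hypothetical.

```python
import math

N = 8  # assumed block size for the transform


def dct_1d(v):
    """Orthonormal DCT-II of a length-N sequence."""
    out = []
    for k in range(N):
        s = sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for n in range(N))
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[y][u] for y in range(N)]) for u in range(N)]
    # cols[u][v] holds horizontal frequency u, vertical frequency v.
    return [[cols[u][v] for u in range(N)] for v in range(N)]


def quantize(coeffs, qscale):
    """Simplified uniform quantizer (assumption): step size is 2 * qscale."""
    return [[round(c / (2 * qscale)) for c in row] for row in coeffs]
```

For a flat block, all of the energy ends up in the single DC coefficient, so only one quantized value is non-zero; this is the property that the subsequent variable-length encoding exploits.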
The VOP-unit compression methods include an intra-coding type (intra-coding), in which encoding is effected by only spatial compression, and an inter-coding type (inter-coding), in which encoding is effected by both spatial compression and temporal compression.
In general, the VOP subjected to the intra-coding is called an I (Intra)-VOP. As regards VOPs subjected to the inter-coding, the VOP encoded with MC, using as a reference VOP a temporally preceding encoded VOP, is called a P (Predictive)-VOP. On the other hand, the VOP encoded with bi-directional MC, using, as reference VOPs, temporally preceding and following encoded VOPs, is called a B (Bi-directionally predictive)-VOP.
The reference VOP is a VOP (two VOPs at most) temporally adjacent to the VOP to be currently encoded. The reference VOP is selected from VOPs that were encoded as I-VOPs or P-VOPs and then decoded for use in inter-coding.
All MBs included in the I-VOP must be encoded by intra-coding. However, the MBs included in the P-VOP or B-VOP may be encoded by either the intra-coding or the inter-coding.
According to the intra-mode/inter-mode determination of MB encoding in the MPEG-4 Video Verification Model Version 6.0 of ISO/IEC JTC1/SC29/WG11, the intra-coding is used when a sum A of all absolute values of difference values relative to an average value of all pixels of the MB and an MC error SAD meet the following condition:

A < SAD − 2 × NB

wherein NB is the number of pixels in the MBs included in the VOP.
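The determination above can be sketched as follows. This is an illustrative reading of the condition, not the Verification Model's reference code: the function names are hypothetical, the mean is computed in floating point, and NB defaults to the full pixel count of the MB (256 for a 16x16 luminance MB), which is an assumption for the rectangular-VOP case.

```python
def deviation_sum(mb):
    """A: sum of absolute differences between each pixel and the MB mean."""
    pixels = [p for row in mb for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels)


def sad(mb, ref_mb):
    """MC error: sum of absolute differences against the reference MB."""
    return sum(abs(a - b)
               for row_a, row_b in zip(mb, ref_mb)
               for a, b in zip(row_a, row_b))


def choose_intra(mb, ref_mb, nb=None):
    """Return True when the condition A < SAD - 2 * NB selects intra-coding."""
    if nb is None:
        nb = sum(len(row) for row in mb)  # assumed: every pixel lies in the VOP
    return deviation_sum(mb) < sad(mb, ref_mb) - 2 * nb
```

A flat MB whose best reference block is very different yields a small A and a large SAD, so intra-coding is chosen; an MB that matches its reference exactly yields SAD = 0, so inter-coding is kept.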
The MB-unit encoding process will now be described.
In case that the VOP including an MB to be encoded is an I-VOP, a quantized DCT coefficient, which is obtained by subjecting a luminance signal and a chrominance signal to DCT and quantization, is compressed by variable-length encoding, and the resultant along with header information is processed according to a predetermined procedure, thus forming a bit stream.
On the other hand, in case that the VOP including an MB to be encoded is not the I-VOP, an encoded VOP, which is temporally adjacent to the VOP including the MB to be encoded, is used as a reference VOP. Using a motion detection method represented by a block matching method, an MB in the reference VOP is found, at which a difference value (MC error) in luminance signal, relative to the MB to be encoded, is minimum.
A vector indicating motion from an MB to be encoded to an MB at which the MC error takes a minimum value is called a motion vector.
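A minimal sketch of motion detection by block matching is given below. It assumes an exhaustive (full) search over a small window on the luminance signal, with the sum of absolute differences as the MC error; the function names, the window size, and the handling of frame boundaries (candidates outside the reference are skipped) are illustrative choices, not taken from the text.

```python
def block_sad(cur, ref, bx, by, dx, dy, size):
    """MC error between the current block and a displaced reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def block_match(cur, ref, bx, by, size=16, search=4):
    """Full search: return ((dx, dy), error) with the minimum MC error."""
    height, width = len(ref), len(ref[0])
    best_mv, best_err = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference picture.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            err = block_sad(cur, ref, bx, by, dx, dy, size)
            if best_err is None or err < best_err:
                best_mv, best_err = (dx, dy), err
    return best_mv, best_err
```

The returned pair (dx, dy) points from the position of the MB to be encoded to the position of the best-matching MB in the reference, in line with the definition of the motion vector above.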
The MC error is subjected to DCT and quantization. A quantized DCT coefficient obtained in connection with the acquired motion vector and the MC of the luminance and chrominance signals is compressed by variable length encoding, and the resultant along with header information is processed according to a predetermined procedure, thus forming a bit stream.
The moving picture encoding apparatus is required to produce a bit stream having an amount of codes, which is designated by predetermined encoding parameters. In addition, in order to prevent an overflow or underflow of data in a decoder-side buffer, the encoder side has to estimate an occupation amount in the decoder-side buffer and to control the code production amount.
This buffer is called a video buffering verifier (VBV) buffer.
In MPEG-4, the upper limit of the capacity of the VBV buffer is specified by the profile and level.
The code production amount is controlled by a quantization scale for quantizing a DCT coefficient, which is obtained by subjecting frames to DCT in units of an MB.
In general terms, the code production amount is inversely proportional to the quantization scale. Making use of this feature, the code production amount can be freely varied.
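Under the inverse-proportionality model described above, a quantization scale update can be sketched as follows. The model "bits = K / scale", the function name, and the permitted range of 1 to 31 are assumptions for illustration; an actual rate controller is typically more elaborate.

```python
Q_MIN, Q_MAX = 1, 31  # assumed permitted range of the quantization scale


def next_qscale(qscale, produced_bits, target_bits):
    """Pick the scale that would hit the target if bits = K / qscale held."""
    # From produced = K / qscale, K = produced * qscale; solve K / q' = target.
    ideal = qscale * produced_bits / target_bits
    # The scale cannot leave its permitted range, so the result is clamped.
    return max(Q_MIN, min(Q_MAX, round(ideal)))
```

For example, producing twice the target at scale 8 suggests scale 16 for the next unit. Once the ideal value falls outside the permitted range, the clamp takes over and the scale can no longer absorb the difference.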
However, since the range of the quantization scale is limited, it is not possible to control the code production amount on the basis of the quantization scale alone. If the code production amount is greater than a target value, a frame skip number is increased. If not, stuffing is performed.
If the frame skip number is increased, the frame encoding timing can be delayed and underflow of the VBV buffer can be prevented. On the other hand, overflow of the VBV buffer can be prevented by insertion of redundant bits, called “stuffing.”
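The interplay of frame skipping and stuffing can be sketched with a simplified buffer model. Everything here is an assumption for illustration: bits arrive at a constant amount per frame interval, each encoded frame is removed from the buffer in one step, the arrival is capped at the buffer size while waiting, and the function name is hypothetical. It is not the normative VBV definition.

```python
def simulate_vbv(frame_bits, bits_per_tick, buffer_size, occupancy=0):
    """Return one (skipped_frames, stuffing_bits, occupancy) tuple per frame."""
    if bits_per_tick <= 0:
        raise ValueError("bits_per_tick must be positive")
    if any(bits > buffer_size for bits in frame_bits):
        raise ValueError("a frame larger than the buffer can never be decoded")
    log = []
    for bits in frame_bits:
        skipped = 0
        occupancy = min(occupancy + bits_per_tick, buffer_size)
        # Underflow risk: the frame's bits have not all arrived yet, so the
        # encoding timing is delayed by skipping frame intervals.
        while bits > occupancy:
            skipped += 1
            occupancy = min(occupancy + bits_per_tick, buffer_size)
        occupancy -= bits
        # Overflow risk: the next arrival would exceed the buffer, so
        # redundant (stuffing) bits are added to this frame and removed with it.
        stuffing = max(0, occupancy + bits_per_tick - buffer_size)
        occupancy -= stuffing
        log.append((skipped, stuffing, occupancy))
    return log
```

With `simulate_vbv([3000, 5500, 500], 4000, 6000)`, the second frame is too large for the bits received so far and costs one skipped interval, while the small third frame requires 2000 stuffing bits to keep the next arrival within the buffer.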
As regards a scene with a large degree of motion, it is better to decrease the frame skip number in order to enhance the precision of prediction of motion. On the other hand, in a scene with a large degree of motion, the code production amount generally increases and underflow tends to occur in the VBV buffer.
To cope with this problem, in case that scenes with a large degree of motion continue for a relatively long time, the frame skip number has to be increased in order to prevent underflow of the VBV buffer. However, if the frame skip number is increased, the degree of correlation with the reference VOP used for MC decreases.
In a case of a scene with a particularly high degree of motion, it is highly possible that an object in an image moves to an area outside an area in which motion of an object is compensated. In this state, if predictive encoding is performed between less correlated VOPs, the motion vector increases and the MC error also increases.
The motion vector is encoded as a difference value from a predictive value of the motion vector, which is obtained from the motion vectors of adjacent MBs (blocks). In most cases, since motion vectors of adjacent MBs or blocks are the same or similar, code sequences whose length increases with the difference value are assigned to these motion vectors.
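The differential encoding of motion vectors can be sketched as below. The text does not state how the predictive value is formed; the component-wise median of three neighbouring vectors (left, above, above-right) is used here as an assumption, and the function names are hypothetical.

```python
def median3(a, b, c):
    """Middle value of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(left, above, above_right):
    """Assumed predictor: component-wise median of three neighbouring vectors."""
    return (median3(left[0], above[0], above_right[0]),
            median3(left[1], above[1], above_right[1]))


def mv_difference(mv, neighbours):
    """Difference value that is actually passed to variable-length encoding."""
    px, py = predict_mv(*neighbours)
    return (mv[0] - px, mv[1] - py)
```

When the neighbours move the same way as the current MB, the difference is (0, 0) and costs very few bits. When the current vector is (-12, 9) and the neighbours are (3, 1), (0, 0) and (2, 1), the difference is (-14, 8), which needs a much longer code.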
As mentioned above, an MB of an inter-coded VOP may be encoded by inter-coding or intra-coding. In the aforementioned predictive encoding between less correlated VOPs, the number of intra-coded MBs is relatively large.
Taking the above into account, when the degree of correlation between the VOP to be encoded and the reference VOP is high, the difference value of the motion vector and the MC error decrease. Thus, the code production amount of the inter-coded VOP is remarkably reduced, compared to that of the intra-coded VOP.
In the prior-art moving picture encoding apparatus, in case that the correlation between the VOP to be encoded and the reference VOP is high, the VOP is inter-coded.
On the other hand, when the frame skip number is large and the degree of correlation between the VOP to be encoded and the reference VOP is low, the number of MBs to be intra-coded according to the intra-mode/inter-mode determination of the MB encoding is relatively large. In this case, intra-coded MBs and inter-coded MBs tend to be mixed in a disorderly manner. Consequently, the difference value from the motion vector prediction value of the inter-coded MB increases and the amount of codes increases. In addition, the MC error is relatively large, and the amount of codes of the VOP increases disadvantageously.
If the amount of codes increases, the frame skip number is further increased in a vicious spiral, and the efficiency in encoding deteriorates.
In the prior-art moving picture encoding apparatus, in case that the degree of correlation between the VOP to be encoded and the reference VOP is low during inter-coding, the intra-coding is performed in units of an MB. Consequently, the amount of codes of the VOP increases disadvantageously.