The present invention generally relates to a motion picture coding and/or decoding system and a motion picture coding and/or decoding method, and particularly, to a system and a method for coding and/or decoding an image-adaptive split region of a motion picture to permit a reproduction of the split region with a significant configuration.
1. Description of the Related Art
Recent years have observed an increased need for recording and/or transmitting a temporal sequence of motion pictures in an effectively compressed form.
To meet such a need, a typical system includes an encoder for encoding a sequence of input motion pictures each respectively into a sequence of binary symbols as compressed codes thereof (hereafter sometimes "bitstream" or "coded picture"), a recording medium for recording a sequence of such coded pictures or a transmission line or radio channel for transmitting the same, and a decoder for decoding the coded picture sequence to provide a sequence of decoded motion pictures (hereafter sometimes each respectively "decoded picture") representative of the input picture sequence. The decoded picture sequence is output to a display for a reproduction of the input picture sequence.
In the encoder, a motion picture input thereto in a current frame of time is always the picture of concern and is called "current input picture" (hereafter sometimes "current picture" or "input picture"). The input picture is necessarily compared with a so-called "reference picture", which is a picture representative of an input picture in a previous frame of time (hereafter sometimes "previous picture"), to obtain a set of inter-frame motion data therebetween. The reference picture is further employed in combination with the motion data to generate a motion-compensated inter-frame prediction picture (hereafter sometimes "prediction picture") of the input picture.
The prediction picture is subtracted from the input picture to obtain a set of reduced pixel data representing a "difference picture" therebetween, which pixel data are compressedly coded by a first coding including an orthogonal transformation and a quantization, to provide a set of coded compressed data (hereafter sometimes "compressed data").
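The first coding described above may be illustrated by the following minimal sketch. It assumes a square block, an orthonormal DCT-II as the orthogonal transformation and a uniform quantization step; the names `dct_matrix`, `first_coding` and `step` are hypothetical and are not taken from any cited system.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows (assumed transform)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def first_coding(input_block, prediction_block, step):
    """Difference picture -> orthogonal transformation -> quantization."""
    n = len(input_block)
    # Difference picture: input picture minus prediction picture.
    diff = [[input_block[y][x] - prediction_block[y][x] for x in range(n)]
            for y in range(n)]
    # Separable 2-D transform: C * diff * C^T.
    c = dct_matrix(n)
    coeffs = matmul(matmul(c, diff), transpose(c))
    # Uniform quantization (hypothetical step size) gives the compressed data.
    return [[round(v / step) for v in row] for row in coeffs]
```

For a 4 by 4 block whose input exceeds the prediction by a constant 8 at every pixel, the difference picture is flat, so with a step of 4 only the top-left (DC) level is nonzero. This shows how a good prediction leaves little data to be coded.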
The compressed data as well as the motion data are compressedly symbolized by a second coding to provide the bitstream as the coded picture to be recorded or transmitted.
The compressed data may be decompressedly decoded by a local decoding in the encoder to obtain a local decoded difference picture, which may be added to the prediction picture to generate a "local decoded picture", which may be employed as the reference picture in a subsequent frame of time and for a monitoring at the encoder side.
In the decoder, the bitstream is decoded by a first decoding to obtain a set of decoded compressed data and a set of decoded motion data. The decoded compressed data are decompressedly decoded by a second decoding including an inverse quantization and an inverse orthogonal transformation to obtain a decoded difference picture. The decoded motion data are employed in combination with a reference picture at the decoder side to generate a decoded prediction picture. The decoded difference picture is added to the decoded prediction picture to provide the decoded picture to be output to a display. The decoded picture is employed as the reference picture at the decoder side in the subsequent frame.
In any process of the system, a concerned picture or part thereof is always mapped in an imaginary picture frame as an overwritable white canvas so that any pixel position is defined in a common coordinate system.
FIG. 1 is a block diagram of a conventional motion picture coding and decoding system.
The conventional system comprises an encoder, a decoder and a transmission line provided therebetween.
In the encoder, a motion detector 301 refers to a local decoded picture D401 (i.e. reference picture) of a previous frame to detect inter-frame motions in an input picture PI relative thereto and to output a set of motion data D402. A motion-compensated inter-frame prediction circuit 302 is responsive to the motion data D402 to employ the local decoded picture D401 for generating therefrom and outputting a set of motion-compensated inter-frame prediction data D403 (i.e. prediction picture).
For the motion-compensated inter-frame prediction, there is employed a typical process in which the input picture PI is divided into blocks of a predetermined size to be sequentially processed. According to international standard coding systems such as ITU-T H.261 or ISO/IEC 11172-2 (MPEG-1) and 13818-2 (MPEG-2), an input picture is divided into blocks of a size of 16 pixels by 16 lines as a unit region to be processed for motion-compensated inter-frame prediction. The input picture PI may be divided into segments of an arbitrary predetermined size to be sequentially processed.
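The block-level motion detection described above may be sketched as follows. This is an illustration only: it assumes an exhaustive search minimizing the sum of absolute differences (SAD), which the text does not prescribe, and the names `detect_motion`, `block` and `search` are hypothetical. Pictures are lists of rows of pixel values.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a shifted reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def detect_motion(cur, ref, block=16, search=2):
    """Return {(bx, by): (dx, dy)} giving, per block, the best-matching reference offset."""
    height, width = len(cur), len(cur[0])
    vectors = {}
    for by in range(0, height - block + 1, block):
        for bx in range(0, width - block + 1, block):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidates that would read outside the reference picture.
                    if by + dy < 0 or by + dy + block > height:
                        continue
                    if bx + dx < 0 or bx + dx + block > width:
                        continue
                    cost = sad(cur, ref, bx, by, dx, dy, block)
                    # Ties are broken in favour of the shorter vector.
                    candidate = (cost, abs(dx) + abs(dy), dx, dy)
                    if best is None or candidate < best:
                        best = candidate
            vectors[(bx, by)] = (best[2], best[3])
    return vectors
```

The default block size of 16 follows the 16 pixels by 16 lines unit mentioned above; a smaller `block` may be passed for experiments on small pictures.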
In the encoder of FIG. 1, a difference calculator or subtractor 303 calculates a difference between the input picture PI and the motion-compensated inter-frame prediction data D403, to output a set of difference data (i.e. difference picture). An orthogonal transformation circuit 304 makes an orthogonal transformation of the difference data to output a set of orthogonally transformed data. A quantization circuit 305 makes a quantization of the orthogonally transformed data to output a set of quantized data D404 (i.e. compressed data).
Moreover, in the encoder, an inverse quantization circuit 306 makes an inverse-quantization of the quantized data D404 to output a set of inverse-quantized data. An inverse orthogonal transformation circuit 307 makes an inverse orthogonal transformation of the inverse-quantized data to output a set of inverse-orthogonally transformed data (i.e. local decoded difference picture). An adder 308 makes an addition of the inverse-orthogonally transformed data and the motion-compensated inter-frame prediction data D403 to output a local decoded picture D405 (of a current frame) to a memory 309, where it is stored as a set of data to be employed in a subsequent frame as the local decoded picture D401 (of a previous frame) for reference use in an encoding of an input picture.
Further, in the encoder, a coding circuit 310 converts the motion data D402 and the quantized data D404 into a bitstream D406 as a coded picture PC to be supplied via the transmission line to the decoder.
In the decoder, the bitstream D406 supplied from the encoder is inverse-converted by a decoding circuit 311 into a set of motion data D407 (i.e. decoded motion data) and a set of quantized data D408 (i.e. decoded compressed data). An inverse quantization circuit 312 makes an inverse quantization of the quantized data D408 to output a set of inverse-quantized data. An inverse orthogonal transformation circuit 313 makes an inverse orthogonal transformation of the inverse-quantized data to output a set of inverse-orthogonally transformed data (i.e. decoded difference picture).
Moreover, in the decoder, a motion-compensated inter-frame prediction circuit 314 employs the motion data D407 and a decoded picture D409 of the previous frame to generate therefrom and output a set of motion-compensated inter-frame prediction data D410 (i.e. decoded prediction picture). An adder 315 makes an addition of the motion-compensated inter-frame prediction data D410 and the inverse-orthogonally transformed data to provide a set of resultant data D411 as a decoded picture PD (of the current frame) to be externally output.
The decoded picture PD is input to a memory 316, where it is stored as a set of data to be employed in the subsequent frame as the decoded picture D409 (of the previous frame) for reference use in a decoding of a coded picture.
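The two reference loops of FIG. 1 (memory 309 at the encoder side and memory 316 at the decoder side) may be illustrated by the following simplified sketch. It rests on stated assumptions: pictures are flat lists of pixel values, the motion is taken as zero so that the prediction picture equals the reference picture, the orthogonal transformation is omitted, and the bitstream is replaced by the list of quantized levels. The class names and the `step` parameter are hypothetical.

```python
def quantize(values, step):
    return [round(v / step) for v in values]


def dequantize(levels, step):
    return [q * step for q in levels]


class Encoder:
    def __init__(self, size, step):
        self.step = step
        self.reference = [0] * size  # plays the role of memory 309

    def encode(self, picture):
        prediction = self.reference  # zero-motion prediction (assumption)
        diff = [p - r for p, r in zip(picture, prediction)]
        levels = quantize(diff, self.step)
        # Local decoding: rebuild exactly what the decoder will rebuild.
        local = [r + d for r, d in zip(prediction, dequantize(levels, self.step))]
        self.reference = local  # reference picture for the subsequent frame
        return levels


class Decoder:
    def __init__(self, size, step):
        self.step = step
        self.reference = [0] * size  # plays the role of memory 316

    def decode(self, levels):
        prediction = self.reference
        decoded = [r + d for r, d in zip(prediction, dequantize(levels, self.step))]
        self.reference = decoded  # reference picture for the subsequent frame
        return decoded
```

Because the encoder predicts from its local decoded picture rather than from the original input, both sides hold identical reference pictures after every frame, and the quantization error of one frame is not carried into the next.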
In the conventional system described, an entirety of a decoded or local decoded picture as a reference picture stored in a memory is referred to for a block-level motion-compensated inter-frame prediction of an input picture, without consideration of a composition or configuration of a picked-up image of an object that may extend over two or more blocks of a predetermined size.
Therefore, at either or both of encoder and decoder sides, when a concerned region consisting of one or more blocks is decoded in a current frame, if the remaining region is not decoded, then the decoded region is seldom self-completed with respect to information on individual images therein, as some of them should have been partially associated with the remaining region. Accordingly, in a subsequent frame, a reference picture fails to provide a complete set of necessary data for a motion-compensated inter-frame prediction.
When this reference picture is referred to (at a motion detector and a prediction circuit in an encoder and/or a prediction circuit in a decoder), the incompleteness of data is carried over into a set of resultant data, causing a distortion of each associated image in a local decoded picture and/or a decoded picture.
As a like process is repeated in every new frame, such a distortion will be accumulated, resulting in a failure to code and/or decode a significant picture.
To avoid such a failure, the conventional system needs to encode and decode an entirety of an input picture.
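The accumulation of distortion described above may be illustrated by a deliberately small sketch. All of its particulars are assumptions made for illustration: a one-dimensional picture of two blocks of four pixels, a scene shifting by one pixel per frame (motion vector +1), a difference picture of zero, and an undecoded block that is left filled with zeros. Only the first block is decoded in each frame.

```python
BLOCK = 4


def predict_block0(reference, dx):
    """Motion-compensated prediction of the first block from a reference picture."""
    return [reference[x + dx] for x in range(BLOCK)]


def errors_per_frame(frames=4):
    """Count wrong pixels in the decoded block for each successive frame."""
    # Underlying scene; the true picture in frame t is base[t : t + 2 * BLOCK].
    base = [10 * (i + 1) for i in range(2 * BLOCK + frames)]
    # Frame 0: first block decoded correctly, second block never decoded.
    reference = base[:BLOCK] + [0] * BLOCK
    errors = []
    for t in range(1, frames + 1):
        true_block = base[t:t + BLOCK]
        # The prediction reaches into the undecoded block of the reference.
        decoded_block = predict_block0(reference, 1)  # difference assumed zero
        errors.append(sum(1 for a, b in zip(decoded_block, true_block) if a != b))
        # The distorted block becomes the reference for the subsequent frame.
        reference = decoded_block + [0] * BLOCK
    return errors
```

Under these assumptions the number of wrong pixels in the decoded block grows by one in each frame until the whole block is distorted, since every prediction draws on data that were either never decoded or already distorted.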
To overcome such a deficiency, there has been proposed a picture selection method in the Japanese Patent Application Laid-Open Publication No. 4-186986, in which a television conference system receives a pair of separately supplied input pictures, one picture representing a continuous image of a still background and the other picture representing an intermittent image of a concerned person put on a continuous image of a mono-tone background, and selects either picture with a priority to a detection of the person image, before encoding the selected picture.
This conventional system, however, needs a pair of cameras, one for picking up the still background and the other for picking up the mono-tone background in front of which the concerned person sometimes appears. Moreover, the person cannot appear in the still background.
To this point, there has been proposed a characteristic-adaptive split method in the Japanese Patent Application Laid-Open Publication No. 3-133290, in which a static picture coding system splits an input picture into a plurality of split regions each configured in an adaptive manner to a static characteristic (e.g. tone-level, luminance or frequency) of the picture itself and processes every split region for a compression coding such that a coded picture consists of compressed split data and compressed pixel data, thus permitting a heading monitoring of a selected split region.
This conventional system, however, is unavailable for any application to a motion picture, because it provides no motion data.
In this respect, there has been proposed a motion picture adaptive coding method in the Japanese Patent Application Laid-Open Publication No. 1-228384, in which a motion picture coding system selects a quantization parameter that represents a preferable unity of a quantization of a difference picture in dependence on a pixel data of a difference picture and on a result of an undefined adaptive splitting of a local decoded picture or of a motion-adaptive splitting of a motion-compensated inter-frame prediction picture, whereas a local decoding of compressed data is independent of the result of splitting so that a result of the local decoding is subjected to a filtering depending on the split result to obtain the local decoded picture, and no split data is transmitted to a decoder, where like decoding to the local decoding may be effected.
Therefore, this conventional system has a reduced quantity of compressed data at the cost of a reduced image quality. Moreover, notwithstanding a perception of an adaptive splitting of a predicted or referenced motion picture, it is difficult for this system to permit a coding or decoding of a selected motion image.
The present invention has been achieved with such points in mind.