1. Field of the Invention
This invention relates to a coding method and a coding apparatus that code a digital video signal while reducing its amount of data.
2. Description of Related Art
Some conventional methods of coding digital video signals are disclosed in Japanese Patent Application Laid-Open No. 1-253382 (1989) and U.S. Pat. No. 4,394,774. The following explanation will be given referring to these conventional methods.
Transmitting or recording video signals in digital form can reduce the effect of channel noise or reading noise on the quality of the displayed image, and such digital signals are easily transmitted over telephone-type digital networks. However, digitizing a television image sequence produces such a high bit rate that digitized color television signals generally cannot be transmitted or recorded directly by existing carriers. According to CCIR Recommendation 601, the digitization rate of a color television signal is 216 Mbits/s. Therefore, to match the digitized color television signal to practical transmission and recording rates, it is important to reduce this rate.
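The 216 Mbits/s figure can be checked with a short calculation. The sampling parameters below (13.5 MHz luminance, two colour-difference signals at 6.75 MHz each, 8 bits per sample) are not stated in the text above; they are the commonly cited CCIR 601 4:2:2 values and are used here as an assumption.

```python
# Assumed CCIR 601 (4:2:2) sampling parameters; not given in the text above.
LUMA_HZ = 13_500_000      # luminance sampling frequency
CHROMA_HZ = 6_750_000     # each of the two colour-difference signals
BITS_PER_SAMPLE = 8


def digitization_rate_bits_per_s() -> int:
    """Bit rate of one luminance and two colour-difference channels."""
    samples_per_s = LUMA_HZ + 2 * CHROMA_HZ
    return samples_per_s * BITS_PER_SAMPLE


rate = digitization_rate_bits_per_s()
print(rate / 1e6, "Mbits/s")
```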
U.S. Pat. No. 4,394,774 describes a method of reducing the digitization rate by a factor of 10 to 20, that is, to 1/10 to 1/20 of the original rate. The method is based on an orthogonal transform and compresses the signal by exploiting the redundancy between neighboring picture elements in the image. The image is divided into blocks of the same size, and an orthogonal transform is applied to each block; the transform decorrelates the picture elements of the block and concentrates the energy into a small number of coefficients.
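The energy-concentrating effect of a block transform can be illustrated with a minimal sketch. The text does not name a particular orthogonal transform; the discrete cosine transform (DCT-II, orthonormal scaling), the 8 x 8 block size, and the smooth test block used here are assumptions chosen only for illustration.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]   # indexed as coeffs[v][u]


def energy(m):
    return sum(v * v for row in m for v in row)


# Hypothetical smooth block: neighbouring picture elements are highly correlated.
block = [[100 + 4 * x + 2 * y for x in range(8)] for y in range(8)]
coeffs = dct_2d(block)

# Share of the total energy held by the four lowest-frequency coefficients.
low = sum(coeffs[v][u] ** 2 for v in range(2) for u in range(2))
fraction_low = low / energy(coeffs)
print(f"{fraction_low:.4f}")
```

Because the transform is orthonormal, the total energy of the coefficients equals that of the picture elements; only its distribution changes, which is what allows most coefficients to be coded coarsely or dropped.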
In order to gain a comparable reduction of information from the image-to-image redundancy that exists in still parts of an image, this method is often combined with an image-to-image prediction technique.
According to this technique, either the block itself is transmitted (intra-frame mode) or the difference between this block and the block at the same spatial position in the preceding image, after coding and decoding, is transmitted (inter-frame mode). Whichever of the two has the lower energy is transmitted.
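The choice between the two modes can be sketched as follows. The energy measure (sum of squared sample values) and the function names are assumptions made for illustration; the text only states that the lower-energy alternative is transmitted.

```python
def block_energy(block):
    """Assumed energy measure: sum of squared sample values."""
    return sum(v * v for row in block for v in row)


def choose_mode(current, previous_decoded):
    """Return ("inter", difference) or ("intra", block), whichever has less energy.

    `previous_decoded` is the co-located block of the preceding image
    after coding and decoding.
    """
    diff = [[c - p for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(current, previous_decoded)]
    if block_energy(diff) < block_energy(current):
        return "inter", diff
    return "intra", current


# Still area: the block barely changes, so the difference is cheaper to send.
still_mode, still_payload = choose_mode([[50, 52], [51, 53]], [[50, 52], [51, 52]])

# Scene change: the preceding block is unrelated, so the block itself is sent.
cut_mode, cut_payload = choose_mode([[5, 4], [6, 5]], [[200, 10], [90, 180]])
print(still_mode, cut_mode)
```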
However, this image-to-image prediction introduces temporal recursivity into the coding: an image can be decoded only if the decoded preceding image is available, so the video signal can be decoded only in the normal reproducing mode. A further consequence is that an error arising in reception of the video signal or in reading of a tape propagates over several images. In fact, the error may persist in the image for as long as the block in which it appears continues to be coded in the inter-frame mode.
Moreover, this recursivity is not compatible with consumer (home) video recording, because it rules out random access to an image, and random access is necessary to realize a "quick search mode". To mitigate this drawback, one image in every N images is sometimes coded in the intra-frame mode. However, as this degrades the quality of the displayed image, N must be large to limit the degradation, so the improvement is limited.
Japanese Patent Application Laid-Open No. 1-253382 (1989) provides a method of coding a video signal that exploits image-to-image correlation without introducing image-to-image recursivity. That is, this publication provides a method which is compatible with consumer video recording and is not sensitive to channel errors.
In order to realize this coding method, the following steps are provided:
(a) a preliminary step of estimating the principal movement from one image to the next, so as to associate with each image a displacement vector relative to the preceding image, the principal displacement vector being the vector for which the image-to-image difference is minimum; and
(b) a preliminary scan-conversion step that defines the form of the three-dimensional (3-D) blocks, by dividing the sequence of video signals corresponding to the images into groups, each corresponding to N consecutive images, and by defining three-dimensional blocks that each comprise M lines and P picture elements per line in the image plane, on the one hand, and extend over the N consecutive planes corresponding to the N images of the group, on the other hand, the N two-dimensional blocks of M lines and P picture elements that compose each three-dimensional block of a group being spatially shifted from one image to the next by the displacement vector estimated for each image.
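Steps (a) and (b) can be sketched as follows. This is a minimal illustration, not the method of the cited publication: the exhaustive search over a small window, the mean-absolute-difference criterion, and all function names are assumptions. A displacement (dx, dy) here means that content at position (x, y) in the preceding image appears at (x + dx, y + dy) in the current image.

```python
def frame_difference(cur, prev, dx, dy):
    """Mean absolute difference between `cur` and `prev` shifted by (dx, dy)."""
    h, w = len(cur), len(cur[0])
    total = count = 0
    for y in range(h):
        for x in range(w):
            py, px = y - dy, x - dx
            if 0 <= py < h and 0 <= px < w:   # compare the overlapping area only
                total += abs(cur[y][x] - prev[py][px])
                count += 1
    return total / count


def estimate_displacement(cur, prev, search=2):
    """Step (a): the vector whose image-to-image difference is minimum."""
    candidates = (
        (frame_difference(cur, prev, dx, dy), abs(dx) + abs(dy), dx, dy)
        for dy in range(-search, search + 1)
        for dx in range(-search, search + 1)
    )
    _, _, dx, dy = min(candidates)
    return dx, dy


def build_3d_block(frames, vectors, top, left, m, p):
    """Step (b): N blocks of M lines x P picture elements, each shifted by the
    displacement estimated for its image (the caller keeps them in bounds)."""
    blocks, x, y = [], left, top
    for frame, (dx, dy) in zip(frames, vectors):
        x, y = x + dx, y + dy
        blocks.append([row[x:x + p] for row in frame[y:y + m]])
    return blocks


def make_frame(t, size=8):
    """Hypothetical test scene that pans right by one picture element per image."""
    return [[(x - t) ** 2 + 3 * y * y for x in range(size)] for y in range(size)]


frames = [make_frame(t) for t in range(3)]                     # a group of N = 3
vectors = [(0, 0)] + [estimate_displacement(frames[i], frames[i - 1])
                      for i in range(1, len(frames))]
cube = build_3d_block(frames, vectors, top=2, left=2, m=2, p=3)
print(vectors)
```

In this panning example the shifted blocks are identical from image to image, so a transform along the temporal direction would see no change, even though every picture element of the scene has moved.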
This method makes it possible to exploit the temporal redundancy of the signal, through the decorrelation achieved by the orthogonal transform, in still parts of an image that have no substantial displacement. It can also be applied to a general movement of the scene or of the camera, and even to the case where the movement affects almost all parts of the scene. In these last two cases, the method is superior to one that uses an inter-frame mode and an intra-frame mode, because the inter-frame/intra-frame process, although it exploits inter-frame correlation, does not take these displacements into consideration. Moreover, as the effect is limited to N images, this process does not introduce any image-to-image recursivity at the time of coding, which ensures satisfactory immunity to noise together with compatibility with the "quick search mode" provided in a video recorder.
This process is especially effective when the rate reduction is applied to a video signal in non-interlaced form. If the available signal is interlaced, it is converted before coding so as to produce a non-interlaced video signal. Accordingly, as shown in FIG. 1, each image consists of one frame, and a three-dimensional block is formed by taking the horizontal direction as the first dimension, the vertical direction as the second dimension, and the temporal direction as the third dimension; the redundant components of the image signal are then reduced by an orthogonal transform.
However, actual television uses an interlaced scanning form, as shown in FIG. 2. In transmitting moving-picture data, this form is advantageous for preventing flicker without increasing the amount of information to be transmitted. The scanning of one screen is completed with half the number of scanning lines, as shown in FIG. 2. In the next screen, the lines that were not scanned in the previous screen are scanned, which limits the loss of vertical resolution. With this interlaced scanning form, the number of screens transmitted in the same time is double that of sequential scanning, so flicker is suppressed. Each such coarsely scanned screen is called a field, and two consecutive fields form a frame, as shown in FIG. 3; the scanning rate is 60 fields per second in the NTSC (National Television System Committee) system.
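The relation between fields and frames can be sketched as follows. The function names are hypothetical, the frame is assumed to have an even number of lines, and the sketch ignores the fact that the two fields of a real interlaced signal are captured at different instants.

```python
FIELDS_PER_SECOND = 60                     # nominal NTSC field rate, as stated above
FRAMES_PER_SECOND = FIELDS_PER_SECOND // 2 # two consecutive fields form one frame


def split_fields(frame):
    """Split a frame into the field of even-numbered lines and the field of
    odd-numbered lines; each field has half the scanning lines."""
    return frame[0::2], frame[1::2]


def merge_fields(first, second):
    """Interleave two fields line by line to rebuild the full frame."""
    frame = []
    for line_a, line_b in zip(first, second):
        frame.append(line_a)
        frame.append(line_b)
    return frame


frame = [[10 * y + x for x in range(4)] for y in range(6)]   # 6 lines of 4 elements
first_field, second_field = split_fields(frame)
rebuilt = merge_fields(first_field, second_field)
print(len(first_field), len(second_field), FRAMES_PER_SECOND)
```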
In the conventional coding method, because the three-dimensional block is composed from a non-interlaced signal, the redundancy of an interlaced image signal cannot always be reduced effectively. In particular, when an interlaced image signal containing a great deal of motion is converted into non-interlaced form, spatial displacement and temporal displacement are mixed within each two-dimensional plane, and the redundancy of the image signal cannot be reduced effectively.
Furthermore, when a digital video signal in interlaced scanning form is coded, the spatial displacement between adjacent fields is converted into temporal displacement by the interlaced scanning, so that a pseudo-moving part appears even in a completely still picture. Accordingly, when weighting and quantizing are applied to the high-order coefficients after the orthogonal transform in the temporal direction, in order to reduce the information content of moving pictures, they are also applied to the pseudo-moving part, and the quality of the still picture deteriorates at the decoding side. To solve these problems, it is necessary to judge for every 3-D block whether the picture is moving or still, and to perform weighting and quantizing at different levels for a moving picture and for a still picture.
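The pseudo-moving part can be reproduced with a small numerical sketch. The test pictures and function names are hypothetical. A completely still picture with vertical detail is split into its two fields; because the fields sample different lines, they differ from each other even though nothing in the scene moves, and a transform along the temporal direction would see that difference as a temporal change.

```python
def fields_of(frame):
    """Even-line field and odd-line field of one frame."""
    return frame[0::2], frame[1::2]


def mean_abs_diff(a, b):
    """Mean absolute difference between two fields of equal size."""
    total = sum(abs(p - q) for row_a, row_b in zip(a, b)
                for p, q in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))


# Still picture with a vertical brightness gradient: line y has value 10 * y.
gradient_frame = [[10 * y] * 8 for y in range(8)]
# Still picture with no vertical detail at all.
flat_frame = [[50] * 8 for _ in range(8)]

pseudo_motion = mean_abs_diff(*fields_of(gradient_frame))
no_motion = mean_abs_diff(*fields_of(flat_frame))
print(pseudo_motion, no_motion)
```

Only the picture with vertical detail shows a field-to-field difference, which is why a field difference alone cannot distinguish a still picture from a moving one.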