1. Field of the Invention
The present invention relates to a bandwidth compressed transmission system for transmitting a wide-band high definition color television picture signal which is rearranged to be well adapted to a narrow band transmission and more particularly to a bandwidth compressed transmission system in which motion detection can be satisfactorily and quickly carried out on the side of a decoder.
2. Description of the Prior Art
NHK (Nippon Hoso Kyokai or Japan Broadcasting Corporation) has proposed a bandwidth compressed transmission system for broadcasting 1125-line HDTV (High Definition Television) pictures with a 5:3 aspect ratio on one channel. This bandwidth compression system is called MUSE (Multiple Sub-Nyquist Sampling Encoding), and is a motion compensated subsampling system.
The above-described MUSE system will be briefly described below. First, sampling and interpolation in the MUSE system will be described.
A combination of the phase-alternating Sub-Nyquist sampling method and a technique used in Motion-Compensated Interframe Coding was applied to bandwidth-reduction for the analog transmission of high-definition television, and equipment for a 1125-line system has been developed.
Table 1 gives the most important characteristics of the MUSE system, and FIG. 1 illustrates the sampling pattern of the system. The sampling is of a multiple dot-interlace type, and the cycle of the sequence is a period of four fields.
For a still picture-area (portions of the field where the picture is still), an HDTV picture can be reconstructed by temporal interpolation, using samples of signals from all four fields. A transmissible region of the spatial-frequency domain for a still picture-area is shown in FIG. 2A.
For a moving picture-area, the final picture is constructed by spatial interpolation, using signals sampled from a single field. If the signals of two or more fields are used to reconstruct a moving picture, the technical quality of the picture is degraded because of multi-line blur.
By using spatial interpolation, the transmissible area is narrowed, as shown in FIG. 2B. This shows that the picture will be blurred in moving portions of the picture with an uncovered background. However, this degradation of quality is not serious, because the human perception of sharpness is not very sensitive to blur in moving portions of the picture.
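By way of illustration only, the two interpolation modes described above may be sketched as follows. The sketch reduces the four-field dot-interlace pattern of FIG. 1 to a single scanning line in which each field carries every fourth sample; this one-dimensional pattern, the linear interpolation, and all function names are assumptions of the sketch and are not taken from the MUSE system itself.

```python
def subsample(line, phase, cycle=4):
    """Keep every `cycle`-th sample starting at `phase`; None marks samples not sent."""
    return [v if i % cycle == phase else None for i, v in enumerate(line)]


def temporal_interpolate(fields):
    """Still area: merge the samples received over all fields of one cycle."""
    out = [None] * len(fields[0])
    for field in fields:
        for i, v in enumerate(field):
            if v is not None:
                out[i] = v
    return out


def spatial_interpolate(field):
    """Moving area: fill the gaps by linear interpolation within a single field."""
    known = [i for i, v in enumerate(field) if v is not None]
    out = []
    for i in range(len(field)):
        left = max((k for k in known if k <= i), default=None)
        right = min((k for k in known if k >= i), default=None)
        if left is None:
            out.append(field[right])
        elif right is None:
            out.append(field[left])
        elif left == right:
            out.append(field[i])
        else:
            t = (i - left) / (right - left)
            out.append((1 - t) * field[left] + t * field[right])
    return out


# A line with fine detail (alternating 0 and 10), assumed to be still.
line = [0, 10, 0, 10, 0, 10, 0, 10]
fields = [subsample(line, phase) for phase in range(4)]

still_result = temporal_interpolate(fields)      # all four fields: detail recovered
moving_result = spatial_interpolate(fields[0])   # one field only: detail is lost
```

In this sketch the four-field merge returns the original line, whereas the single-field result is flat, which corresponds to the narrowed transmissible area for moving portions.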
TABLE 1
Characteristics of the MUSE system
__________________________________________________________________________
System                      Motion-compensated multiple subsampling system
                            (Multiplexing of C signal is TCI format)
Scanning                    1125/60 2:1
Bandwidth of transmission   8.1 MHz (-6 dB)
baseband signal
Resampling clock rate       16.2 MHz
Horizontal bandwidth  (Y)   20-22 MHz (for stationary portion of the picture)
                            12.5 MHz* (for moving portion of the picture)
                      (C)   7.0 MHz (for stationary portion of the picture)
                            3.1 MHz* (for moving portion of the picture)
Synchronization             Positive digital synchronization
Audio and additional        PCM multiplexed in VBLK using 4-phase DPSK
information                 (2048 Kb/s)
__________________________________________________________________________
*Values of a prototype receiver: these values should be 16 MHz and 4 MHz, if a perfect digital two-dimensional filter could be used.
In the case of movement caused by panning and tilting, the blur is more noticeable. To avoid this effect of spatial interpolation, motion-compensation is introduced. A vector representing the motion of a scene is calculated for each field by the encoder, and a vector signal is multiplexed in the vertical blanking period and transmitted to the receiver. In the decoder, the positions of the sampled picture-elements of the preceding field are shifted according to the motion vector.
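The shifting operation performed in the decoder may be sketched as follows. This is a minimal illustration assuming a whole-pixel motion vector (dx, dy) applied uniformly to the entire field; the function name, the fill value used for uncovered positions, and the list-of-rows representation are assumptions of the sketch.

```python
def shift_field(field, dx, dy, fill=0):
    """Shift a field (list of rows) by the motion vector (dx, dy).

    Positive dx moves picture content to the right, positive dy moves it down.
    Positions with no source sample are set to `fill`.
    """
    height = len(field)
    width = len(field[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_x, src_y = x - dx, y - dy
            if 0 <= src_x < width and 0 <= src_y < height:
                out[y][x] = field[src_y][src_x]
    return out


# Preceding field, and a pan of one pixel to the right.
previous_field = [[1, 2, 3],
                  [4, 5, 6]]
compensated = shift_field(previous_field, dx=1, dy=0)
```

After such a shift, the samples of the preceding field are aligned with the current field, so that they may be combined with it as if the scene were still.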
Together with this motion-compensation, temporal interpolation can be applied to panned or tilted scenes with no resultant blur. As shown in FIG. 2B, the maximum vertical transmissible frequency for moving portions of the picture is only half that for still portions because of the 2:1 interlace scanning of the original HDTV signal. If temporal interpolation is used for a still portion of the picture, the maximum transmissible vertical spatial frequency is doubled and equal to 1/2h, where h is the spacing between two adjacent horizontal scanning lines.
Next, the system construction will be described. Block diagrams of a MUSE transmitter and receiver are shown in FIGS. 3A and 3B. First, the HDTV video signal is encoded into a TCI signal by a TCI encoder 2. One example of a waveform of TCI with a line-sequential chrominance signal is illustrated in FIG. 4. The sampling frequency of the TCI signal is 64.8 MHz. Before the signal is subsampled at 16.2 MHz, prefilters 4 and 6, for still and moving areas respectively, are applied according to whether the portion of the picture is still or moving. Ideal characteristics for these two filters 4 and 6 are shown in FIGS. 2A and 2B, respectively.
A mixer 8 mixes the outputs of the two filters 4 and 6. A mixing ratio of the mixer 8 corresponds to the motion of the picture, which is detected pixel-wise. The mixed output is subsampled by a subsampling circuit 10. Certain control signals, like motion vectors, are combined with the subsampled signal by a multiplexer 12. The combined MUSE signal is then FM-modulated by an FM modulator 14.
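The pixel-wise mixing may be sketched as follows. The text states only that the mixing ratio corresponds to the detected motion; the linear cross-fade used here, the motion value k normalized to the range 0 to 1, and the function name are assumptions of the sketch.

```python
def mix(still_output, moving_output, motion):
    """Blend two filter outputs pixel by pixel.

    motion[i] = 0.0 selects the still-area output, 1.0 the moving-area output,
    and intermediate values give a proportional blend.
    """
    return [(1.0 - k) * s + k * m
            for s, m, k in zip(still_output, moving_output, motion)]


still_output = [10, 10, 10]    # output of the still-area prefilter
moving_output = [20, 20, 20]   # output of the moving-area prefilter
motion = [0.0, 0.5, 1.0]       # detected motion per pixel (hypothetical values)

mixed = mix(still_output, moving_output, motion)
```

A gradual blend of this kind avoids a visible switching boundary between still and moving portions; a hard switch would correspond to restricting k to 0 or 1.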
An audio signal is modulated by 4-phase DPSK in a PSK modulator 22. The DPSK signals are multiplexed with the video signal by a switch 15, utilizing the vertical blanking interval, after frequency-modulation by the MUSE signal. The control signals are transmitted in the vertical blanking interval and are multiplexed with the baseband signal.
In the receiver, as shown in FIG. 3B, the received signal is demodulated by an FM demodulator 24 and a PSK demodulator 26 to obtain demodulated video and audio signals, respectively. The demodulated video signal is demultiplexed by a demultiplexer 28 to obtain demultiplexed video and control outputs. The video output is applied to a spatial interpolator 32 and a temporal interpolator 34. The two interpolators 32 and 34 are employed according to whether the portion of the picture is moving or still. That is, a moving area is detected by a detector 36, and the detected signal controls a mixer 38. The output from the mixer 38 is applied to a TCI decoder to obtain a video signal corresponding to the original video signal.
The mixer 38 mixes the outputs from the spatial and temporal interpolators 32 and 34. The mixer 38 should be controlled pixel-wise, but if a pixel-wise control signal were sent from the transmitter, its transmission rate would be so high that it could not be transmitted. Motion must, therefore, be detected by the receiver, using the subsampled transmitted signal, and in the following MUSE system proposed by NHK, motion can be detected accurately.
In the MUSE system, motion detection is conducted as follows. Whether a picture element is in a moving portion or a still portion of the picture can, in principle, be determined from the signal difference with the preceding frame. Exact interframe differences cannot be obtained from the transmitted MUSE signal because it is subsampled, but the difference between a frame and the next frame but one can be obtained exactly, and is used instead of the real differences. In some cases, real movement information is not given by this method, as shown in FIG. 5. The moving portion labelled β cannot be detected from the signal of the next frame but one. The simplest way to overcome this difficulty is to extend temporally the difference in the next frame but one, as shown in FIG. 5.
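The detection described above may be sketched as follows. The sketch assumes that each frame is given as a list of samples taken at the same positions as the frame two frames earlier, that motion is declared when the absolute difference exceeds a threshold, and that the temporal extension holds a detected difference for one additional frame by a logical OR. The threshold value, the one-frame hold, and the function names are assumptions of the sketch; the text states only that the difference is extended temporally.

```python
def two_frame_difference(frame, frame_before_last, threshold):
    """Flag samples whose level differs from the frame two frames earlier."""
    return [abs(a - b) > threshold for a, b in zip(frame, frame_before_last)]


def detect_motion(frames, threshold=5, extend=True):
    """Return motion flags for frames[2:], optionally extended by one frame."""
    flags = []
    previous = None
    for n in range(2, len(frames)):
        current = two_frame_difference(frames[n], frames[n - 2], threshold)
        if extend and previous is not None:
            # Temporal extension: keep the preceding detection alive.
            flags.append([c or p for c, p in zip(current, previous)])
        else:
            flags.append(current)
        previous = current
    return flags


# One sample position; an object covers it during frame 2 only.
frames = [[0], [0], [100], [0], [0]]

raw = detect_motion(frames, extend=False)
extended = detect_motion(frames, extend=True)
```

In this example the sample changes between frames 2 and 3, yet frame 3 is identical to frame 1, so the unextended difference misses the change at frame 3. With the extension, the detection made at frame 2 is carried over and frame 3 is flagged as moving.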
For almost all HDTV pictures, this motion detecting method can be used, but there are a few exceptions, such as a grid pattern panned at a particular speed, which gives the same partial pictures as the preceding frame and yields no movement information.
For such a scene, a quasi-interframe difference is employed, which is the difference between the current frame and the preceding frame, each obtained by spatial interpolation; the spatially interpolated picture is, of course, blurred.
With this method, a still portion of the picture which has a high spatial frequency component may be judged as a moving portion. The use of the quasi-interframe difference should therefore be limited to portions in which the motion cannot be detected with the difference between a frame and the next frame but one. The field signal is therefore separated into about 500 blocks, and which method is to be used for a still or motion picture portion is judged blockwise according to block control signals multiplexed in the vertical blanking period.
However, on the decoder side of the above-described MUSE system, there arises a problem in the detection of the moving picture area which is required to process signals by discriminating moving picture portions from still picture portions. That is, in the MUSE system, the subsampling cycle consists of two frames so that in the case of detection of motion, "an interframe difference" cannot be used (because of non-existence of an object for which a difference in motion is to be detected in one subsampling cycle). As a result, "a difference between next adjacent frames" must be detected, and consequently motion detection is unsatisfactory. In this specification, the term "interframe difference" is used to designate a signal level difference between, for example, first and second frames and the term "difference between next adjacent frames" is used to designate a signal level difference between, for example, the first and third frames.
The reason why motion detection is unsatisfactory will be described in more detail hereinafter.
With respect to a still picture portion, interpolation can be made by using a signal in the previous frame, whereas such interpolation cannot be applied to a motion picture portion. Consequently, interpolation for a motion picture portion is made by using a signal within a frame. Because of these different modes of interpolation, the picture must be segmented into still and motion regions.
It follows, therefore, that on the decoder (receiver) side, information of moving pictures must be detected with a high degree of accuracy in accordance with the transmitted picture signal, but in the MUSE system, the subsampling cycle consists of two frames as described above, so that information of moving pictures must be detected between two next adjacent frames and consequently motion detection is essentially incomplete.
The above-described relationship may be viewed from a different standpoint as follows. Assume that a signal having a spectrum as shown in FIG. 6A is sampled at 32 MHz (a first sampling frequency) and subsequently at 16 MHz (a second sampling frequency). Then, as shown in FIGS. 6B and 6C, a high frequency component (8 MHz-24 MHz) of the transmitted baseband is aliased. In this case, the low and high frequency components are of course held in an interleaving relationship with each other, so that they do not overlap each other. The term "the same phase between frames" used in FIG. 6C refers to the fact that when the high frequency component is aliased, the amplitude of the corresponding signal (for instance, 8-12 MHz) has the same phase in the succeeding frames. A similar definition is also applicable to the term "the same phase between fields".
However, as a result of the interframe offset subsampling, i.e., the second subsampling, the amplitudes of the high frequency components are opposite in phase, i.e., shifted by 180°, in the succeeding frames, so that "an interframe difference" cannot be obtained from the waveform as shown in FIG. 6C. As a result, motion information must be derived from signals of two frames in which the amplitudes of the high frequency components have the same phase.
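The phase reversal of the aliased component may be illustrated numerically as follows. The sketch takes one line of a still picture sampled at the first sampling frequency of 32 MHz and subsamples it 2:1 with a clock phase that alternates from frame to frame. The 10 MHz test tone, the representation of untransmitted samples as zeros, and the function names are assumptions of the sketch.

```python
import math

FIRST_SAMPLING_MHZ = 32.0   # first sampling frequency, from the text
TONE_MHZ = 10.0             # hypothetical high frequency component


def offset_subsample(samples, frame_index):
    """Keep every other sample; the kept phase alternates with the frame index."""
    phase = frame_index % 2
    return [v if n % 2 == phase else 0.0 for n, v in enumerate(samples)]


def alias_term(samples):
    """Component folded back by 2:1 subsampling: x[n] multiplied by (-1)**n."""
    return [v * (-1) ** n for n, v in enumerate(samples)]


# One line of a still picture: identical in every frame.
still_line = [math.cos(2 * math.pi * TONE_MHZ * n / FIRST_SAMPLING_MHZ)
              for n in range(16)]

frame0 = offset_subsample(still_line, 0)
frame1 = offset_subsample(still_line, 1)
frame2 = offset_subsample(still_line, 2)

# Successive frames: the sum cancels the alias, the difference IS the alias.
interframe_sum = [a + b for a, b in zip(frame0, frame1)]
interframe_diff = [a - b for a, b in zip(frame0, frame1)]
# Frames two apart share the same subsampling phase.
two_frame_diff = [a - b for a, b in zip(frame2, frame0)]
```

Although the picture is still, the difference between successive subsampled frames equals the aliased component rather than zero, because that component enters the two frames with opposite signs. The difference between frames two apart is zero, which is why motion information is taken from frames in which the aliased components have the same phase.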
In this specification, the term "interframe/interline offset subsampling" shown in FIG. 6C is used to refer to subsampling carried out by utilizing clocks whose phase is reversed in each frame and line and corresponds to the sampling points in, for instance, the 4n-th field and (4n+2)-th field as shown in FIG. 1.
The term "interfield offset sampling" is used to refer to the sampling carried out by utilizing clocks whose phase is reversed for every field. For instance, this sampling corresponds to the sampling points in the 4n-th field and the (4n+2)-th field and to the sampling points in the (4n+1)-th field and the (4n+3)-th field shown in FIG. 1.