This invention relates to a picture signal encoding method and apparatus, applied to encoding for recording or transmission of moving picture signals, a signal recording medium having the encoded picture signals recorded thereon, and a picture signal decoding method and apparatus for decoding the encoded picture signals.
In a system for transmitting moving picture signals to remote places, as in a teleconference system or television telephone system, line correlation or inter-frame correlation of picture signals is employed for compression encoding of picture signals. When recording moving picture signals on a recording medium, such as a magneto-optical disc or a magnetic tape, and reproducing the recorded signals for display on a display unit, line correlation or inter-frame correlation of picture signals is utilized for high-efficiency compression encoding of picture signals for improving the recording efficiency on the recording medium.
That is, if it is desired to record digital video signals of an extremely large information volume on a recording medium of small size and small recording capacity for a prolonged recording time, means need to be provided for high-efficiency encoding and recording of the video signals and for high-efficiency decoding of the read-out signals. Thus a high-efficiency encoding system exploiting the correlation of video signals, exemplified by the Moving Picture Experts Group (MPEG) system, has been proposed for responding to these requirements.
FIG. 1 shows a prior-art example of the system configuration for encoding and decoding moving picture signals using the MPEG system.
In FIG. 1, a field picture supplied from a video tape recorder (VTR) 151 is converted by a scan converter 152 into a frame picture, which is encoded by an encoder 153. With the MPEG system, inter-frame differences of the video signals are found for lowering redundancy along the time axis. Subsequently, orthogonal transform techniques, such as discrete cosine transform (DCT), are employed for lowering redundancy along the spatial axis, thereby efficiently encoding the video signals. The encoded information may be recorded on a recording medium 154.
For reproducing a recording medium having high-efficiency encoded signals recorded thereon, the reproduced signals are processed by a decoding unit 155 by, for example, inverse orthogonal transform, for decoding a frame picture, which is then converted by a scan converter 156 for display on a monitor 157.
It is assumed that a video picture, obtained from a film picture by a tele-cine technique using so-called 2:2 pull-down, is supplied from the VTR 151.
The 2:2 pull-down is a tele-cine technique extensively used in converting 24-picture-frame-per-second film pictures into 25-frame-per-second or 50-field-per-second video signals in accordance with the so-called phase alternation by line (PAL) system. This technique consists in reading out each picture frame of a film as two video fields by interlaced scanning.
Since the two fields thus read out are read out from the same picture frame, the two fields converted into a frame structure may be handled as a non-interlaced frame. That is, this frame is equivalent to a frame obtained by reading out a picture frame of a film as one video frame by non-interlaced scanning.
In general, a non-interlaced frame is higher in line-to-line correlation in the vertical direction than an interlaced frame, and hence is higher in redundancy and in frame encoding efficiency.
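By way of illustration only, the relation between a picture frame and its two fields may be sketched as follows. The sketch assumes that a frame is represented as a list of scan lines, that the first field holds the even-numbered lines and the second field the odd-numbered lines; the function names are hypothetical and do not appear in the prior-art system of FIG. 1.

```python
def split_into_fields(frame):
    """Read one frame out as two fields by interlaced scanning.

    Assumption: the first (top) field carries lines 0, 2, 4, ...
    and the second (bottom) field carries lines 1, 3, 5, ...
    """
    top_field = frame[0::2]
    bottom_field = frame[1::2]
    return top_field, bottom_field


def weave_fields(top_field, bottom_field):
    """Convert two fields back into a frame structure by interleaving lines."""
    frame = []
    for top_line, bottom_line in zip(top_field, bottom_field):
        frame.append(top_line)
        frame.append(bottom_line)
    # A frame with an odd line count leaves one extra line in the top field.
    if len(top_field) > len(bottom_field):
        frame.append(top_field[-1])
    return frame


# A toy film frame of six scan lines.
film_frame = ["line0", "line1", "line2", "line3", "line4", "line5"]
top, bottom = split_into_fields(film_frame)
rebuilt = weave_fields(top, bottom)
```

Because both fields originate from the same picture frame, interleaving them reproduces the original frame line for line, which is why such a frame may be handled as a non-interlaced frame.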
If 2:2 pull-down is performed regularly for the totality of the picture frames of the film, the frames entering the encoder 153 are necessarily non-interlaced frames, so that the frame encoding efficiency is high and hence no problem is presented.
If, however, a picture string converted from non-interlaced pictures into interlaced pictures by the so-called tele-cine operation is subsequently processed, such as by editing, the encoding efficiency of the conventional encoding system tends to be lowered. That is, if video signals containing irregular 2:2 pull-down patterns due to subsequent processing, such as field editing, are supplied from the VTR 151, the frame entering the encoder 153 is not necessarily a non-interlaced frame, so that, with the conventional encoder, the frame encoding efficiency is undesirably lowered. This will be explained by referring to FIGS. 2 and 3.
In FIG. 2A, non-interlaced pictures NF, such as picture frames of a motion picture, are converted by 2:2 pull-down of the tele-cine processing into interlaced pictures of, for example, the so-called PAL system. The sequence of the non-interlaced film pictures shown in FIG. 2A is converted by the so-called tele-cine operation into a sequence of interlaced pictures of which each frame begins with a first field (top_field), as shown in FIG. 2B. That is, the sequence shown in FIG. 2B is a sequence of frames CF each made up of the first field Ft and a second field Fb in this order. Conversely, the sequence of the non-interlaced pictures NF of FIG. 2C is converted into a sequence of interlaced pictures of which each frame begins with a second field (bottom_field), as shown in FIG. 2D. That is, the sequence shown in FIG. 2D is a sequence of frames CF each made up of the second field Fb and the first field Ft in this order.
If these sequences are combined together at edit points T_E1, T_E2, as shown, there is produced an irregular sequence which disrupts the paired fields corresponding to the sequence of the non-interlaced pictures NF in the original picture, as shown in FIG. 2E. In the example shown in FIG. 2E, a lone field F_x is produced directly after the edit point T_E.
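The effect of such an edit on the field pairing may be sketched as follows. This is a simplified model and not a reproduction of FIG. 2: each field is represented by a hypothetical pair (film frame label, field parity), the labels A0, B0 and so on are invented for illustration, and the edit is assumed to drop the first field of the second sequence so that the field parity continues to alternate.

```python
from collections import Counter


def pull_down_2_2(film_frames, first="top"):
    """Read out each film frame as two fields, in the given field order."""
    order = ("top", "bottom") if first == "top" else ("bottom", "top")
    return [(frame, parity) for frame in film_frames for parity in order]


def pair_into_frames(fields):
    """Group consecutive fields two at a time, as a conventional encoder would.

    A trailing unpaired field, if any, is ignored in this sketch.
    """
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


def is_non_interlaced(frame):
    """True if both fields of the frame come from the same film frame."""
    first_field, second_field = frame
    return first_field[0] == second_field[0]


# Sequence whose frames begin with the first (top) field.
clip_a = pull_down_2_2(["A0", "A1"], first="top")
# Sequence whose frames begin with the second (bottom) field.
clip_b = pull_down_2_2(["B0", "B1", "B2"], first="bottom")

# Assumed field edit: clip_b is joined after clip_a with its first field cut,
# so field B0/top loses its partner B0/bottom.
edited = clip_a + clip_b[1:]

frames = pair_into_frames(edited)
flags = [is_non_interlaced(f) for f in frames]

# A lone field is one whose film frame contributes only a single field.
counts = Counter(label for label, _ in edited)
lone_fields = [field for field in edited if counts[field[0]] == 1]
```

In this model the two frames ahead of the edit point pair correctly, whereas every frame behind it combines fields from two different film frames, and the field B0/top directly after the edit point is left without a partner.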
The sequence shown in FIG. 2E adversely affects the picture encoding efficiency, as now explained by referring to FIG. 3.
The irregular sequence shown in FIG. 2E is encoded as frames each formed by a combination CP of two fields, indicated by rectangular frames in FIG. 3A. If the frame being encoded corresponds to the combination CP_1, which constitutes a non-interlaced frame of the original picture, the encoding efficiency is high, as shown in FIG. 3B. Conversely, with the combination CP_2, which is not a correct combination for the non-interlaced picture NF, the picture contains strong high-frequency components in its edge portions, as shown in FIG. 3C, despite the fact that the original picture is a non-interlaced picture, thus lowering the encoding efficiency. FIGS. 3B and 3C show the case of a picture representing a true circle moving in the transverse direction.
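The difference between the two combinations may be illustrated numerically by the following sketch, given under stated assumptions. A binary picture of a disc stands in for the true circle, the disc is shifted horizontally by four pixels between two film frames, and the sum of absolute differences between vertically adjacent lines is used as a crude stand-in for the vertical high-frequency content. The picture size, the shift and the measure are all assumptions chosen for illustration and are not taken from FIG. 3.

```python
def disc_frame(center_x, width=32, height=16, center_y=8, radius=6):
    """Binary picture of a filled disc: 1 inside the disc, 0 outside."""
    return [
        [1 if (x - center_x) ** 2 + (y - center_y) ** 2 <= radius ** 2 else 0
         for x in range(width)]
        for y in range(height)
    ]


def weave(top_source, bottom_source):
    """Build a frame taking even lines from one picture and odd lines from another."""
    return [
        top_source[y] if y % 2 == 0 else bottom_source[y]
        for y in range(len(top_source))
    ]


def vertical_activity(frame):
    """Sum of absolute differences between vertically adjacent lines."""
    return sum(
        abs(a - b)
        for upper, lower in zip(frame, frame[1:])
        for a, b in zip(upper, lower)
    )


film_frame_1 = disc_frame(center_x=12)
film_frame_2 = disc_frame(center_x=16)  # disc moved 4 pixels to the right

# Correct combination: both fields from the same film frame.
matched = weave(film_frame_1, film_frame_1)
# Incorrect combination: fields taken from two different film frames.
mismatched = weave(film_frame_1, film_frame_2)

matched_activity = vertical_activity(matched)
mismatched_activity = vertical_activity(mismatched)
```

With the incorrect combination, alternate lines of the disc are displaced relative to one another, so the line-to-line differences along its left and right edges are markedly larger than with the correct combination. This corresponds to the comb-like edge portions that an orthogonal transform such as DCT must represent with high-frequency coefficients.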
As a technique for efficiently encoding a picture produced by tele-cine processing as described above, there has hitherto been proposed a method consisting of removing repeated fields from 2:3 pulled-down pictures and subsequently constructing the frames so that the input frames will become non-interlaced frames. However, the lone field not constituting a non-interlaced frame, produced by the irregular 2:2 pull-down as described above, differs from the repeated field produced by 2:3 pull-down, and hence this technique cannot be employed for solving the problem.