This application claims benefit of Japanese Patent Application No. 2000-340208 filed on Nov. 8, 2000, the contents of which are incorporated by reference.
The present invention relates to a moving picture editing method, a moving picture editing system and a storing medium with a moving picture editing program stored therein and, more particularly, to a moving picture editing method, a moving picture editing system and a storing medium with a moving picture editing program stored therein, in which a compression coded moving picture is edited by utilizing inter-frame prediction based on motion compensation.
As moving picture compression methods for compression coding a moving picture, methods utilizing inter-frame prediction based on motion compensation have been extensively utilized. Among the moving picture compression methods of this type are “H.261”, “H.262” and “H.263” recommended by the International Telecommunication Union-Telecommunication Standardization Sector and “MPEG1”, “MPEG2” and “MPEG4” recommended by the Moving Picture Experts Group.
In the moving picture compression methods of this type, intra-frame coding and inter-frame predictive coding are used as means for coding individual moving picture frames. In the intra-frame coding, only pixel data constituting the frames as the subject of coding (hereinafter referred to as subject frames) are used for compression coding. In this case, it is possible to decode the original frames by using only the coded data of the intra-frame coded frames (hereinafter referred to as I (intra) frames).
In the inter-frame predictive coding, on the other hand, frames preceding and/or succeeding the subject frames are used as reference frames, and compression coding is performed by utilizing a predicted picture obtained by motion compensation. The inter-frame predictive coding using frames preceding the subject frames as reference frames is referred to as preceding predictive coding. The inter-frame predictive coding utilizing frames succeeding the subject frames is referred to as succeeding predictive coding. The inter-frame predictive coding using frames both preceding and succeeding the subject frames is referred to as bidirectionally predictive coding. Coded frames obtained by the preceding predictive coding are referred to as P (predictive) frames. Coded frames obtained by the bidirectionally predictive coding are referred to as B (bidirectionally predictive) frames.
As shown above, I frames are obtained by compression coding using only pixel data constituting the subject frames without utilizing any reference frame. P frames are obtained by compression coding using I or P frames preceding the subject frames as reference frames. B frames are obtained by compression coding using I or P frames both preceding and succeeding the subject frames as the reference frames. B frames are not used as reference frames. The compression coded moving picture is usually constituted by continuous frames such as I, B, B, P, B, B, P, B, B, . . . . The inter-frame predictive coding can increase the data compression efficiency compared to the intra-frame coding. In order to decode subject frames from inter-frame predictive coded frames, however, frames obtained by decoding the reference frames used in motion compensation are necessary. In the meantime, a moving picture compression method permitting motion compensation with an accuracy finer than one pixel, such as a half-pixel unit, has been proposed. In this moving picture compression method, pixel data at positions having non-integer coordinates are generated by averaging the pixel data present at the 2 or 4 integer coordinate positions in the neighborhood of those positions. For details of the moving picture compression methods described above, reference is to be had to, for instance, “Generic Coding of Moving Picture and Associated Audio”, ISO/IEC JTC1/SC29/WG11N0502, 1993.7.
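The half-pixel averaging described above may be illustrated by the following minimal sketch. The function name, the use of coordinates expressed in half-pixel units, and the round-to-nearest rule are assumptions made for illustration only and are not taken from any particular standard text.

```python
def half_pel_sample(frame, x2, y2):
    """Return the sample at position (x2 / 2, y2 / 2).

    frame is a list of rows of integer pixel values. x2 and y2 are
    coordinates in half-pixel units (an assumption of this sketch),
    so odd values denote non-integer positions.
    """
    x0, y0 = x2 // 2, y2 // 2      # integer pixel at or left of / above the position
    x1 = x0 + (x2 % 2)             # equals x0 when x is an integer position
    y1 = y0 + (y2 % 2)             # equals y0 when y is an integer position
    # The set collapses duplicates, leaving 1, 2 or 4 distinct neighbours.
    neighbours = {(x0, y0), (x1, y0), (x0, y1), (x1, y1)}
    total = sum(frame[y][x] for x, y in neighbours)
    n = len(neighbours)
    return (total + n // 2) // n   # average, rounded to nearest (assumed rule)


frame = [[10, 20],
         [30, 40]]
horizontal = half_pel_sample(frame, 1, 0)   # average of 2 pixels
diagonal = half_pel_sample(frame, 1, 1)     # average of 4 pixels
```

A position that is non-integer in one direction uses 2 neighbouring pixels, and a position that is non-integer in both directions uses 4.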
In editing such as deletion or extraction, in units of given frames, of a moving picture which has been compression coded by the moving picture compression method as described above, if the deleting or extracting process is performed on the I, P or B frames directly without decoding and re-encoding, a moving picture which can be correctly decoded may fail to be generated. For example, assume a first and a second moving picture each constituted by continuous I, B, B, P, B, B, P, B, B, . . . frames. Connecting the second P frame in the first moving picture to the first P frame in the second moving picture results in a lack of the I frame of the second moving picture in the result. This is so because, as described before, the first P frame in the second moving picture is obtained by compression coding using the preceding I frame as the reference frame. Consequently, it becomes impossible to decode the first P frame in the second moving picture. This may be avoided by decoding all the frames of the first and second moving pictures before the connection, and re-encoding the result obtained by the connection. Doing so, however, requires enormous computational effort. Besides, the picture quality of the moving picture obtained as a result of the connection and re-encoding is greatly deteriorated.
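The decoding failure described above may be checked with a small dependency model such as the following sketch. The `Frame` record, the helper names and the frame naming scheme are hypothetical; the model only records which frames each frame refers to and does not perform any actual decoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    name: str    # e.g. "b:P3" = stream "b", P frame at display index 3
    kind: str    # "I", "P" or "B"
    refs: tuple  # names of the reference frames used for prediction


def make_stream(tag, kinds):
    """Build a stream in display order and wire each frame to its references."""
    names = [f"{tag}:{k}{i}" for i, k in enumerate(kinds)]
    anchors = [i for i, k in enumerate(kinds) if k in "IP"]
    frames = []
    for i, k in enumerate(kinds):
        prev = [a for a in anchors if a < i]
        nxt = [a for a in anchors if a > i]
        if k == "I":
            refs = ()                                            # no reference frame
        elif k == "P":
            refs = tuple(names[a] for a in prev[-1:])            # preceding I or P
        else:
            refs = tuple(names[a] for a in prev[-1:] + nxt[:1])  # both sides
        frames.append(Frame(names[i], k, refs))
    return frames


def undecodable(frames):
    """Return names of frames that cannot be decoded from `frames` alone."""
    by_name = {f.name: f for f in frames}
    memo = {}

    def ok(name):
        if name not in by_name:          # the reference was cut away
            return False
        if name not in memo:
            memo[name] = all(ok(r) for r in by_name[name].refs)
        return memo[name]

    return [f.name for f in frames if not ok(f.name)]


first = make_stream("a", "IBBPBBP")
second = make_stream("b", "IBBPBBP")
# Connect the second P frame of the first stream to the first P frame of the second.
spliced = first + second[3:]
broken = undecodable(spliced)
```

In this model every frame taken from the second stream is reported as undecodable, because the first P frame has lost its I frame and every later frame depends on it directly or indirectly.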
Recently, moving picture editing methods, which permit editing compression coded moving pictures with a simple construction, with less computational effort and without any lack of frames necessary for the decoding, have been proposed, as disclosed in, for instance, Japanese Patent Laid-Open No. 8-205174 and Japanese Patent Laid-Open No. 7-154802.
First, the operation of a prior art moving picture editing system disclosed in Japanese Patent Laid-Open No. 8-205174 will be described with reference to FIGS. 12(a) to 12(f). This technique is hereinafter referred to as the first prior art example.
In this case, it is assumed that a moving picture having been compression coded in the order as shown in FIG. 12(b) is accumulated in a first accumulating part of the system and is displayed on a display part of the system in the order as shown in FIG. 12(a), and that the dot-shaded frames (B5,4 frame up to B12,11 frame) among the plurality of frames shown in FIG. 12(a) are necessary for editing. Of the suffixes m and n attached to the individual frames, the suffix m represents the order of accumulation in the first accumulating part, and the suffix n represents the order of display on the display part.
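The relation between the order of accumulation m and the order of display n may be sketched as follows. The function name is hypothetical, and the frame pattern preceding the B5,4 frame (two B frames followed by an I frame) is an assumption chosen so that the generated labels agree with those quoted in the text; FIG. 12 itself is not reproduced here.

```python
def coded_order(display_kinds):
    """Return labels "Km,n" in accumulation (coded) order.

    display_kinds is a string of "I", "P", "B" in display order.
    Each I or P frame is emitted before the B frames that precede it
    in display order, since those B frames use it as a reference.
    """
    order = []        # display numbers n (1-based), listed in coded order
    pending_b = []    # B frames waiting for their succeeding reference
    for n, kind in enumerate(display_kinds, start=1):
        if kind == "B":
            pending_b.append(n)
        else:
            order.append(n)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)   # trailing B frames with no later I or P frame
    return [f"{display_kinds[n - 1]}{m},{n}" for m, n in enumerate(order, start=1)]


labels = coded_order("BBIBBPBBPBBP")
```

Under the assumed pattern, the fourth to sixth accumulated frames are P4,6, B5,4 and B6,5, which is why the P frame displayed sixth is read out before the two B frames displayed fourth and fifth.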
First, the individual frames are read out in the order as shown in FIG. 12(b) from the first accumulating part, and are then decoded and displayed in the order as shown in FIG. 12(a) on the display part. When the start of editing is commanded by the system operator, tentative holding of the frames in a tentative holding part of the system is started from the prevailing frame displayed on the display part (hereinafter referred to as editing start frame). In the example shown in FIG. 12(a), the tentative holding is started from B5,4 frame. In the case that a B frame is the editing start frame, if a P frame to be displayed on the display part later than that B frame is detected as the prevailing displayed frame, the plurality of frames having been tentatively held up to this time (i.e., from B5,4 frame up to P4,6 frame shown in FIG. 12(c)) and the frame number of the detected P frame are stored and held as editing start data in the memory part. Also, succeeding frames (i.e., B8,7 frame up to P7,9 frame shown in FIG. 12(a)) are continually tentatively held in the order of display on the display part until an editing end command is provided by the operator. When the editing end command is provided by the operator, if a B frame is detected at this time as the prevailing frame displayed on the display part, the plurality of frames having been tentatively held up to this time (i.e., from B11,10 frame up to B12,11 frame shown in FIG. 12(a)) and the frame number of the editing end frame are stored and held as editing end data in the memory part.
When the plurality of frames and the editing start and end frames, necessary for the editing, have been selected in the above process, the coding is started. In the coding, as shown in FIG. 12(d), B5,4 frame, B6,5 frame and P4,6 frame stored and held as the editing start data in the memory part are intra-frame coded to I5,4 frame, I6,5 frame and I4,6 frame, respectively, and B11,10 and B12,11 frames stored and held as the editing end data in the memory part are intra-frame coded to I11,10 frame and I12,11 frame, respectively. Then, a plurality of frames (i.e., from P7,9 frame up to B9,8 frame shown in FIG. 12(b)) between the editing start and end frames are read out from the first accumulating part on the basis of the frame numbers of the editing start and end frames. These read-out frames and the plurality of intra-frame coded frames (i.e., I5,4 frame, I6,5 frame, I4,6 frame, I11,10 frame and I12,11 frame shown in FIG. 12(e)) are combined in such a way that they can be normally reproduced, thus obtaining the necessary moving picture. The combined data is displayed on the display part as shown in FIG. 12(f), and it is also accumulated in the order as shown in FIG. 12(e) in the second accumulating part. The frames having been accumulated in the second accumulating part, which constitute the moving picture, are not correlated to frames outside that moving picture, and the moving picture can thus be decoded by itself. As shown, in this arrangement only the frames which become incapable, as a result of the editing, of being normally reproduced are re-encoded. It is thus possible to reduce the computational effort of the coding.
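The selection of frames to be intra-frame coded in the first prior art example may be summarized by the following sketch. The function name and the assumed display pattern are hypothetical. The text describes only the case in which both the editing start frame and the editing end frame are B frames; the handling of other cases in this sketch is an assumption.

```python
def plan_edit(display_kinds, start, end):
    """Split display numbers start..end (1-based, inclusive) into
    frames to intra-frame code and frames to copy unchanged."""
    selected = list(range(start, end + 1))
    # Editing start data: frames from the start frame up to and including
    # the first non-B frame (skipped when the start frame is an I frame).
    head = []
    if display_kinds[start - 1] != "I":
        for n in selected:
            head.append(n)
            if display_kinds[n - 1] != "B":
                break
    # Editing end data: trailing B frames, whose succeeding reference
    # lies outside the selected range.
    tail = []
    for n in reversed(selected):
        if display_kinds[n - 1] != "B" or n in head:
            break
        tail.insert(0, n)
    intra = head + tail
    copied = [n for n in selected if n not in intra]
    return intra, copied


# Assumed display pattern in which display numbers 4 to 11 are
# B, B, P, B, B, P, B, B, matching the frames quoted in the text.
intra, copied = plan_edit("BBIBBPBBPBBP", 4, 11)
```

For this pattern the frames displayed 4th, 5th, 6th, 10th and 11th are intra-frame coded and the frames displayed 7th to 9th are copied, which corresponds to the five I frames and the three read-out frames named in the text.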
Now, the operation of a prior art moving picture editing system disclosed in the Japanese Patent Laid-Open No. 7-154802, will be described with reference to FIG. 13. This technique will be hereinafter referred to as second prior art example.
A case will now be considered with reference to FIGS. 13(a) to 13(c), in which a compression coded moving picture as shown in FIG. 13(c) (hereinafter referred to as third moving picture) is produced by connecting to each other, at an editing point shown by dashed line X, a compression coded moving picture as shown in FIG. 13(a) (hereinafter referred to as first moving picture) and a compression coded moving picture as shown in FIG. 13(b) (hereinafter referred to as second moving picture). In FIG. 13(c), the prime symbol “′” means that a frame once decoded has been coded again.
It is now assumed that frames from the I frame of frame number 2 up to the B frame of frame number 4, which are earlier than the editing point, are inputted as frames constituting the first and second moving pictures. In this case, a CPU (central processing unit) in the moving picture editing system outputs the inputted frames constituting the first moving picture directly as the third moving picture. The inputted frames constituting the second moving picture are successively decoded in the decoding part.
When frames succeeding the editing point are inputted, the CPU operates as follows. When the P frame right after the editing point (i.e., P frame of frame number 8 in FIG. 13(b)) is inputted as a decoded frame constituting the second moving picture, the CPU controls the coding part to code the P frame to an I frame and, as shown in FIG. 13(c), outputs the coded frame (i.e., P frame of frame number 8 in FIG. 13(c)) as the third moving picture. When B frames (i.e., B frames of frame numbers 6 and 7 in FIG. 13(b)) are inputted as decoded frames constituting the second moving picture, the I frame of frame number 5, necessary for the preceding predictive coding, is missing. Accordingly, the CPU controls the coding part to re-encode these B frames to P frames (i.e., P′ frames of frame numbers 6 and 7 in FIG. 13(c)) by utilizing the succeeding predictive coding.
As for the P and B frames of frame number 11 and following frame numbers constituting the second moving picture as shown in FIG. 13(b), since the frames utilized for the inter-frame predictive coding have been re-encoded, the CPU resets the predicted picture to a correct one, and controls the coding part to re-encode P frames to P frames and B frames to B frames (i.e., P′ and B′ frames of frame number 11 and following frame numbers in FIG. 13(c)). In these re-encoding processes, the motion compensation mode data, motion vectors and DCT (discrete cosine transform) switching data obtained by decoding the frames constituting the second moving picture as shown in FIG. 13(b) are reused. Thus, the prior art motion detection circuit and DCT mode judging circuit, which require enormous computational effort, are unnecessary, and the motion compensation circuit can be replaced with one having a simpler construction.
In the moving picture editing systems of the first and second prior art examples as described above, when P and B frames are re-encoded to I frames, a re-encoding error is generated in the re-encoded I frames. Thus, when an I frame with an error generated therein by editing is used as a reference frame for P and B frames, the motion compensation picture after the editing differs from that before the editing. This motion compensation picture change leads to generation of errors in the P and B frames. Particularly, when a moving picture constituted by continuous P frames is edited by deleting the frames of frame numbers 1 to 3 as shown in FIG. 14(a), the re-encoding error generated when re-encoding the I frame shown in FIG. 14(b) is propagated up to the last frame to increase the picture quality deterioration.
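The propagation of the re-encoding error along a chain of P frames may be illustrated by the following toy model. It treats each frame as a single sample value, codes each P frame as an exact difference from the preceding frame, and models the lossy intra-frame coding as rounding to a multiple of a step size. All of these simplifications, and the function names, are assumptions made for illustration; an actual coder also quantizes the differences.

```python
def quantize(value, step=8):
    """Toy lossy intra-frame coding: round to the nearest multiple of step."""
    return step * ((value + step // 2) // step)


def decode_after_cut(frames, cut, step=8):
    """Delete frames[0:cut], re-encode frames[cut] as an I frame, and decode
    the remaining P frames using their original prediction differences."""
    # residuals[i] is the difference that predicts frames[i + 1] from frames[i]
    residuals = [b - a for a, b in zip(frames, frames[1:])]
    decoded = [quantize(frames[cut], step)]      # re-encoded I frame, with error
    for r in residuals[cut:]:
        decoded.append(decoded[-1] + r)          # each P frame inherits the error
    return decoded


original = [100, 101, 102, 103, 105, 106, 110]   # frame numbers 1 to 7
decoded = decode_after_cut(original, cut=3)      # frame numbers 1 to 3 deleted
errors = [d - o for d, o in zip(decoded, original[3:])]
```

In this model the error introduced in the re-encoded I frame appears unchanged in every following P frame, since no later I frame resets the prediction.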