Image compression methods typified by the MPEG (Moving Picture Coding Experts Group/Moving Picture Experts Group) method achieve a high compression efficiency by compression encoding an image signal using interframe prediction. However, where images are to be edited, compressed image materials formed using interframe prediction cannot be spliced together while they remain in the form of compressed image signals, because the frames of each material depend on one another through prediction. Therefore, a system designed in advance with editing of image materials in mind usually performs encoding using only compression within a frame (intraframe compression), without using interframe prediction.
However, where an image signal of high definition having a large information amount is handled, such as a high definition (HD) signal, only a low compression efficiency is obtained if only intraframe compression is used for encoding. Therefore, in order to transmit or store a large amount of data, an expensive system is required, in that a high transfer speed, a large storage capacity or a high processing speed is required. In other words, in order to allow an image signal of high definition having a large amount of information to be handled by a less expensive system, it is necessary to use interframe prediction to assure a high compression efficiency.
In the MPEG system, a compression coding system which uses bidirectional interframe prediction and involves I pictures, P pictures and B pictures is called compression of the Long GOP (Group of Pictures) system.
An I picture is an intraframe coded picture coded independently of any other picture, and an image can be decoded from the information of the I picture alone. A P picture is an interframe forward predictive coded picture represented by a difference from a temporally preceding frame (in the forward direction). A B picture is a bidirectional predictive coded picture coded by motion compensated interframe prediction making use of temporally preceding pictures (forward direction), succeeding pictures (reverse direction), or both preceding and succeeding pictures (bidirectional).
Since the P picture and the B picture have a smaller data amount than the I picture, if the GOP is made longer (that is, if the number of pictures which form a Long GOP is increased), then the compression ratio of the image can be raised. Therefore, coding which uses the P picture and the B picture is suitable for digital broadcasting applications and DVD (Digital Versatile Disc) video applications. However, if the GOP is excessively long, then editing control with frame accuracy becomes difficult, which causes operational problems in editing for business applications.
A process of splicing two pieces of image data compressed by the Long GOP method to each other at predetermined editing points is described with reference to FIG. 1.
First, for each of editing object compressed image data 1 and editing object compressed image data 2, partial decoding of a portion in the proximity of an editing point is performed. Consequently, partial non-compressed image signal 1 and image signal 2 are obtained. Then, the non-compressed image signal 1 and image signal 2 are spliced to each other at the editing points, an effect is applied to the portion in the proximity of the editing point as occasion demands, and then re-encoding is performed. Then, the re-encoded compressed image data is spliced with the compressed image data which have not undergone the decoding and re-encoding processes (compressed image data other than the portion for which the partial decoding is performed).
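As a rough illustration of the flow described above, the following Python sketch re-encodes only a small window around the editing points and passes the remaining compressed data through untouched. The function name, the `margin` parameter and the `decode`/`encode` callbacks are hypothetical and are not taken from FIG. 1. Frames are treated as independent opaque objects, whereas an actual Long GOP decoder would also need the reference pictures of each frame.

```python
def splice_with_partial_reencode(clip1, clip2, edit1, edit2, margin, decode, encode):
    """Toy model of partial decode, splice, re-encode and final splice.

    clip1, clip2: lists of compressed frames in display order.
    edit1: index of the last frame of clip1 to keep.
    edit2: index of the first frame of clip2 to keep.
    margin: number of frames on each side of the cut to re-encode (assumption).
    """
    start = max(0, edit1 + 1 - margin)      # first frame of clip1 to decode
    end = min(len(clip2), edit2 + margin)   # one past the last frame of clip2 to decode

    # Partial decoding in the proximity of the editing points.
    raw1 = [decode(frame) for frame in clip1[start:edit1 + 1]]
    raw2 = [decode(frame) for frame in clip2[edit2:end]]

    # Splice the non-compressed signals and re-encode the spliced portion.
    reencoded = [encode(frame) for frame in raw1 + raw2]

    # Splice the re-encoded data with the untouched compressed data.
    return clip1[:start] + reencoded + clip2[end:]
```

Only `2 * margin` frames at most pass through `decode` and `encode` in this sketch; all other frames are copied in compressed form, which is the source of the quality and processing time advantage.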
The method described above with reference to FIG. 1 is advantageous in that deterioration of the picture quality by re-encoding can be suppressed locally and the editing processing time can be reduced significantly when compared with those of an alternative method wherein all image data of compressed editing materials are decoded and then the image signals are connected to each other at the editing points, whereafter all of the image signals are re-encoded to obtain edited compressed video data.
However, if such a method as described above with reference to FIG. 1 is used to perform editing and re-encoding, then a problem arises in that a picture cannot be referred to across a joint between a portion for which re-encoding is performed and another portion for which no re-encoding is performed.
The following method is known as a countermeasure for the problem described. In particular, where compression is performed using a method (Long GOP) which involves predictive encoding between frames, in order to implement editing comparatively simply, the interframe prediction is limited so as to adopt a Closed GOP structure such that a picture is referred to only within a GOP but is not referred to across GOPs.
A case wherein a limitation is applied to interframe prediction is described with reference to FIG. 2. In order to indicate the relationship between interframe prediction and editing, FIG. 2 lists, in display order, the pictures of the data of the compressed material image 1 and the data of the compressed material image 2 which are an object of editing, the partially re-encoded compressed image data in the proximity of the editing points after the editing, and the compressed image data of the portions for which re-encoding is not performed. An arrow mark in FIG. 2 indicates a referencing direction of a picture (this similarly applies to the other figures). In FIG. 2, the 15 pictures BBIBBPBBPBBPBBP in display order form one GOP, and referencing to a picture is performed only within the GOP. This method inhibits prediction across GOPs and thereby eliminates prediction dependencies of compressed data between GOPs, which allows compressed data to be re-spliced in units of a GOP (that is, the range within which re-encoding is to be performed is determined in units of a GOP).
In particular, the range for re-encoding is determined in a unit of one GOP including an editing point for each of the data of the compressed material image 1 and the data of the compressed material image 2 which are an object of editing. The data of the compressed material image 1 and the data of the compressed material image 2 within the re-encoding ranges determined in a unit of one GOP are decoded to produce signals of the non-compressed material image 1 and the non-compressed material image 2. Then, the signal of the non-compressed material image 1 and the signal of the non-compressed material image 2 are spliced to each other at the cut editing point, and the material image 1 and the material image 2 spliced together in this manner are partly re-encoded to produce compressed image data. Then, the compressed image data are spliced with the compressed image data of the portions which have not been re-encoded, thereby to produce compressed edited image data.
In practice, encoded data are arrayed in a coding order as illustrated in FIG. 3, and splicing of compressed image data is performed in the coding order. Referring to FIG. 3, the compressed image data produced by partially re-encoding the material image 1 and the material image 2 spliced together and the compressed image data which have not been re-encoded are spliced at two places. First, a B13 picture, which is the last picture in the coding order in the portion of the data of the compressed material image 1 which has not been re-encoded and is the fourteenth picture in the display order, is spliced to an I2 picture, which is the first picture in the coding order in the compressed image data produced by the re-encoding and is the third picture in the display order. Further, a B12 picture, which is the last picture in the coding order in the compressed image data produced by the re-encoding and is the thirteenth picture in the display order, and the I2 picture, which is the first picture in the coding order in the portion of the data of the compressed material image 2 which has not been re-encoded and is the third picture in the display order, are spliced to each other. In other words, the compressed image data produced by re-encoding of the material image 1 and the material image 2 spliced together and the compressed image data in the portion which has not been re-encoded are connected at GOP changeover portions to produce compressed edited image data.
On the other hand, a GOP structure which does not have the Closed GOP structure, that is, a Long GOP structure where a picture is referred to across GOPs, is called Open GOP.
A technique is also available for splicing two bit streams of MPEG encoded pictures having the Open GOP structure while preventing deterioration of the picture quality which could otherwise occur at the splicing portions. When two bit streams of the Open GOP structure are edited, or more particularly when a bit stream Y is inserted into another bit stream X, a B picture preceding the I picture of the first GOP of the bit stream Y (a B picture which is displayed before the I picture) is deleted, and the temporal references of the remaining pictures which form the GOP are changed. As a result, the B picture prior to the I picture, which would be predicted using a picture of the last GOP of the bit stream X, is not displayed, and such deterioration of the picture quality is prevented (refer to, for example, Patent Document 1).
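The determination of the re-encoding range in a unit of one GOP can be sketched as follows. This is a minimal illustration that assumes a constant GOP length and frame numbers counted from 0 in display order; the function name and the default of 15 pictures (taken from the example of FIG. 2) are assumptions.

```python
def closed_gop_reencode_range(edit_frame, gop_length=15):
    """Return (first_frame, last_frame) of the GOP that contains edit_frame.

    With a Closed GOP structure, no picture outside this range refers to a
    picture inside it, so the whole GOP can be decoded and re-encoded alone.
    """
    gop_index = edit_frame // gop_length
    first_frame = gop_index * gop_length
    return first_frame, first_frame + gop_length - 1
```

For example, an editing point at frame 20 falls in the second GOP, so frames 15 to 29 would be decoded and re-encoded. The same calculation is applied separately to each of the two materials.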
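The relationship between the display order and the coding order can be illustrated with a short Python sketch. It assumes the usual MPEG reordering rule in which each I or P picture is coded before the B pictures that precede it in display order; the function name and the labelling scheme (picture type followed by the display index) are chosen here for illustration.

```python
def display_to_coding_order(gop_types):
    """Reorder one GOP from display order to coding order.

    gop_types: picture types in display order, e.g. "BBIBBPBBPBBPBBP".
    Returns labels such as "I2" (type plus zero-based display index).
    """
    coded = []
    pending_b = []  # B pictures waiting for the next I or P picture
    for index, ptype in enumerate(gop_types):
        label = f"{ptype}{index}"
        if ptype == "B":
            pending_b.append(label)
        else:
            # The I or P picture is coded first, then the B pictures before it.
            coded.append(label)
            coded.extend(pending_b)
            pending_b = []
    coded.extend(pending_b)  # trailing B pictures, if any
    return coded
```

For the GOP BBIBBPBBPBBPBBP this yields I2 B0 B1 P5 B3 B4 P8 B6 B7 P11 B9 B10 P14 B12 B13, so the first picture in the coding order is I2 (third in the display order) and the last is B13 (fourteenth in the display order).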
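The deletion of the leading B pictures and the renumbering of the temporal references might look as follows. This is only a sketch of the general idea and not the procedure of Patent Document 1 itself; the tuple representation of a picture and the function name are assumptions.

```python
def drop_leading_b_pictures(gop):
    """Remove B pictures displayed before the first I picture and renumber.

    gop: list of (picture_type, temporal_reference) tuples in display order.
    Returns a new list whose temporal references start again from 0.
    """
    # Index of the first picture that is not a B picture (the I picture).
    first_anchor = next(i for i, (ptype, _) in enumerate(gop) if ptype != "B")
    kept = gop[first_anchor:]
    return [(ptype, new_ref) for new_ref, (ptype, _) in enumerate(kept)]
```

The deleted pictures are simply absent from the output, which is why this approach avoids the broken prediction at the price of missing images.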
[Patent Document 1]
Japanese Patent Laid-Open No. Hei 10-66085
Disclosure of the Invention
Problems to be Solved by the Invention
However, the editing method which utilizes the Closed GOP structure, in which prediction across GOPs is inhibited as described hereinabove with reference to FIGS. 2 and 3, applies a limitation to the prediction direction at a starting portion and an ending portion of a GOP.
On the other hand, the technique disclosed in Patent Document 1 has a problem in that, since a B picture at a splicing portion is not displayed, the image is missing by that amount.
Further, where editing is to be performed on a compressed image signal which is compressed in accordance with the Open GOP system, by which a high compression efficiency is obtained, and is produced using bidirectional interframe prediction, the buffer must be prevented from failing by observing the restrictions of the VBV buffer. However, the picture quality must not be deteriorated as a result of observing the restrictions of the VBV buffer.
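The restriction of the VBV buffer can be illustrated with a simplified simulation. The sketch below assumes a constant rate model in which a fixed number of bits enters the buffer per picture period and each picture is removed instantaneously at its decoding time; the function name and parameters are hypothetical, and the filling is simply clamped at the buffer size.

```python
def vbv_locus(picture_bits, bits_per_picture_period, buffer_size, initial_occupancy):
    """Trace the buffer occupancy just after each picture is removed.

    Returns (locus, underflow), where underflow is True if any picture
    needed more bits than were present in the buffer.
    """
    occupancy = initial_occupancy
    locus = []
    underflow = False
    for bits in picture_bits:
        occupancy -= bits  # the decoder removes the whole picture at once
        if occupancy < 0:
            underflow = True
        locus.append(occupancy)
        # Bits arriving until the next picture, limited by the buffer size.
        occupancy = min(buffer_size, occupancy + bits_per_picture_period)
    return locus, underflow
```

In this model a re-encoded portion observes the restriction only if the locus never goes negative and the occupancy at its end matches what the following, non-re-encoded data expects. A low occupancy at the start of the portion therefore leaves few bits for the pictures near the editing point.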
The present invention has been made in view of such circumstances as described above and makes it possible to prevent deterioration of the picture quality by executing editing of compressed image signals compressed in accordance with the Open GOP system of a Long GOP, by which a high compression efficiency is obtained, and formed using bidirectional interframe prediction by allocating an optimum generation code amount while observing restrictions to a VBV buffer.
[Means for Solving the Problems]
According to a first aspect of the present invention, there is provided an information processing apparatus for executing a process of splicing first compressed image data with second compressed image data, including decoding means for performing a decoding process for a first decoding interval including a first editing point set to the first compressed image data to produce a first non-compressed image signal and performing a decoding process for a second decoding interval including a second editing point set to the second compressed image data to produce a second non-compressed image signal, re-encoding means for performing a re-encoding process for a predetermined re-encoding interval of a third non-compressed image signal wherein the first and second non-compressed image signals are spliced at the first and second editing points to produce third compressed image data, control means for setting the re-encoding interval extended from a basic encoding interval based on a generation code amount upon the re-encoding process by the re-encoding means to control the re-encoding process by the re-encoding means, and editing means for alternatively outputting compressed image data in an interval for which the re-encoding process is not performed from within the first and second compressed image data and the third compressed image data produced by the re-encoding by the re-encoding means to produce edited compressed image data.
The control means may control allocation of a generation code amount in the re-encoding process in the re-encoding interval by the re-encoding means, and set, where a generation code amount to be allocated to the basic encoding interval from within the re-encoding interval is smaller than a predetermined amount, the re-encoding interval extended from the basic encoding interval to control the re-encoding process by the re-encoding means.
The control means may control the allocation of the generation code amount in the re-encoding process in the re-encoding interval by the re-encoding means so that the generation code amount to be allocated to the basic encoding interval becomes greater than the predetermined amount.
The control means may control the allocation of the generation code amount in the re-encoding interval based on a difference value in occupancy between a start point and an end point of the re-encoding interval so that the code amount to be allocated to the reference encoding range may be increased.
The control means may control the allocation of the generation code amount in the re-encoding interval based on a value which increases in proportion to the number of pictures in the reference encoding interval but decreases substantially in inverse proportion to the number of pictures in the re-encoding interval so that the code amount to be allocated to the reference encoding range may be increased.
The control means may control the allocation of the generation code amount in the re-encoding interval so that the generation code amount to be allocated to any interval other than the reference encoding interval from within the re-encoding interval may be decreased.
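One possible reading of the interval extension and of the occupancy-based allocation described above is sketched below. It relies on the constant rate buffer relation that the bits which can be spent over an interval equal the bits arriving during the interval plus the occupancy at its start minus the occupancy required at its end. All names, the extension in steps of one GOP and the assumption that the two occupancy values stay the same while the interval grows are simplifications introduced for this illustration and are not taken from the specification.

```python
def reencode_budget(n_pictures, bits_per_picture, occ_start, occ_end):
    """Bits available over n_pictures under a constant rate buffer model."""
    return n_pictures * bits_per_picture + (occ_start - occ_end)


def extended_interval_length(n_basic, bits_per_picture, occ_start, occ_end,
                             min_bits_per_picture, gop_length=15, max_extra_gops=4):
    """Grow the re-encoding interval until the average budget is sufficient.

    n_basic: number of pictures in the basic encoding interval.
    min_bits_per_picture: hypothetical threshold (the "predetermined amount").
    Returns the number of pictures in the (possibly extended) interval.
    """
    n_total = n_basic
    limit = n_basic + max_extra_gops * gop_length
    while n_total < limit:
        budget = reencode_budget(n_total, bits_per_picture, occ_start, occ_end)
        if budget >= min_bits_per_picture * n_total:
            break  # enough code amount is available on average
        n_total += gop_length  # spread the occupancy deficit over more pictures
    return n_total
```

With 15 basic pictures, 1000 bits arriving per picture, a start occupancy of 2000 and a required end occupancy of 8000, the deficit of 6000 bits leaves only 600 bits per picture on average. Extending the interval to 45 pictures raises the average to about 867 bits, because the same deficit is shared by more pictures.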
Where an effect is to be performed for the reference encoding range, the control means may set the re-encoding interval extended from the basic encoding range in response to a type of the effect to be performed for the reference encoding interval to control the re-encoding process by the re-encoding means.
The control means may set the re-encoding interval extended from the basic encoding interval based on a rate of rise of a degree of difficulty in encoding of the reference encoding range to control the re-encoding process by the re-encoding means.
Where an effect is to be performed for the reference encoding range, the control means may control the allocation of the generation code amount in the re-encoding process in the re-encoding interval by the re-encoding means in response to a type of the effect to be performed for the reference encoding range so that the generation code amount in the reference encoding range may be increased.
The control means may control the allocation of the generation code amount in the re-encoding process of the re-encoding interval by the re-encoding means based on a rate of rise of a degree of difficulty in encoding of the reference encoding range so that the generation code amount in the reference encoding range may be increased.
The control means may acquire information regarding the occupancy of the first and second compressed image data and control the allocation of the generation code amount in the re-encoding process in the re-encoding interval by the re-encoding means based on the information regarding the occupancy.
The information regarding the occupancy may be information regarding the occupancy of pictures corresponding to top and end positions of the re-encoding interval.
The information regarding the occupancy may be multiplexed in a user data region of the first and second compressed image data, and the control means may acquire the information regarding the occupancy multiplexed in the user data region of the first and second compressed image data.
The control means may acquire information with which an apparatus which has encoded pictures corresponding to top and end positions of the re-encoding interval from within the first and second compressed image data in the past can be specified, and detect a position at which the information regarding the occupancy is described using the acquired information with which the apparatus can be specified.
The control means may acquire information indicative of a picture type of pictures corresponding to top and end positions of the re-encoding interval from within the first and second compressed image data, and detect a position at which the information regarding the occupancy is described using the acquired information indicative of the picture type.
The control means may decide whether or not the first and second compressed image data are format converted and acquire, where it is decided that the first and second compressed image data are format converted, information indicative of a picture type of pictures corresponding to top and end positions of the re-encoding interval and then detect a position at which the information regarding the occupancy is described using the acquired information indicative of the picture types.
The information regarding the occupancy may be recorded in an associated relationship with the first and second compressed image data on a predetermined recording medium, and the control means may acquire the information regarding the occupancy from the recording medium.
The information processing apparatus may further include acquisition means for acquiring a code amount of the first compressed image data in the proximity of a start point of the re-encoding range and the second compressed image data in the proximity of an end point of the first range, analysis means for analyzing, based on the code amounts acquired by the acquisition means, a first locus of a virtual buffer occupation amount where it is assumed that an occupation amount of a virtual buffer at the start point reaches a lower limit value when the re-encoding process is performed for the first compressed image data in the proximity of the start point and analyzing a second locus of the virtual buffer occupation amount where it is assumed that the occupation amount of the virtual buffer at a picture next to the end point reaches an upper limit value when the re-encoding process is performed for the second compressed image data in the proximity of the end point, and determination means for determining, based on the first and second loci analyzed by the analysis means, the upper limit value to the occupation amount of the virtual buffer at the start point and the lower limit value to the occupation amount of the virtual buffer at the end point when the first range is re-encoded.
The determination means may determine an occupation amount of the virtual buffer at the start point in a third locus obtained by correcting the first locus in a direction in which the occupation amount of the virtual buffer increases by a code amount of a maximum underflow in a region which is not included in the re-encoding range from within the first locus as an upper limit value to the occupation amount of the virtual buffer at the start point when the re-encoding process is performed for the re-encoding range.
The determination means may determine an occupation amount of the virtual buffer at the end point in a third locus obtained by correcting the second locus in a direction in which the occupation amount of the virtual buffer decreases by a code amount determined from a period of time in which the occupation amount of the virtual buffer reaches a maximum value in a region which is not included in the re-encoding range from within the second locus and an integrated value of maximum bit rates as a lower limit value to the occupation amount of the virtual buffer at the end point when the re-encoding process is performed for the re-encoding range.
An information processing method and a program according to the first aspect of the present invention include a re-encoding interval setting step of setting a re-encoding interval extended from a basic encoding interval based on a generation code amount upon a re-encoding process, a decoding step of performing a decoding process for a first decoding interval including a first editing point set to the first compressed image data to produce a first non-compressed image signal and performing a decoding process for a second decoding interval including a second editing point set to the second compressed image data to produce a second non-compressed image signal, a re-encoding step of performing a re-encoding process for the re-encoding interval set by the process at the re-encoding interval setting step in a third non-compressed image signal wherein the first and second non-compressed image signals are spliced at the first and second editing points to produce third compressed image data, and an editing step of alternatively outputting compressed image data in an interval for which the re-encoding process is not performed from within the first and second compressed image data and the third compressed image data produced by the re-encoding process through the process at the re-encoding step to produce edited compressed image data.
In the first aspect of the present invention, a re-encoding interval extended from a basic encoding interval is set based on a generation code amount upon a re-encoding process, and a decoding process for a first decoding interval including a first editing point set to the first compressed image data is performed to produce a first non-compressed image signal. Further, a decoding process for a second decoding interval including a second editing point set to the second compressed image data is performed to produce a second non-compressed image signal. Then, a re-encoding process is performed for the re-encoding interval set in a third non-compressed image signal wherein the first and second non-compressed image signals are spliced at the first and second editing points to produce third compressed image data. Then, compressed image data in an interval for which the re-encoding process is not performed from within the first and second compressed image data and the third compressed image data produced by the re-encoding process are alternatively outputted to produce edited compressed image data.
According to a second aspect of the present invention, there is provided an information processing apparatus for executing a process of splicing and re-encoding first compressed image data with second compressed image data, including decoding means for performing a decoding process for a first decoding interval including a first editing point set to the first compressed image data to produce a first non-compressed image signal and performing a decoding process for a second decoding interval including a second editing point set to the second compressed image data to produce a second non-compressed image signal, re-encoding means for performing a re-encoding process for a predetermined re-encoding interval of a third non-compressed image signal wherein the first and second non-compressed image signals are spliced at the first and second editing points to produce third compressed image data, and control means for setting the re-encoding interval extended from a basic encoding interval based on a generation code amount upon the re-encoding process by the re-encoding means to control the re-encoding process by the re-encoding means.
An information processing method and a program according to the second aspect of the present invention include a re-encoding interval setting step of setting a re-encoding interval extended from a basic encoding interval based on a generation code amount upon the re-encoding process, a decoding step of performing a decoding process for a first decoding interval including a first editing point set to the first compressed image data to produce a first non-compressed image signal and performing a decoding process for a second decoding interval including a second editing point set to the second compressed image data to produce a second non-compressed image signal, and a re-encoding step of performing the re-encoding process for the re-encoding interval set by the process at the re-encoding interval setting step in a third non-compressed image signal wherein the first and second non-compressed image signals are spliced at the first and second editing points to produce third compressed image data.
In the second aspect of the present invention, a re-encoding interval extended from a basic encoding interval is set based on a generation code amount upon the re-encoding process, and a decoding process is performed for a first decoding interval including a first editing point set to the first compressed image data to produce a first non-compressed image signal. Further, a decoding process is performed for a second decoding interval including a second editing point set to the second compressed image data to produce a second non-compressed image signal. Then, the re-encoding process is performed for the re-encoding interval set by the process at the re-encoding interval setting step in a third non-compressed image signal wherein the first and second non-compressed image signals are spliced at the first and second editing points to produce third compressed image data.
[Effects of the Invention]
According to the first aspect of the present invention, compression encoded data can be edited, and particularly since the range for re-encoding can be extended based on a generation code amount provided to the reference encoding range, deterioration of the image in the proximity of the editing point can be prevented without allowing the buffer to fail.
According to the second aspect of the present invention, compression coded data can be spliced and re-encoded, and particularly since the range for re-encoding can be extended based on a generation code amount provided to the reference encoding range, deterioration of the image in the proximity of the editing point can be prevented without allowing the buffer to fail.