1. Field of the Invention
The present invention relates to a signal processing apparatus and method, and particularly to a signal processing apparatus and method suitable for processing moving image stream data that has been compression encoded.
2. Related Background Art
Recently, with the progress of digital signal processing technology, a large amount of digital information such as moving images, still images, and sound can be encoded with high efficiency and then recorded on a small magnetic medium or a small optical medium, or transmitted through a communication medium. Such technology has also been applied to the development of image pickup apparatuses that can easily capture a high-quality video image and immediately output it to an information medium.
Particularly, MPEG encoding technology is used in recent moving image encoding. In MPEG encoding, the encoding rate can be greatly reduced by an intra-frame encoding method, in which the encoding uses the correlation within a picture, and an inter-frame encoding method, in which the encoding uses the correlation between the preceding picture and the succeeding picture. Therefore, MPEG encoding is widely used in video image reproducing apparatuses, typified by DVD video players, and in image pickup apparatuses such as video cameras.
In the television standard used in Japan and the United States, the frame rate of a video signal is defined as about 30 frames per second.
On the other hand, the frame rate of the film material used in a movie is usually about 24 frames per second.
Therefore, in order to handle the film-material video image having 24 frames per second in a video apparatus conforming to the television standard, there is a well-known technology in which the video image having 24 frames per second is converted into a signal having about 30 frames per second and then MPEG encoded (for example, see Japanese Patent Application Laid-Open No. 2000-41244).
A method called 2-3 pull-down is well known as the technology in which the video image having 24 frames per second is converted into the signal having about 30 frames per second. The technology is frequently used when the film material such as the movie is converted into the video image for the television.
FIG. 6 is a view for explaining a 2-3 pull-down process.
In FIG. 6, the reference numerals 601 to 610 denote a state of each field of a video image having about 24 frames per second, and the fields 601 to 610 constitute a field string 600A. One frame in the field string 600A includes two fields having an interlace format. The reference numeral 600B denotes a field string having about 30 frames per second into which the field string 600A having about 24 frames per second is converted.
That is, every two fields of the input video image are converted alternately into two fields or three fields of the output video image: the fields 601 and 602 of the field string 600A are converted into the fields 611 and 612 of the field string 600B, and the fields 603 and 604 of the field string 600A are converted into the fields 613, 614, and 615 of the field string 600B. By this repetition, five output fields are generated for every four input fields, which realizes the conversion of 24 frames per second into 30 frames per second.
At this point, when two fields are converted into three fields, the first field and the third field have the same data. For example, the fields 613 and 615 are both generated based on the field 603. Similarly, the fields 618 and 620 have the same data.
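The field mapping described above can be sketched in a few lines of Python. This is only an illustrative model, not the conversion circuit of any particular apparatus: the function name `pulldown_2_3` is hypothetical, fields are represented simply by their reference numerals, and field parity (top or bottom) is not modeled.

```python
def pulldown_2_3(fields):
    """Map input fields (two per frame) to 2-3 pull-down output fields.

    Assumption for illustration: frames alternate between a two-field
    output and a three-field output, starting with two fields.
    """
    if len(fields) % 2:
        raise ValueError("expected two fields per input frame")
    out = []
    for i in range(0, len(fields), 2):
        first, second = fields[i], fields[i + 1]
        out.extend([first, second])
        # Every second input frame is expanded to three fields by
        # repeating its first field as the third output field.
        if (i // 2) % 2 == 1:
            out.append(first)
    return out


source = [601, 602, 603, 604, 605, 606, 607, 608]
converted = pulldown_2_3(source)
print(converted)
# [601, 602, 603, 604, 603, 605, 606, 607, 608, 607]
```

Reading the ten output positions as the fields 611 to 620, the third and fifth entries both come from the field 603, and the eighth and tenth entries carry the same data, which agrees with the description. Eight input fields yield ten output fields, that is, the 4:5 ratio that turns 24 frames per second into 30.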
When the video signal whose frame rate has been converted is recorded by MPEG encoding, parameters referred to as “top field first” and “repeat first field” are sometimes used in order to remove the redundancy of the 2-3 pull-down video image.
When the parameter of the repeat first field is “0”, the two-field configuration is indicated. When the parameter of the repeat first field is “1”, the three-field configuration is indicated. As described above, in the case of the three-field configuration, the first field and the third field have the same data. Therefore, no encoded data is actually generated for the third field during the MPEG encoding; instead, the decoded data of the first field is output again during decoding.
On the other hand, the parameter of the top field first indicates whether the top field or the bottom field comes first in temporal order in the original video signal having 24 frames per second. When the top field first is “0”, the bottom field comes first. When the top field first is “1”, the top field comes first.
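How a decoder might expand these two parameters into an output field sequence can be sketched as follows. The `CodedFrame` type, the `display_fields` function, and the four-frame flag pattern are assumptions introduced purely for illustration; they are not taken from the cited reference or from any specific decoder.

```python
from dataclasses import dataclass


@dataclass
class CodedFrame:
    top: str                 # decoded top field (placeholder label)
    bottom: str              # decoded bottom field (placeholder label)
    top_field_first: int     # 1: top field first, 0: bottom field first
    repeat_first_field: int  # 1: three-field configuration, 0: two-field


def display_fields(frame):
    """Return the fields of one coded frame in output order."""
    if frame.top_field_first:
        first, second = frame.top, frame.bottom
    else:
        first, second = frame.bottom, frame.top
    fields = [first, second]
    if frame.repeat_first_field:
        # The copy field: the first field is output again as the third.
        fields.append(first)
    return fields


# Illustrative four-frame cycle (assumed flag values).
stream = [
    CodedFrame("A_top", "A_bot", 1, 0),
    CodedFrame("B_top", "B_bot", 1, 1),
    CodedFrame("C_top", "C_bot", 0, 0),
    CodedFrame("D_top", "D_bot", 0, 1),
]
decoded = [f for frame in stream for f in display_fields(frame)]
print(decoded)
# ['A_top', 'A_bot', 'B_top', 'B_bot', 'B_top',
#  'C_bot', 'C_top', 'D_bot', 'D_top', 'D_bot']
```

With this assumed flag pattern, four coded frames expand to ten output fields, and top and bottom fields alternate throughout the output.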
In such a conventional apparatus, there is a problem that the combined use of these parameters complicates the stream and imposes a load on the decoding process. That is, in order to decode the generated stream correctly, the decoding side must identify the parameters and insert a copy field. Therefore, normal reproduction sometimes cannot be performed in a decoding apparatus or system that implements only part of MPEG.
A simpler conventional method is to directly encode the 2-3 pull-down video image. This method has the advantage that no additional load is imposed on the decoding process, because the video image having 30 frames per second after the pull-down is directly encoded as the stream.
However, when the video signal having 30 frames per second after the 2-3 pull-down is recorded by MPEG encoding, it is difficult at reproduction to convert it back into the video signal having 24 frames per second, which has no redundancy, as it was before the pull-down.
This is because, when the inserted frame is referred to by other frames through the inter-image correlation encoding that is characteristic of MPEG, the frames referring to the inserted frame can no longer be decoded if the inserted frame is simply removed, so that the inserted frame cannot be removed.
Accordingly, in order to obtain an MPEG-encoded video signal having 24 frames per second from a video signal that was MPEG encoded after the 2-3 pull-down, it is necessary to first decode the video signal at 30 frames per second, remove the inserted frames after the decoding, and then perform the encoding again. Therefore, there are problems that the process takes a very long time and that image quality is degraded by the re-encoding.
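The reference dependency that prevents direct removal of an inserted frame can be illustrated with a toy model. The frame names, the reference table, and the function `can_drop` below are hypothetical and only show the reasoning; an actual MPEG stream carries its prediction structure in a different form.

```python
def can_drop(frame, references):
    """Return True if no other frame uses `frame` as a prediction reference."""
    return not any(
        frame in refs
        for name, refs in references.items()
        if name != frame
    )


# Hypothetical stream: each entry lists the frames it is predicted from.
# "F2" stands for a frame inserted by the pull-down.
references = {
    "F0": [],      # intra-coded, refers to nothing
    "F1": ["F0"],
    "F2": ["F0"],  # inserted frame
    "F3": ["F2"],  # predicted from the inserted frame
    "F4": ["F3"],
}

print(can_drop("F2", references))  # False
print(can_drop("F4", references))  # True
```

In this model the frame F2 cannot be removed from the encoded stream because the frame F3 is predicted from it, which is the situation that forces the decode, remove, and re-encode sequence described above.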