A technology for recording a transport stream of video data encoded in accordance with MPEG2 (Moving Picture Experts Group Phase 2) on a recording medium, for example an optical disc, is described in, for example, Patent Document 1 (Japanese Patent Laid-Open Publication No. 2002-158972).
FIG. 1 shows the picture format of a frame structure in which a video signal is encoded in accordance with MPEG. In FIG. 1, dark stripes represent lines of a top field (top_field), whereas white stripes represent lines of a bottom field (bottom_field). In the NTSC transmission picture signal format having an aspect ratio of 4:3, one frame has a total of 480 lines, consisting of 240 lines of the top field and 240 lines of the bottom field. The number of pixels in the horizontal direction is 704. The one-bit flag top_field_first in the header information of the picture layer represents which of the top field and the bottom field is displayed chronologically first. When top_field_first=1, the top field is displayed chronologically first.
FIG. 2 shows the spatial relation between the format of an MPEG decoded picture and the format of a transmission picture. The format of the transmission picture is the NTSC format having an aspect ratio of 4:3. The effective pixel area (the pixel area of an MPEG decoded picture) of one frame is composed of 704 pixels×480 lines. The transmission picture format also includes non-effective areas, namely a horizontal blanking area and a vertical blanking area.
In addition to the foregoing flag top_field_first, another flag, repeat_first_field, is also transmitted. The flag repeat_first_field represents whether there is a repeat field. A film material such as a movie is data composed of 24 frames per second. In contrast, a video signal, for example an NTSC format video signal, has a format of 30 frames per second. Thus, when a film material is converted into a video signal, a process for generating 30 frames from 24 frames is required. Such a process includes a process for converting two fields into three fields in accordance with a predetermined conversion pattern, and is therefore generally referred to as 2:3 pull down. In other words, the first field is automatically repeated twice in every five frames. As a result, 24 frames are converted into 30 frames.
When a video signal that has been obtained by the foregoing 2:3 pull down process is compressed in accordance with the MPEG, since information of fields (repeat fields) that have been inserted for increasing the number of frames is redundant, the video signal is encoded so that the repeat fields are removed and the compression efficiency is improved. A process for detecting repeat fields of video data of which 24 frames per second are converted into 30 frames per second by the 2:3 pull down process, removing the repeat fields, and decreasing the number of frames to 24 frames per second is referred to as inverse 2:3 pull down process.
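The inverse 2:3 pull down process can be illustrated with a minimal sketch. The function name `inverse_pulldown`, the representation of a field as a (picture, parity) pair, and the detection rule (a field is treated as a repeat field when it is identical to the field two positions earlier) are assumptions made for illustration only. A practical detector would compare pixel data and would also have to cope with still scenes and irregular cadences, which this sketch does not attempt.

```python
def inverse_pulldown(fields):
    """Drop repeat fields and pair the remaining fields into frames.

    `fields` is a list of (picture, parity) tuples in display order.
    Assumption: a field is a repeat field if it equals the field two
    positions earlier in the input sequence.
    """
    kept = [
        field
        for i, field in enumerate(fields)
        if not (i >= 2 and field == fields[i - 2])
    ]
    # Every two surviving fields form one frame of the restored sequence.
    return [(kept[i], kept[i + 1]) for i in range(0, len(kept) - 1, 2)]


# Hypothetical 10-field input: film pictures A to D after 2:3 pull down.
video_fields = [
    ("A", "top"), ("A", "bottom"), ("A", "top"),
    ("B", "bottom"), ("B", "top"),
    ("C", "bottom"), ("C", "top"), ("C", "bottom"),
    ("D", "top"), ("D", "bottom"),
]
film_frames = inverse_pulldown(video_fields)
print(len(video_fields), "fields ->", len(film_frames), "frames")
```

In this example, the two repeat fields are removed and the ten input fields are regrouped into four frames, one per original film picture.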
Next, with reference to FIG. 3, the process for converting a film material of 24 frames per second into an NTSC format television material of 30 frames per second, namely the 2:3 pull down process, will be described. A film material is composed of 24 frames per second. Two fields (first and second fields) of the same picture are made from each frame of the film material. As a result, a picture signal of 48 fields per second is generated. Thereafter, four frames (eight fields) of the film material are converted into five frames (10 fields) of a video signal, for example an NTSC format video signal.
In FIG. 3, the chronologically last field of each group of three fields surrounded by a trapezoid is a field that is repeated to increase the number of fields, namely a repeat first field. The repeat first field occurs twice in every five frames. The video signal for which the 2:3 pull down process has been performed is accompanied by the two flags top_field_first and repeat_first_field. In the frame structure, the flag top_field_first represents whether the first field is the top field or the bottom field. The flag repeat_first_field represents whether there is a repeat field.
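The relation between the 2:3 pull down process and the two flags can be sketched as follows. The function name `pulldown_2_3` and the choice of starting the cadence with a three-field frame whose first field is the top field are assumptions made for illustration; the actual phase of the pattern shown in FIG. 3 may differ.

```python
def pulldown_2_3(film_frames):
    """Expand film frames into video fields with per-frame flags.

    Returns (fields, flags). `fields` is a list of (picture, parity)
    tuples in display order; `flags` holds top_field_first and
    repeat_first_field for each film frame.
    """
    fields = []
    flags = []
    top_first = True  # assumption: the first frame starts with the top field
    for index, picture in enumerate(film_frames):
        repeat = index % 2 == 0  # assumption: 3-field frames come first
        first, second = ("top", "bottom") if top_first else ("bottom", "top")
        fields.append((picture, first))
        fields.append((picture, second))
        if repeat:
            # The first field is displayed once more as the repeat field.
            fields.append((picture, first))
        flags.append({
            "top_field_first": int(top_first),
            "repeat_first_field": int(repeat),
        })
        if repeat:
            # An odd number of fields flips the parity of the next frame.
            top_first = not top_first
    return fields, flags


fields, flags = pulldown_2_3(["A", "B", "C", "D"])
for picture, flag in zip("ABCD", flags):
    print(picture, flag)
print(len(fields), "fields =", len(fields) // 2, "video frames")
```

Under these assumptions, four film frames yield ten fields (five video frames), the field parity alternates throughout the output, and repeat_first_field is 1 for two of every four film frames.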
As described above, when a video signal is encoded in accordance with the MPEG2 and the frame frequency of the NTSC format is 29.97 Hz, values of the two flags top_field_first and repeat_first_field are set for each picture. In addition, frame_rate of the sequence header is set for 29.97 Hz.
In addition to the NTSC format, the PAL format is also known as a television format. The PAL format having an aspect ratio of 4:3 has a frame frequency of 25 Hz and a structure in which one frame is composed of 720 pixels×576 lines. In the PAL format, basically, top_field_first=1, repeat_first_field=0, and frame_rate of the sequence header=25 Hz are set. In other words, a top field and a bottom field are made from one frame of a movie. The obtained video signal is recorded on a recording medium. Thus, in the PAL format, the reproduction speed of the video signal is 25/24 times that of the original movie.
As described above, in a standard resolution format, the NTSC format differs from the PAL format in both the picture size and the frame rate. However, in a high resolution (HD: High Definition) format, for example, the picture size of the NTSC format is the same as that of the PAL format. Thus, when a movie source is converted into a video signal in each format, it is necessary to convert only the frame rate. These two formats, whose frame rates are different and whose picture sizes are common, are referred to as the NTSC range and the PAL range.
Conventionally, the format of an original video signal converted into an NTSC video signal was different from the format of an original video signal converted into a PAL video signal. Thus, to author a recording medium on which for example a movie source is recorded, video signals that can be suitably converted into both the formats should be prepared. Thus, it was laborious to handle video signals in both the formats.
Recently, a progressive format display monitor has been used. So far, in the NTSC format, it was difficult to convert a 29.97 Hz interlaced moving picture into a 59.94 (=2×30×(1000/1001)) Hz progressive moving picture and display the converted picture. Since a video signal for which the 2:3 pull down process had been performed may have been irregularly encoded, it was not easy to detect a progressive frame from a decoded moving picture of an MPEG2 video stream.
In addition, in the PAL format, a movie source of 24 frames per second is reproduced at a frame rate of 25 Hz. Thus, the reproduction speed of the video signal is 25/24 times that of the original movie. As a result, the pitch of the audio becomes higher.
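The size of this effect can be estimated with simple arithmetic. The following sketch assumes that the audio is sped up together with the picture without pitch correction, and uses a hypothetical 120-minute movie as an example; the variable names are illustrative only.

```python
import math
from fractions import Fraction

# Speed ratio when 24 frame-per-second film is reproduced at 25 Hz.
speed_ratio = Fraction(25, 24)

# Hypothetical example: running time of a 120-minute movie in the PAL format.
film_minutes = 120
pal_minutes = float(film_minutes / speed_ratio)

# Pitch rise in semitones, assuming audio is sped up without pitch correction.
pitch_shift_semitones = 12 * math.log2(speed_ratio)

print(f"speed-up: {float(speed_ratio - 1) * 100:.2f} %")
print(f"running time: {pal_minutes:.1f} minutes")
print(f"pitch shift: {pitch_shift_semitones:.2f} semitones")
```

Under these assumptions, the reproduction is about 4.17 % faster, the 120-minute movie runs for 115.2 minutes, and the pitch rises by roughly 0.71 semitones.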
Thus, an object of the present invention is to provide an information processing apparatus and method, a program, and a recording medium that allow encoding to be performed in common with the NTSC range and the PAL range, a moving picture of the NTSC range to be easily converted into a 59.94 Hz progressive moving picture, and a reproduction speed of the PAL range to be prevented from being increased by 25/24 times over that of the original.