The present invention relates to a replay apparatus, a replay method, a recording medium, and a computer program and, in particular, to a replay apparatus, a replay method, a recording medium, and a computer program for precisely determining total play time when music is fast-forwarded or rewound during replay.
FIG. 1 illustrates the structure of a known replay apparatus 1 that replays music. The replay apparatus 1 includes a demultiplexer 11, an elementary stream (ES) data buffer 12, and a decoder 13.
The replay apparatus 1 receives an input stream, into which audio elementary streams, obtained by encoding pulse code modulated music data, are multiplexed. The input stream is supplied to the demultiplexer 11.
The demultiplexer 11 separates the audio elementary stream from the input stream, and stores the audio elementary stream in the ES data buffer 12. The decoder 13 decodes the audio elementary stream stored in the ES data buffer 12 on a per audio access unit basis, and outputs (replays) the resulting pulse code modulation (PCM) data on a per audio access unit basis.
FIG. 2 illustrates the structure of an audio elementary stream formed of fixed-length encoded data. The audio elementary stream containing fixed-length encoded data includes a plurality of fixed-length frames. Each frame includes a header section and a data section.
The header section is formed of a frame header. The frame header has information representing each frame, and starts with a synchronization bit. Written in the frame header are a sampling frequency and a bit rate used to encode the data of the data section, and information as to the presence or absence of a padding representing data used to align a frame size with a byte unit.
The data section is formed of a main data frame that represents data of one audio access unit that is obtained by fixed-length coding PCM data having a fixed length (a fixed number of samples) on a frame by frame basis.
Since the encoded PCM data has a fixed length in the audio elementary stream containing the fixed-length coded data shown in FIG. 2, the data section of each fixed-length frame is constructed of the main data frame only in that frame. In an audio elementary stream having fixed-length coded data, the number of main data frames from the first frame to the start position of each subsequent frame equals the number of frames from the first frame to the start position of each subsequent frame.
The currently widely used Moving Picture Experts Group (MPEG) 1 Audio Layer 3 (MP3) format employs a variable-length coding method (see non-patent document ISO/IEC 11172-3 International Standard MPEG-1 Audio (1993)). FIG. 3 illustrates the structure of an audio elementary stream constructed of variable-length coded data. The audio elementary stream constructed of variable-length coded data has a plurality of fixed-length frames. Each frame includes a header section, a side information section, and a data section.
The header section contains a frame header. The frame header, having a constant duration of time, starts with a synchronization bit. Written in the frame header are a sampling frequency and a bit rate used to encode the data of the data section, and information as to the presence or absence of a padding representing data used to convert a frame size into units of bytes.
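As an illustration of the header fields named above, the following is a minimal sketch of reading the sampling frequency, the bit rate, and the padding flag from a four-byte frame header. It assumes the MPEG-1 Layer III bit layout and lookup tables as commonly described for ISO/IEC 11172-3 (an 11-bit synchronization pattern rather than a single bit); the names FrameHeader and parse_frame_header are hypothetical and do not come from the apparatus described here.

```python
from typing import NamedTuple

# Lookup tables for MPEG-1 Layer III only (assumption: other versions and
# layers use different tables and are rejected below).
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


class FrameHeader(NamedTuple):
    bitrate_bps: int
    sample_rate_hz: int
    padding: bool
    frame_bytes: int


def parse_frame_header(header: bytes) -> FrameHeader:
    """Parse the first four bytes of an MPEG-1 Layer III frame (hypothetical helper)."""
    if len(header) < 4:
        raise ValueError("a frame header needs at least 4 bytes")
    word = int.from_bytes(header[:4], "big")

    # Bits 31..21: synchronization pattern, all ones.
    if (word >> 21) != 0x7FF:
        raise ValueError("synchronization pattern not found")
    version = (word >> 19) & 0x3   # 0b11 means MPEG-1
    layer = (word >> 17) & 0x3     # 0b01 means Layer III
    if version != 0b11 or layer != 0b01:
        raise ValueError("this sketch handles MPEG-1 Layer III only")

    bitrate_kbps = BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate_hz = SAMPLE_RATES_HZ[(word >> 10) & 0x3]
    if bitrate_kbps is None or sample_rate_hz is None:
        raise ValueError("free-format or reserved value is not supported")
    padding = bool((word >> 9) & 0x1)

    # Frame size in bytes; the padding flag adds one byte.
    frame_bytes = 144 * bitrate_kbps * 1000 // sample_rate_hz + int(padding)
    return FrameHeader(bitrate_kbps * 1000, sample_rate_hz, padding, frame_bytes)
```

For example, under these assumptions parse_frame_header(bytes([0xFF, 0xFB, 0x90, 0x00])) reports 128000 bit/s, 44100 Hz, no padding, and a frame size of 417 bytes.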
Written in the side information section is a start position of a main data frame of each frame.
Each data section contains a main data frame representing data that is obtained by variable-length coding the PCM data of one audio access unit having a fixed length (i.e., having a predetermined number of samples) with respect to each frame. For example, the data that is obtained by variable-length coding the PCM data having a fixed length in connection with a frame #1 is a main data frame #1. The data that is obtained by variable-length coding the PCM data having a fixed length in connection with a frame #2 is a main data frame #2. The data that is obtained by variable-length coding the PCM data having a fixed length in connection with a frame #3 is a main data frame #3.
Since the main data frame in the audio elementary stream containing variable-length coded data is variable in length as shown in FIG. 3, the size of the data section is different from the size of the main data frame. The data section of the frame is not necessarily formed of the main data frame only. As shown in FIG. 3, for example, the data section of the frame #1 contains the main data frame #1 for the frame #1, the main data frame #2 for the frame #2, and a portion of the main data frame #3 for the frame #3.
In an audio elementary stream formed of variable-length coded data, the number of frames from the first frame to the start position of each of the frames is different from the number of main data frames from the first frame to the start position of the same frame.
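The mismatch between the two counts can be sketched with a simplified model. The model assumes that every frame has a data section of the same size and that the main data frames are packed back to back across the data sections, as FIG. 3 suggests; the function name main_frames_before and the byte sizes in the usage note are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left
from typing import Sequence


def main_frames_before(frame_index: int,
                       data_section_bytes: int,
                       main_sizes: Sequence[int]) -> int:
    """Count the main data frames that begin before the data section of the
    frame with the given 0-based index (simplified, hypothetical model)."""
    # Start offset of each main data frame in the concatenated data sections.
    starts = []
    offset = 0
    for size in main_sizes:
        starts.append(offset)
        offset += size

    # Offset at which the data section of the requested frame begins.
    boundary = frame_index * data_section_bytes

    # starts is sorted, so bisect_left yields the number of starts < boundary.
    return bisect_left(starts, boundary)
```

With 100-byte data sections and main data frames of 30, 40, 50, 60, and 20 bytes, the first data section holds main data frames #1 and #2 and part of #3. In this model, main_frames_before(1, 100, [30, 40, 50, 60, 20]) gives 3, although only one frame precedes the second frame.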
The fast-forward replay operation and the rewind and replay operation performed by the replay apparatus 1 of FIG. 1 will now be discussed with reference to FIG. 4.
The fast-forward replay operation and the rewind and replay operation are performed on a per frame basis. When a user commands the replay apparatus 1 to replay the frame that is N frames away from a frame #3 while the frame #3 is being replayed, as shown in FIG. 4, the decoder 13 of the replay apparatus 1 jumps from the frame #3 to a frame #(N+3) and replays the frame #(N+3). Here, N is a positive or negative integer. If N is a positive integer, the music is fast-forwarded. If N is a negative integer, the music is rewound and replayed.
To display total play time on a display (not shown) during the fast-forward replay operation or the rewind and replay operation of the music, the replay apparatus 1 must calculate the play time from the first frame #1 to the start position of the frame #(N+3) to which the replay apparatus 1 has jumped.
As already shown in FIG. 2, the number of main data frames from the first frame to the start position of each subsequent frame equals the number of frames from the first frame to the start position of the same frame in the audio elementary stream having the fixed-length coded data. When one of the fast-forward replay operation and the rewind and replay operation is performed, the replay apparatus 1 detects the exact number of main data frames contained between the first frame #1 and the frame #(N+3) as a jump target.
More specifically, the replay apparatus 1 numbers the frames in succession, starting from the first frame, and then detects the frame number #(N+3) of the jump target, thereby detecting (N+2) as the number of main data frames contained between the first frame #1 and the start position of the frame #(N+3) as the jump target. Based on the number of main data frames (N+2) and the sampling frequency described in the frame header, the replay apparatus 1 calculates total play time from the first frame #1 to the frame #(N+3) as the jump target.
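The calculation for the fixed-length case can be sketched as follows. The sketch assumes 1-based frame numbers and a known, fixed number of PCM samples per frame; the function names jump_target and play_time_seconds are hypothetical, and the value of 1152 samples per frame in the usage note is the MPEG-1 Layer III figure, used only as an example.

```python
def jump_target(current_frame_number: int, n: int) -> int:
    """Frame number reached by jumping n frames (n > 0: fast-forward, n < 0: rewind)."""
    target = current_frame_number + n
    if target < 1:
        raise ValueError("jump target lies before the first frame")
    return target


def play_time_seconds(target_frame_number: int,
                      samples_per_frame: int,
                      sample_rate_hz: int) -> float:
    """Play time from the first frame #1 to the start of the target frame.

    Valid only when each frame carries exactly one main data frame, that is,
    for fixed-length coded data.
    """
    if target_frame_number < 1:
        raise ValueError("frame numbers start at 1")
    # Frames #1 .. #(target - 1) lie before the start position of the target.
    frames_before_target = target_frame_number - 1
    return frames_before_target * samples_per_frame / sample_rate_hz
```

For a jump of N = 100 from the frame #3, jump_target(3, 100) gives the frame #103, so N+2 = 102 main data frames precede the target. With 1152 samples per frame at 44100 Hz, play_time_seconds(103, 1152, 44100) is roughly 2.66 seconds.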
However, in an audio elementary stream formed of variable-length coded data, the number of main data frames from the first frame to the start position of each of the frames is not necessarily equal to the number of frames from the first frame to the start position of the same frame. The replay apparatus 1 therefore has difficulty in detecting the exact number of main data frames contained between the first frame #1 and the frame #(N+3) as the jump target during the fast-forward replay operation or the rewind and replay operation. As a result, the exact total play time cannot be calculated. Each time the fast-forward replay operation or the rewind and replay operation is performed, the displayed total play time fluctuates.
It is particularly difficult for the replay apparatus 1 to calculate the total play time with a precision on the order of one audio access unit, namely, on the order of tens of milliseconds.