1. Field of the Invention
This invention relates to a multimedia information receiving apparatus that receives multimedia information prepared by multiplexing an audio signal and a moving image signal (moving picture signal or video signal) that are related to each other and compressed separately, separates and expands the audio signal and the moving image signal, and reproduces them synchronously. More particularly, the present invention relates to an improvement to a moving image decoding and reproducing apparatus adapted to be used as a multimedia information receiving apparatus, a moving image decoding and reproducing method, a time control method and a computer program product for decoding and reproducing a moving image.
2. Description of the Related Art
Generally, in a multimedia information transmission system adapted to transmit multimedia information obtained by multiplexing an audio signal and a moving image signal, the audio encoder and the moving image encoder of the transmitter are required to process the sound and the moving image, respectively, in a synchronized manner, while the audio decoder and the moving image decoder of the receiver are required to process their respective output signals so that the sound and the moving image are reproduced in a synchronized manner. In order to make it possible to reproduce the sound and the moving image synchronously, international coding standards such as MPEG-1 and MPEG-2 (MPEG: Moving Picture Experts Group) provide for the use of presentation time stamps (PTSs) as information for controlling the timing of signal output, so that the audio signal and the moving image signal are each reproduced and output by the respective decoder at the time designated by the PTS, as measured against a system time clock (STC).
For example, when an audio signal and a moving image signal are multiplexed according to MPEG-2, the bit stream obtained by encoding the audio signal and the moving image signal is divided into groups, which are then packetized to produce variable-length packets referred to as PES (packetized elementary stream) packets. At this time, a PTS is added to each of the PES packets. If a PES packet contains a moving image bit stream of a plurality of frames, only a PTS corresponding to the first frame is added to the PES packet. In other words, none of the remaining frames is provided with a PTS.
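The packetizing rule described above can be illustrated with a minimal sketch. It is a simplification, not the MPEG-2 packetizer itself: the names `PesPacket` and `packetize` are hypothetical, each frame is assumed to be given as a (display time, encoded bytes) pair, and packets are assumed to hold a fixed number of whole frames.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PesPacket:
    """Hypothetical, simplified model of a PES packet."""
    pts: int          # PTS of the first frame in the packet only
    payload: bytes    # concatenated bit stream of all frames in the packet
    frame_count: int  # number of frames carried (illustrative bookkeeping)


def packetize(frames: List[Tuple[int, bytes]], frames_per_packet: int) -> List[PesPacket]:
    """Group frames into packets, keeping only the first frame's PTS per packet."""
    if frames_per_packet < 1:
        raise ValueError("frames_per_packet must be at least 1")
    packets = []
    for start in range(0, len(frames), frames_per_packet):
        group = frames[start:start + frames_per_packet]
        first_pts = group[0][0]  # the time stamps of the other frames are dropped
        payload = b"".join(data for _, data in group)
        packets.append(PesPacket(first_pts, payload, len(group)))
    return packets
```

With two frames per packet, the display time of every second frame is not transmitted and has to be recovered by the receiver in some other way.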
However, with the moving image encoding method of MPEG-2, a PES packet scarcely ever contains a moving image bit stream of a plurality of frames, because the moving image bit stream of a single frame is very long. Additionally, since any two consecutive frames are separated from each other by a constant interval, it is possible to accurately estimate the display time of each of the remaining frames from the PTS of the first frame even if not every frame is provided with a PTS.
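The estimate described above can be sketched as follows, assuming a constant frame rate. The function name `estimate_frame_pts` and its parameters are hypothetical; the sketch relies on the fact that an MPEG-2 PTS is a 33-bit count of 90 kHz clock ticks.

```python
PTS_CLOCK_HZ = 90_000   # a PTS counts ticks of a 90 kHz clock
PTS_MODULUS = 1 << 33   # a PTS is a 33-bit value and wraps around


def estimate_frame_pts(first_pts: int, frame_index: int,
                       frame_rate_num: int, frame_rate_den: int = 1) -> int:
    """Estimate the PTS of the frame_index-th frame (0 = first frame in the packet).

    Assumes consecutive frames are separated by a constant interval of
    frame_rate_den / frame_rate_num seconds.
    """
    if frame_index < 0 or frame_rate_num <= 0 or frame_rate_den <= 0:
        raise ValueError("invalid frame index or frame rate")
    # Integer arithmetic avoids accumulating rounding error across frames.
    offset_ticks = (frame_index * PTS_CLOCK_HZ * frame_rate_den) // frame_rate_num
    return (first_pts + offset_ticks) % PTS_MODULUS
```

For instance, at 25 frames per second each frame advances the estimate by 3600 ticks, so the third frame of a packet whose PTS is 1000 would be estimated at 8200.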
Meanwhile, work is under way on stipulating the MPEG-4 standard for encoding moving images for the purpose of mobile communications (radio communications) at a low transmission rate. The coming MPEG-4 video coding system will provide for the use of time information added to the header of each VOP (video object plane) frame in a bit stream for the purpose of indicating the time for reproducing the VOP. Note, however, that the time information is produced not by using the system time clock but by using a clock whose accuracy is specific to video.
Now, let us assume that the moving image bit stream of each frame is packetized into a single PES packet. If the moving image bit stream of a frame is short, the PES packet will also be short. However, because a PTS carrying 33 bits of actual data is added to the PES packet for that frame, the overhead (the additional data added for the purpose of multiplexing) becomes large relative to the packet length. The net result is a lowered overall transmission efficiency.
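The effect on transmission efficiency can be illustrated with a rough calculation. The sketch below is illustrative only: the function name `pes_overhead_ratio` is hypothetical, and the 14-byte header size is an assumption corresponding to a minimal MPEG-2 PES header carrying a PTS and no other optional fields (6 bytes of start code, stream ID and length, 3 bytes of flags and header length, and 5 bytes encoding the 33-bit PTS).

```python
# Assumption: minimal PES header with a PTS and no other optional fields.
PES_HEADER_BYTES_WITH_PTS = 14


def pes_overhead_ratio(payload_bytes: int,
                       header_bytes: int = PES_HEADER_BYTES_WITH_PTS) -> float:
    """Return the fraction of a PES packet occupied by header overhead."""
    if payload_bytes <= 0 or header_bytes < 0:
        raise ValueError("payload must be positive and header non-negative")
    return header_bytes / (header_bytes + payload_bytes)


if __name__ == "__main__":
    # Hypothetical frame sizes, from a long frame down to a very short one.
    for size in (10_000, 1_000, 100, 14):
        print(f"{size:>6} payload bytes -> {pes_overhead_ratio(size):.1%} overhead")
```

Under these assumptions the overhead is negligible for a frame of several kilobytes but reaches half of the packet when the encoded frame is as short as the header itself.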
It is then conceivable to packetize a moving image bit stream of a plurality of frames into a single PES packet and add only a PTS corresponding to the first frame in the PES packet. However, according to MPEG-4, two consecutive frames are not necessarily separated by a constant interval. It is therefore not possible to accurately estimate the display time of each of the remaining frames, because only the first frame in the PES packet is provided with a PTS.
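The estimation problem can be made concrete with a small sketch. The function name `constant_interval_errors` and the sample values are hypothetical; times are expressed in 90 kHz PTS ticks, and the receiver is assumed to know only the first frame's PTS and to guess a constant frame interval.

```python
from typing import List


def constant_interval_errors(actual_pts: List[int],
                             assumed_ticks_per_frame: int) -> List[int]:
    """Return (estimated - actual) display time for each frame in a packet.

    The estimate extrapolates from the first frame's PTS using a constant
    interval, which is all a receiver can do when no other time stamp is sent.
    """
    if not actual_pts:
        return []
    first = actual_pts[0]
    return [first + i * assumed_ticks_per_frame - actual
            for i, actual in enumerate(actual_pts)]


if __name__ == "__main__":
    # Hypothetical frames with irregular spacing (0.1 s, 0.033 s, 0.2 s).
    actual = [0, 9000, 12000, 30000]
    print(constant_interval_errors(actual, 9000))  # non-zero entries are timing errors
```

Whatever constant interval is assumed, at least one frame in an irregularly spaced sequence is displayed at the wrong time, which is why the first-frame PTS alone is insufficient.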
As pointed out above, with MPEG-4 for mobile communications at a low transmission rate, if a moving image bit stream is PES packetized on a frame by frame basis and a PTS is added to the frame of each packet, the overhead becomes too large relative to the packet length. Additionally, if a moving image bit stream of a plurality of frames is put into a single packet and a PTS is added only to the leading frame in the packet, it is no longer possible to accurately estimate the display time of each of the remaining frames.