With the development of digital technology in recent years, multimedia data which have conventionally been processed only in analog form, for example, information including video and audio data, can be converted to digital information by digitization, compressive coding, and multiplexing, and the digital information can be used for recording and transmission. The coded and multiplexed multimedia information so generated, for example, video/audio coded and multiplexed information, is called a "system stream". Usually, a system stream is a one-dimensional array in which digitized, coded, and compressed video information and digitized, coded, and compressed audio information are alternately placed in prescribed units.
FIG. 27 shows an example of video/audio coded and multiplexed information. In the figure, reference numeral 2101 denotes a video sequence header showing that coded video information 2102 follows the header. When the coded (compressed) video information 2102 is subjected to inverse coding (expansion), an image having a reproduction time is generated. Reference numeral 2103 denotes an audio sequence header showing that coded audio information 2104 follows the header. When the coded (compressed) audio information 2104 is subjected to inverse coding (expansion), a voice having a reproduction time is generated. This audio/video coded and multiplexed information is also called "time-division-multiplexed information", and the processes for generating this information are a coding process and a multiplexing process.
In an apparatus for reproducing the audio/video coded and multiplexed information, for video/audio synchronization at reproduction, video information and audio information to be reproduced synchronously are accumulated in a buffer and, when the video information and the audio information are completely stored in the buffer, they are decoded and reproduced. Since the capacity (size) of an internal buffer in a reproduction apparatus is limited, the coded video information 2102 and the coded audio information 2104 shown in FIG. 27 must be alternately arranged by quantities within a range allowed by the size of the internal buffer. More specifically, when the size of the internal buffer is equivalent to 0.1 sec, the coded video information 2102 and the coded audio information 2104 must be alternately arranged in units that make the reproduction time not longer than 0.1 sec. Depending on the structure of the reproduction apparatus, there is a case where audio information and video information must be arranged alternately by quantities that make the video reproduction time and the audio reproduction time equal to each other.
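The buffer constraint described above can be illustrated with a minimal sketch. The names Unit and fits_buffer are hypothetical and do not appear in the prior art; the sketch merely assumes that each stored unit carries its kind and its reproduction time, and checks that no uninterrupted run of one kind exceeds the buffer size.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Unit:
    kind: str        # "video" or "audio" (hypothetical labels)
    duration: float  # reproduction time of the unit, in seconds


def fits_buffer(stream: List[Unit], buffer_sec: float, eps: float = 1e-9) -> bool:
    """Return True if every contiguous run of one kind fits in the buffer."""
    run_kind, run_total = None, 0.0
    for unit in stream:
        if unit.kind == run_kind:
            run_total += unit.duration       # same kind: the run grows
        else:
            run_kind, run_total = unit.kind, unit.duration  # a new run starts
        if run_total > buffer_sec + eps:     # eps absorbs float rounding
            return False
    return True


# Three 1/30 sec pictures followed by 0.1 sec of audio, as in FIG. 27.
balanced = [Unit("video", 1 / 30)] * 3 + [Unit("audio", 0.1)]
# A single 1 sec audio block, which a 0.1 sec buffer cannot hold.
oversized = [Unit("audio", 1.0)]
```

With a buffer equivalent to 0.1 sec, the first stream is accepted and the second is rejected.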
A description is now given of a conventional apparatus for coding video/audio information and outputting video/audio coded and multiplexed information as mentioned above.
FIG. 26 is a block diagram illustrating a video/audio coding and multiplexing apparatus according to the prior art. With reference to FIG. 26, this apparatus is provided with an image/voice input means 2001, an audio capture means 2002, an audio coding means 2003, a coded audio storage means 2004, a video capture means 2005, a video coding means 2006, a coded video storage means 2007, and a file management means 2008.
The image/voice input means 2001 is implemented by a video camera or the like. This means 2001 receives image and voice, and outputs analog video information and analog audio information, separately. The audio capture means 2002 receives the analog audio information output from the input means 2001, and outputs digital audio information comprising discrete digital data. The audio coding means 2003 receives the digital audio information output from the audio capture means 2002, compresses the information by coding it excluding redundant information, and outputs coded audio information per unit time. In this case, the unit time is 0.1 sec. The coded audio storage means 2004 adds an audio sequence header to the coded audio information output from the audio coding means 2003, and outputs it to the file management means 2008. The video capture means 2005 receives the analog video information output from the image/voice input means 2001, and outputs digital video information comprising discrete digital data. The digital video information is composed of plural pieces of still picture information, each showing a still picture per unit time. The video coding means 2006 receives the digital video information output from the video capture means 2005, compresses the information by coding it excluding redundant information, and outputs coded video information. In this example, coded video information is output in units of individual still pictures. Since a still picture exists every 1/30 sec, the unit time for outputting the coded video data is 1/30 sec. The coded video storage means 2007 adds a video sequence header to the coded video information output from the video coding means 2006, and outputs it to the file management means 2008. The file management means 2008 writes the input audio and video information to a file in a storage unit.
FIG. 27 shows video/audio coded and multiplexed information obtained by the conventional video/audio coding and multiplexing apparatus shown in FIG. 26. As shown in FIG. 27, for the reproduction process performed later, video information and audio information are arranged alternately by quantities that make the video reproduction time and the audio reproduction time equal to each other, and the reproduction time is not longer than 0.1 sec.
FIG. 28 is a diagram for explaining the operation of the video/audio coding and multiplexing apparatus shown in FIG. 26, with respect to the flow of data.
In FIG. 28, the same reference numerals as those shown in FIG. 26 designate the same or corresponding parts, and a video camera is employed as the image/voice input means 2001 shown in FIG. 26.
First of all, the video camera 2001 captures image and voice, and outputs analog audio information and analog video information, separately.
The audio capture means 2002 receives the analog audio information output from the video camera 2001 and outputs digital audio information. On the other hand, the video capture means 2005 receives the analog video information output from the video camera 2001 and outputs digital video information.
The video coding means 2006 receives the digital video information output from the video capture means 2005, compresses the video information by coding, and outputs coded video information corresponding to a single still picture. The coded video storage means 2007 adds a video sequence header to the head of the coded video information. The file management means 2008 stores the coded video information with the sequence header in a file in a storage unit.
The conventional video/audio coding and multiplexing apparatus repeats, three times, the process steps from coding by the video coding means 2006 to storage in the file by the file management means 2008. Since the coded video information is output every 1/30 sec, when the process steps have been repeated three times, the elapsed time is 0.1 sec (3 x 1/30 sec = 1/10 sec).
The audio coding means 2003 receives the digital audio information output from the audio capture means 2002, compresses the audio information by coding, and outputs coded audio information by a quantity equivalent to 0.1 sec. The coded audio storage means 2004 adds an audio sequence header to the head of the coded audio information. The file management means 2008 stores the coded audio information with the sequence header in the file in the storage unit.
As long as the image/voice input through the video camera 2001 continues, the above-mentioned process steps are repeated appropriately, whereby coded video information and coded audio information as shown in FIG. 27, each being equivalent to 0.1 sec, are output as video/audio coded and multiplexed information, and stored in the storage unit.
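The storage order that results from the process steps above can be sketched as follows. This is only an illustration under the stated timings (one coded picture every 1/30 sec, one coded audio block every 1/10 sec); the function name conventional_mux and the record layout are assumptions made for the example, and headers are represented by a simple label rather than actual header data.

```python
from fractions import Fraction

VIDEO_UNIT = Fraction(1, 30)  # one still picture per 1/30 sec
AUDIO_UNIT = Fraction(1, 10)  # one audio block per 1/10 sec


def conventional_mux(total_sec):
    """Return (label, reproduction_time) records in the order they are stored."""
    records = []
    elapsed = Fraction(0)
    while elapsed < total_sec:
        # Three coded pictures, each stored with a video sequence header.
        for _ in range(3):
            records.append(("video", VIDEO_UNIT))
        # One coded audio block, stored with an audio sequence header.
        records.append(("audio", AUDIO_UNIT))
        elapsed += AUDIO_UNIT
    return records


records = conventional_mux(Fraction(1, 5))
video_time = sum(d for kind, d in records if kind == "video")
audio_time = sum(d for kind, d in records if kind == "audio")
```

For 0.2 sec of input, the sketch yields eight records in the order video, video, video, audio, repeated twice, and the total video and audio reproduction times are both 1/5 sec.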
As described above, in the conventional video/audio coding and multiplexing apparatus, when video information and audio information are coded, the video coding means 2006 and the audio coding means 2003 operate independently, and output coded video information and coded audio information at constant timings, respectively. More specifically, the video coding means 2006 outputs coded video information every 1/30 sec, and the audio coding means 2003 outputs coded audio information every 1/10 sec.
Therefore, video/audio coded and multiplexed information in which video information and audio information are alternately arranged in the same units of reproduction time (1/10 sec) is obtained by successively storing the output coded information in the file, without using special means for video/audio synchronization. To realize this, the conventional apparatus requires independently operating hardware dedicated to each of the audio coding means and the video coding means.
Hence, it is considerably difficult for the conventional apparatus to implement the video and audio coding means as software programs operating on a multitask operating system using a general purpose CPU, without using hardware.
The reason is as follows. On a multitask operating system, when plural programs (tasks) are operating in parallel, the timing for executing each program is decided by scheduling performed by the operating system or by an interrupt generated by a device driver, so that equal assignment of CPU time in accordance with the requests from the coding means is not guaranteed. Therefore, the above-mentioned constant operation cannot always be expected.
For example, for a device driver constituting the audio capture means and the video capture means, when an interrupt generated by the audio capture means has priority over an interrupt generated by the video capture means, there is a possibility that audio coding might be executed prior to video coding during a certain period of time.
FIG. 29 is a diagram for explaining this problem, and shows an example of video/audio coded and multiplexed information obtained as a result of video/audio processing when the conventional apparatus is implemented by the above-mentioned system. In FIG. 29, the ratio of the video reproduction time to the audio reproduction time is shown on the assumption that the video bit rate (quantity per unit time) is always equal to the audio bit rate. In section A, audio coding is carried out prior to video coding, so that the reproduction time of audio information is longer than the reproduction time of video information. In section B, since video coding is carried out so as to make up for the process in section A, the reproduction time of video information is longer than the reproduction time of audio information. In the multiplexed information shown in FIG. 29, since the video/audio multiplexing is unbalanced as a whole, a reproduction apparatus having a sufficiently large buffer for both of video and audio information must be used, otherwise the processing will be complicated, resulting in unwanted phenomena such as video or audio interruption.
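One way to quantify the imbalance described for FIG. 29 is to track how far the accumulated audio reproduction time runs ahead of, or behind, the accumulated video reproduction time. The following sketch is an assumption made for illustration, not a measure defined in the prior art: the name max_skew and the example durations are hypothetical, and the skew is treated only as a rough indication of how much buffering a reproduction apparatus would need.

```python
from fractions import Fraction


def max_skew(stream):
    """Return the largest gap between accumulated audio and video time."""
    video = audio = Fraction(0)
    worst = Fraction(0)
    for kind, duration in stream:
        if kind == "video":
            video += duration
        else:
            audio += duration
        worst = max(worst, abs(audio - video))
    return worst


tenth = Fraction(1, 10)
# Balanced interleaving in 0.1 sec units.
balanced = [("video", tenth), ("audio", tenth)] * 2
# Section A favors audio, section B makes up for it with video (hypothetical values).
unbalanced = [("audio", 3 * tenth), ("video", tenth),
              ("video", 3 * tenth), ("audio", tenth)]
```

In this example the balanced stream never drifts by more than 0.1 sec, whereas the unbalanced stream drifts by 0.3 sec even though both contain the same total amount of video and audio.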
Further, in the multitask operating system, since various kinds of programs reside besides the video and audio coding means, even though assignment to the video and audio coding means is performed equally, a problem still remains. For example, it is assumed that, when the audio coding means is expected to output 1/10 sec coded audio information constantly, a program other than the coding means consumes the CPU time over a long period, and 1 sec has elapsed from the previous audio coding when the CPU time is given to the audio coding means. In this case, even though 1 sec of digital audio information is buffered to prevent audio interruption, this audio information is output as a block of coded audio information for 1 sec. When this is processed by the conventional apparatus, in obtained video/audio coded and multiplexed information, 1 sec of audio information is inserted whereas video information and audio information must be alternately arranged in time units of 0.1 sec. When the capacity of the internal buffer in the reproduction apparatus is only 0.1 sec, the 1 sec of audio information results in a fatal event that image and voice are interrupted due to overflow of information from the buffer.
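The effect of a delayed audio coding task can be sketched as follows. The sketch assumes, as in the example above, that the audio coding means codes everything buffered since its previous execution as a single block; the names audio_blocks and overflowing, and the list of execution times, are hypothetical.

```python
from fractions import Fraction


def audio_blocks(run_times):
    """Return the reproduction time of the block coded at each execution."""
    blocks = []
    previous = Fraction(0)
    for t in run_times:
        blocks.append(t - previous)  # all audio buffered since the last run
        previous = t
    return blocks


def overflowing(blocks, buffer_sec):
    """Return the blocks that exceed the reproduction-side buffer."""
    return [b for b in blocks if b > buffer_sec]


# The task runs at 0.1 and 0.2 sec, is then starved for 1 sec by another
# program, and resumes at 1.2 and 1.3 sec.
run_times = [Fraction(1, 10), Fraction(2, 10), Fraction(12, 10), Fraction(13, 10)]
blocks = audio_blocks(run_times)
too_large = overflowing(blocks, Fraction(1, 10))
```

Here the third execution produces a single 1 sec block, which is the only block that does not fit a buffer equivalent to 0.1 sec.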
FIG. 30 is a diagram for explaining the problem in more detail, and shows an example of video/audio coded and multiplexed information obtained as a result of video/audio processing when the conventional apparatus is implemented by the above-mentioned system. In section A, coded audio information and coded video information are processed at sufficiently short time intervals. However, in section B, since another program, i.e., a process other than video/audio coding, has been executed at the beginning of this section, both the video information and the audio information are increased in time, leading to the above-mentioned problem.
Even when the video and audio coding means are the only programs in the multitask operating system, a problem might occur. Generally, when the CPU operation switches to another program (task), work for task switching is required. The ratio of this work time to the whole increases with an increase in the frequency of task switching, resulting in degradation of overall performance. Therefore, in the conventional structure, to obtain video/audio coded and multiplexed information in which video information and audio information are alternately arranged in short time units, frequent task switching between the audio coding means and the video coding means is indispensable, resulting in degradation of performance as mentioned above. When the performance is degraded, there is a possibility that the coding process may not be completed by the next task switching time, and image and voice may be interrupted.
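The relation between switching frequency and overhead can be made concrete with a small calculation. The sketch assumes a fixed cost per task switch; the function name switch_overhead_ratio and the figure of 1 msec per switch are hypothetical values chosen only for illustration.

```python
def switch_overhead_ratio(slice_sec, switch_cost_sec):
    """Fraction of total CPU time spent on task switching.

    slice_sec: useful work performed between two task switches.
    switch_cost_sec: time consumed by one task switch (assumed constant).
    """
    return switch_cost_sec / (slice_sec + switch_cost_sec)


SWITCH_COST = 0.001  # hypothetical: 1 msec per task switch

long_slices = switch_overhead_ratio(0.099, SWITCH_COST)   # a switch every 0.1 sec
short_slices = switch_overhead_ratio(0.009, SWITCH_COST)  # a switch every 0.01 sec
```

Under these assumptions, switching every 0.1 sec spends about 1% of the time on task switching, while switching every 0.01 sec spends about 10%.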