The present invention relates to data streams. In particular, the present invention relates to compressing data streams and the presentation of decompressed data streams.
When information, such as audio, video, image or data, is sent by a transmitting system to a receiving system, the information is typically sent in a compressed format for reasons of efficiency and transmission speed. Frequently, one aspect of compression is the removal of silence or delay in the information. The information is also typically stored at the transmitting system and/or receiving system in a compressed format. When the information is received by the receiving system, the information is then typically decompressed for presentation to the intended user(s).
Many compression/decompression schemes exist for efficiently sending audio messages. When silence is encountered in an audio message to be transmitted, the silence is typically stripped out of the audio message, which may be compressed. Conventional audio compression/decompression schemes include Adaptive Differential Pulse Code Modulation (ADPCM), Motion Picture Expert Group (MPEG) audio compression, Global System for Mobile Communication (GSM), and G.723.
In video messages, image compression/decompression issues typically need to be addressed in order to overcome data rate and storage size issues created by full-motion video. For example, frame skipping can be used to avoid storing and transmitting repeated frames when there is no change in the video picture from one frame to the next. Conventional video compression/decompression techniques include run-length coding, Huffman coding, vector quantization, subcolor sampling, discrete cosine transform (DCT), delta frame change, and motion estimation.
If the information to be sent is first compressed, data transmission speed can be increased substantially. Many data types and files are easily compressed because of the repetitive nature of their contents. Data compression can be achieved in many different ways. One common method is to use a special shorthand notation for transmitting data. If a certain character is sent frequently, the data compression devices may send an abbreviated form of the character. Another method is to send only changes to the data. Other examples of data compression/decompression methods include V.42bis and Microcom Network Protocol 5 (MNP5), which work well for real-time compression of a stream of data. As another example, a data file such as a file for a PowerPoint presentation may be compressed first with an algorithm such as that used in Zip files.
Although single media type (e.g., only video, only data, or only audio) messages can be easily compressed and decompressed for single media presentations to users, the compression and decompression of multimedia presentations can cause some synchronization problems at the receiving end. Multimedia presentations typically contain some combination of audio, video, image, and data. Modern presentation systems, which may include voicemail messaging systems, video conferencing systems, and data presentation systems such as electronic mail (e-mail), may combine these capabilities into a multimedia presentation system. Ideally, the multiple different media data streams of a multimedia presentation are stored separately and transmitted individually with compression, thus enabling the efficient compression of each media data stream using compression techniques optimal for the particular type of media data stream. This system would also permit use of an individual media data stream separate from the other media data streams in the multimedia presentation. However, such a system would experience multimedia synchronization problems. The advantages and disadvantages of such a system are illustrated using an exemplary multimedia presentation like a chief executive officer's (CEO's) speech to shareholders that may include a video, a set of still images such as a PowerPoint presentation, and audio. If only a portion of the multimedia presentation is desired to be reviewed, then a single media stream such as an audio clip of the CEO's speech can be downloaded to a branch office for later playback to employees. For transmission of that audio clip, the messaging system performs the optimal compression (in this case, normal audio compression) and removes long gaps of silence from the message.
Later, if the branch manager decides he would prefer to receive the video version, he can separately download the video stream, which is stored in the messaging system and transmitted using the optimum video compression. Advantageously, the separately stored and transmitted media data streams of a multimedia presentation can be separately downloaded as needed without transmitting the entire multimedia presentation in order to view only one of the media data streams. However, when he plays back both video and audio together, the audio and video are no longer synchronized. Since the messages were stored and transmitted independently, this system optimizes its compression without regard to time synchronization. As gaps in the voice may have occurred while the CEO was making gestures captured on video, the two data streams when played back are no longer synchronized. If a third data stream, containing the slides presented by the CEO, were downloaded, it would also have no synchronization information. The video and audio may not be synchronized when a delay, such as a pause, occurs in the video. Further, if the set of images, such as pie graphs, is shown during the video pause and the audio is not synchronized to the set of images, then the audio may continue to discuss the first image after the first image has been replaced by a second image. These types of synchronization problems can result in serious confusion and poor performance in multimedia messaging systems.
One possible solution to the multimedia synchronization problem is to time stamp each data packet in each simultaneously transmitted single media data stream of the multimedia presentation to allow for multimedia synchronization. In particular, when the various data streams of a multimedia presentation, such as the audio and video, are transmitted at the same time, most multimedia protocols typically use some form of time stamping to mark each packet of the information such that the audio and video can be synchronized later. By matching time stamps of the packets of each single media data stream, these time-matched packets can be simultaneously presented in a synchronized multimedia presentation.
However, this approach of time stamping each data packet for each data stream of a multimedia presentation may suffer from various problems. One problem is that time stamping, especially for audio portions of a multimedia presentation, typically involves high overhead. For example, in an audio portion of a multimedia message, the portion of a stored message that consists of time stamps could easily be greater than the silence removed from the audio message, demonstrating the potentially high overhead of time stamps in some situations. As a more specific illustration of the time stamp overhead problem, it is noted that for the typical 66-byte IP voice packet, 5-10% of the stored voice message might be time stamps; whereas, the typical voice message might only have 2-3% of the total message time compressed out of it due to silence removal. Thus, the storage of time stamped messages would result in a higher overhead cost than the benefit of silence removal that can be achieved. Another problem is that time stamping each media data stream can be wasteful, especially for uncompressed data streams. For example, if a portion of an audio stream is not compressed, the time stamping of the uncompressed portion of the audio stream results in a further waste of memory resources. Further, if all the data streams are stored with time stamps but the user only desires to listen to the audio portion of the stored multimedia presentation, the user is confined by the time stamp synchronization and is forced to listen to long gaps of silence in the audio portion. Being forced to listen to this silence is thus inefficient and wastes time for the user who is not utilizing the audio with, for example, the synchronized video of the multimedia presentation. A further problem of time stamp synchronization of multimedia presentations is that the time stamps are rendered useless when a multimedia presentation, or a portion thereof, is reused to create a different multimedia presentation. 
For example, a PowerPoint and video presentation may be prepared by an assistant while a manager may give a final presentation with his own narration over the PowerPoint and video presentation. In this case, time stamp synchronization on the original presentation would not be beneficial for the final multimedia presentation.
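The overhead comparison given above for the 66-byte IP voice packet can be checked with a short calculation. The following is only an illustrative sketch: the 66-byte packet size and the 3% silence saving come from the text, while the 4-byte time stamp size and the function name `timestamp_overhead` are assumptions made for the example.

```python
def timestamp_overhead(packet_bytes: int, timestamp_bytes: int) -> float:
    """Fraction of each stored packet that is taken up by its time stamp."""
    return timestamp_bytes / packet_bytes


PACKET_BYTES = 66      # typical IP voice packet size, from the text
TIMESTAMP_BYTES = 4    # assumption: one 32-bit time stamp per packet
SILENCE_SAVING = 0.03  # upper end of the 2-3% silence removal, from the text

overhead = timestamp_overhead(PACKET_BYTES, TIMESTAMP_BYTES)
net_growth = overhead - SILENCE_SAVING  # positive means the message grew

print(f"time stamp overhead: {overhead:.1%}")            # 6.1%
print(f"net growth after silence removal: {net_growth:.1%}")  # 3.1%
```

Under these assumptions the time stamps cost roughly twice what silence removal saves, which is consistent with the 5-10% range stated above.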
Although receiving all the data streams of a multimedia presentation may provide synchronization information, always needing to receive all the streams at once may be wasteful. A receiver may already have one data stream, such as foils from a presentation, but later need the audio. It would be wasteful to download the foils again when the receiver already has them.
Thus, what is needed is an efficient system and method to preserve the timing of the original message, without adding the high overhead of time stamping each packet in the multimedia message. Without the time stamp overhead, the space required to store the multimedia messages or streams would be substantially reduced. It would also be desirable to simplify the storage of multimedia streams by storing the different streams individually and in their native form while providing timing and synchronization information for the purpose of reconstruction of the original multimedia message.
The present invention relates to a system and method for efficient and automatic synchronization in multimedia presentations. According to an embodiment of the present invention, when a data stream is compressed, delay which would normally be compressed out is replaced by a delay token which indicates a length of time of delay. When the data stream is decompressed and presented, it is optional whether to use or ignore the delay tokens. The delay tokens may be used when data streams are presented together in a multimedia presentation to synchronize the various data streams of the multimedia presentation. Otherwise, when the data streams are presented alone, the delay tokens may be ignored such that data stream delay is simply skipped since there is no need to synchronize with other data streams.
According to an embodiment of the present invention, a method for compressing a data stream is presented. The method includes compressing a data stream that is part of a multimedia presentation. The method also includes stripping a delay from the data stream, and using a delay token in place of the delay in the compressed data stream.
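As a minimal sketch of the compression method just described, the following replaces each sufficiently long run of silence in a sequence of audio frames with a single delay token. The names `DelayToken`, `Payload`, and `strip_delays`, the 20 ms frame length, and the 100 ms minimum delay are all hypothetical choices for illustration; the sketch also leaves the non-silent frames uncompressed, where a real system would apply an audio codec.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple, Union


@dataclass(frozen=True)
class DelayToken:
    """Hypothetical marker recording the length of a stripped delay."""
    duration_ms: int


@dataclass(frozen=True)
class Payload:
    """A run of retained frames (a real codec would compress these)."""
    frames: Tuple[int, ...]


Item = Union[DelayToken, Payload]


def strip_delays(frames: Sequence[int], frame_ms: int = 20,
                 threshold: int = 0, min_delay_ms: int = 100) -> List[Item]:
    """Replace each silent run of at least min_delay_ms with a DelayToken."""
    out: List[Item] = []
    i, n = 0, len(frames)
    while i < n:
        silent = abs(frames[i]) <= threshold
        j = i
        # Extend j to the end of the current run of silent or non-silent frames.
        while j < n and (abs(frames[j]) <= threshold) == silent:
            j += 1
        run = tuple(frames[i:j])
        duration_ms = len(run) * frame_ms
        if silent and duration_ms >= min_delay_ms:
            out.append(DelayToken(duration_ms))
        elif out and isinstance(out[-1], Payload):
            # Short pauses stay in the stream; merge with the previous payload.
            out[-1] = Payload(out[-1].frames + run)
        else:
            out.append(Payload(run))
        i = j
    return out


compressed = strip_delays([5, 7, 0, 0, 0, 0, 0, 0, 3, 0, 4])
print(compressed)
```

In this example the six silent frames (120 ms) become one token, while the single silent frame between 3 and 4 is too short to strip and stays in the payload.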
In another embodiment, the present invention provides a method for decompressing a data stream. The method includes decompressing a data stream, and determining if the data stream is part of a multimedia presentation. The method also includes reinserting a delay specified by a delay token if the data stream is part of the multimedia presentation.
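The decompression side can be sketched in the same spirit. Here a stream is modeled as a list whose items are either lists of frames or delay tokens; the delay is reinserted only when the stream is presented as part of a multimedia presentation, and the token is skipped otherwise. The names `DelayToken` and `present`, the 20 ms frame length, and the use of zero-valued frames for silence are assumptions for illustration, and the actual decoding of compressed payloads is omitted.

```python
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass(frozen=True)
class DelayToken:
    """Hypothetical marker recording the length of a stripped delay."""
    duration_ms: int


def present(stream: Sequence[Union[DelayToken, List[int]]],
            part_of_multimedia: bool,
            frame_ms: int = 20, silence: int = 0) -> List[int]:
    """Rebuild the frames, honoring delay tokens only for multimedia playback."""
    out: List[int] = []
    for item in stream:
        if isinstance(item, DelayToken):
            if part_of_multimedia:
                # Reinsert the delay so this stream lines up with the others.
                out.extend([silence] * (item.duration_ms // frame_ms))
            # Standalone playback: the token is ignored and the delay skipped.
        else:
            out.extend(item)
    return out


stream = [[5, 7], DelayToken(120), [3, 0, 4]]
synced = present(stream, part_of_multimedia=True)
alone = present(stream, part_of_multimedia=False)
print(synced)  # [5, 7, 0, 0, 0, 0, 0, 0, 3, 0, 4]
print(alone)   # [5, 7, 3, 0, 4]
```

The same stored stream serves both uses: the synchronized output regains its original length, while the standalone output omits the 120 ms pause.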
In yet another embodiment, the present invention provides a system for compressing a data stream. The system includes a processor configured to compress a data stream that is part of a multimedia presentation, to strip a delay from the data stream, and to use a delay token in place of the delay in the compressed data stream. The system also includes a memory coupled to the processor for storing the compressed data stream.
In yet another embodiment of the invention, a system for decompressing a data stream is presented. The system includes a processor configured to decompress a data stream, to determine if the data stream is part of a multimedia presentation, and to reinsert a delay specified by a delay token if the data stream is part of the multimedia presentation. The system also includes a user interface coupled to the processor for presenting the decompressed data stream.