This invention relates generally to the field of digital audio, video, and multimedia. More specifically, this invention relates to inserting audio or video effects into a digital audio or video stream.
Digitally-formatted audio (“digital audio”) is becoming more popular because of its high quality, its use with computers and compact audio players, its ease of manipulation and duplication, and its ability to be shared by many people. Digital audio formats include uncompressed formats such as Audio Compact Disc (16-bit/44.1 kHz pulse-code modulation (PCM)) and Wave (file name extension .wav), and compressed formats such as MPEG (Moving Picture Experts Group) audio layers 1, 2, and 3 (MP3), MPEG-4, DTS® (Digital Theater Systems), and Dolby® Digital.
For many of the same reasons, movies and other video broadcasts are also increasingly being transmitted digitally. Thus, a digitally-formatted movie that is shown in a cinema or delivered to a user's television set from a broadcaster (such as a cable broadcaster) will include a digital video stream for the pictures and a digital audio stream for the soundtrack. One digital audio format used in movies is AC-3, which encodes multichannel audio. (Dolby® Digital audio is in Dolby® AC-3 format.) AC-3 is a compressed format (using perceptual coding) and can be broadcast in two-channel stereo, “5.1”-channel, or “7.1”-channel formats. The latter two formats are used in surround sound (e.g., Dolby® Surround Digital or Dolby® Surround AC-3). The 5.1-channel format includes left front, right front, and center front channels, left and right surround sound channels, and a low frequency effects channel (the “0.1”) having one-tenth the bandwidth of the other channels. The 7.1-channel format is analogous to the 5.1-channel format, but includes two more main channels.
As television becomes more advanced, other applications can be integrated into the viewing experience. Innovations such as WebTV®, online shopping, electronic program guides, and TiVo® (personal video recorders) allow the television to be used for more than just watching TV. While a movie or song is being broadcast, it may be desirable to send an audible signal to the viewer or listener. One example of such a signal could be an alert from an application such as the America Online (“AOL”) client that the viewer, who is watching a movie on a home television, has received a new e-mail message. Other examples include a feedback sound (as part of the user interface) that alerts the user to the acceptance of a request (such as one made with a remote control), or other sound effects or sound bites that may be used to signal the viewer or listener.
Conventionally, during an audio (analog or digital) broadcast, such a sound effect could be delivered to the user directly, by mixing the sound effect with the soundtrack (primary stream). However, such mixing has several drawbacks. First, mixing of an analog or uncompressed digital sound effect requires mixing components, and mixing of a compressed digital sound effect destined for subsequent decoding requires decompressing (decoding) both the sound effect and the primary stream, mixing the uncompressed signals together, and recompressing (re-encoding) the mixed signal prior to its transmission to the target decoder. This decode, mix, and re-encode cycle can result in poor sound quality and/or a loss of synchronization between the audio and video. Second, where a set-top box (“STB”) is used to receive programming, decoding advanced digital audio formats such as AC-3 is usually left to dedicated equipment (e.g., home theater equipment) external to the set-top box. Some means is therefore required for transporting to the external equipment the digital audio data for both the primary stream and the sound effect, adding cost and complexity to the set-up.
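For illustration only, the conventional decode, mix, and re-encode approach described above can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from any cited system: the function names mix_pcm16 and remix_compressed, the effect_gain and offset parameters, and the caller-supplied decode and encode callables are all hypothetical, and samples are assumed to be signed 16-bit PCM values on a single channel.

```python
from array import array

INT16_MIN, INT16_MAX = -32768, 32767


def mix_pcm16(primary, effect, effect_gain=1.0, offset=0):
    """Mix an uncompressed effect into an uncompressed primary stream.

    Both inputs are sequences of signed 16-bit PCM samples (an assumption
    made for this sketch). The effect starts at sample index `offset` and
    is truncated if it runs past the end of the primary stream.
    """
    mixed = array("h", primary)
    for i, sample in enumerate(effect):
        j = offset + i
        if j >= len(mixed):
            break
        total = mixed[j] + int(sample * effect_gain)
        # Saturate rather than wrap around when the sum exceeds 16 bits.
        mixed[j] = max(INT16_MIN, min(INT16_MAX, total))
    return mixed


def remix_compressed(primary, effect, decode, encode):
    """Conventional path for compressed audio: decode both, mix, re-encode.

    `decode` and `encode` are hypothetical caller-supplied codec functions;
    no real codec library is assumed here.
    """
    return encode(mix_pcm16(decode(primary), decode(effect)))


if __name__ == "__main__":
    soundtrack = [1000, 2000, 30000, -30000]
    alert = [500, 500, 5000, -5000]
    print(list(mix_pcm16(soundtrack, alert)))
```

Even in this simplified form, every sample of the primary stream passes through a decode step and a re-encode step, which is where the quality loss and added delay noted above would arise with a lossy codec.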
Other methods for adding data to a data stream have been disclosed. For instance, U.S. Pat. No. 6,034,746 discloses a system, method, and computer readable medium for inserting additional data, such as commercials, into a digital audio/visual data stream. That system, however, is designed for inserting additional data having attributes different from those of the primary data stream. As such, the system is complex and requires the decoder/receiver to be reinitialized with the attributes of the primary data stream after the additional data stream is played.