1. Field of the Invention
The present invention relates to the field of multimedia electronic systems. More particularly, the present invention relates to an audio decoder unit for decoding digital multimedia bitstreams representing audio information.
2. Related Art
Audio/visual (AV) material is increasingly stored, transmitted and rendered using digital data. Digital representation of AV material facilitates its usage with computer controlled electronics and also facilitates high quality image and sound reproduction. Digital AV material is typically compressed (“encoded”) in order to reduce the computer resources required to store and transmit the digital data. Because multimedia content, especially video, is very large, the systems that transmit it encode and/or compress the content to use their transmission channel efficiently. For instance, in order to broadcast or record audio signals more efficiently, the amount of information required to represent the audio signals can be reduced. In the case of digital audio signals, the amount of digital information needed to accurately reproduce the original pulse code modulation (PCM) samples can be reduced by applying a digital compression process, such as AC3, resulting in a digitally compressed representation of the original samples.
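The size of the reduction can be illustrated with a short calculation. The following is a minimal sketch only; the sample rate, sample width, channel count and compressed bit rate are illustrative assumptions and are not figures taken from this description.

```python
def pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(pcm_bps, compressed_bps):
    """How many times smaller the compressed stream is than the PCM stream."""
    return pcm_bps / compressed_bps


# Assumed example values (hypothetical, for illustration only):
# 48 kHz sampling, 16-bit samples, 6 channels, 384 kbit/s compressed stream.
pcm_bps = pcm_bitrate_bps(48_000, 16, 6)
ratio = compression_ratio(pcm_bps, 384_000)
print(f"PCM: {pcm_bps} bit/s, compressed: 384000 bit/s, ratio: {ratio:.1f}:1")
```

Under these assumed values the uncompressed PCM stream requires 4,608,000 bits per second, so the compressed representation is twelve times smaller.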
Digital AV material can be encoded using a number of well known standards including, for example, the AC3 audio standard, the DV (Digital Video) standard, the MPEG (Moving Picture Experts Group) standard, the JPEG standard, the H.261 standard, the H.263 standard and the Motion JPEG standard, to name a few. The encoding standards also specify the associated decoding processes. The multimedia contents are typically stored on storage media and are transmitted as bitstreams which represent audio or video frames. In particular, the ATSC digital terrestrial transmission standard adopts the AC3 format for audio encoding and the MPEG2 format for video encoding.
MPEG is a family of compression standards for audio, video and graphics information and includes, for example, MPEG1, 2, 4 and 7. The standards are defined in the ISO-IEC/JTC1/SC29/WG11 documents. MPEG1 is the standard for encoding audio and video data for storage on CD-ROM devices (compact disc read only memory). The MPEG1 specification is described in the IS-11172 standard. MPEG2 is the standard (adopted for ATSC) for encoding, decoding and transmitting video data for storage media, e.g., DVD (digital video disc), etc., and also for digital broadcasts. MPEG2 supports interlaced video while MPEG1 does not. Therefore, MPEG2 is used for displaying high quality video on TV units. The MPEG2 specification is described in IS-13818. The MPEG4 standard is used for encoding, decoding and transmitting audio, video and computer graphics data. It supports content based bitstream manipulation and representation. The MPEG4 specification is described in IS-14496. MPEG7 is the standard for the meta information of multimedia (MM) contents. An example of meta data is data that describes or is related to the MM contents, such as identification and/or other descriptions of the author, producer information, directors, actors, etc. The MPEG7 standard is currently under standardization and is available in draft form. The draft specifications are described in the ISO-IEC/JTC1/SC29/WG11 documents.
One problem with using encoded digital audio information is that errors can occur between the transmission and reception of the audio data. The decoder unit can detect when a particular frame of the audio data contains an error by using well known CRC checking schemes. In the past, the frame having the error would be muted by filling in the frame with zeros. This is called a hard mute. However, the hard mute, when played back, causes a very audible “pop” sound which is neither pleasing to the ear nor natural sounding. Therefore, an attenuation function or “window” was applied to the error frame to soften the mute. However, even soft mutes can have a “pop” associated therewith, depending on the window function applied. Also, hard and soft mutes still have a duration of silence associated therewith that can be distinguished by the ear. Therefore, when many error frames are detected in the same bitstream neighborhood, the intermittent durations of silence (mute) followed by sound (unmute) and silence again (mute) can be very unappealing and annoying to the ear and can also damage speaker systems.
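The difference between a hard mute and a soft mute can be sketched as follows. This is a minimal illustration and not the method of any particular decoder: the raised-cosine window shape, the ramp length of 256 samples, the frame length of 1536 samples and the constant-level test signal are all assumptions chosen for the example, and the function names are hypothetical.

```python
import math


def hard_mute(frame):
    """Fill the error frame with zeros (hard mute)."""
    return [0.0] * len(frame)


def soft_mute(frame, ramp):
    """Apply an attenuation window to the error frame (soft mute).

    Assumed window: gain falls from 1 to 0 over the first `ramp` samples
    (raised cosine), stays at 0, then rises back to 1 over the last `ramp`
    samples. Requires 2 * ramp <= len(frame).
    """
    n = len(frame)
    out = []
    for i, sample in enumerate(frame):
        if i < ramp:
            gain = 0.5 * (1.0 + math.cos(math.pi * i / ramp))
        elif i >= n - ramp:
            gain = 0.5 * (1.0 - math.cos(math.pi * (i - (n - ramp) + 1) / ramp))
        else:
            gain = 0.0
        out.append(sample * gain)
    return out


def max_jump(samples):
    """Largest sample-to-sample step; a large step is heard as a 'pop'."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


# Three frames of a constant-level signal; the middle frame has the error.
FRAME_LEN = 1536
good = [1.0] * FRAME_LEN
hard = good + hard_mute(good) + good
soft = good + soft_mute(good, ramp=256) + good

print("hard mute max step:", max_jump(hard))
print("soft mute max step:", max_jump(soft))
```

In this sketch the hard mute produces a full-scale step at each frame boundary, while the windowed mute spreads the transition over the ramp. Both outputs still contain a run of zero samples in the middle of the frame, which corresponds to the audible duration of silence described above.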
Another problem with using encoded digital audio information involves muting commands and audio signal synchronization. For instance, if a user watching a program on a digital TV changes the current channel, the currently played AV information should stop incident to the channel change, e.g., the audio should mute and the video should freeze, and then the channel should change. However, in conventional systems, audio artifacts can result because the audio may not mute fast enough, for the reasons described below. First, in many encoding schemes the video signal is used as the master and the audio signal is the slave. Also, in many encoding standards the amount of playback time in an audio frame may not be exactly the same as in a video frame. Therefore, the audio frames and video frames are not exactly synchronized in the decoding and playback processes. Secondly, the channel change operation takes some time to complete because the AV system needs to parse the bitstream from the new channel and feed the data to the corresponding audio and video decoders. This results in a situation where the audio and video are slightly delayed during decoding and playback. When the decoder receives a mute command, it is able to freeze the video frame immediately, because the video signal is the master. However, many decoded audio frames may be stored in the output buffer, resulting in some audio playback after the video freeze. This is very noticeable to the ear and confusing because the audio that is played back belongs to video frames that are not being displayed.
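The frame duration mismatch and the residual audio after a mute command can be illustrated numerically. The following is a rough sketch under stated assumptions: an AC3 frame carries 1536 samples per channel, while the 48 kHz sample rate, the 30000/1001 video frame rate and the output buffer depth of four decoded frames are hypothetical example values.

```python
AC3_SAMPLES_PER_FRAME = 1536  # samples per channel in one AC3 frame


def audio_frame_ms(sample_rate_hz):
    """Playback time of one audio frame in milliseconds."""
    return 1000.0 * AC3_SAMPLES_PER_FRAME / sample_rate_hz


def video_frame_ms(frames_per_second):
    """Display time of one video frame in milliseconds."""
    return 1000.0 / frames_per_second


def residual_audio_ms(buffered_frames, sample_rate_hz):
    """Audio still played after a mute command if the output buffer
    already holds `buffered_frames` decoded frames and is not flushed."""
    return buffered_frames * audio_frame_ms(sample_rate_hz)


# Assumed example values (hypothetical): 48 kHz audio, 30000/1001 fps video,
# four decoded audio frames waiting in the output buffer.
a_ms = audio_frame_ms(48_000)
v_ms = video_frame_ms(30_000 / 1001)
print(f"audio frame: {a_ms:.3f} ms, video frame: {v_ms:.3f} ms")
print(f"audio after video freeze: {residual_audio_ms(4, 48_000):.1f} ms")
```

With these example values an audio frame lasts 32 ms and a video frame about 33.37 ms, so frame boundaries drift relative to one another, and four buffered frames would continue to play for 128 ms after the video has frozen.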