The transmission of audio signals in compressed digital packet formats, such as MP3, has revolutionized the process of music distribution. Recent developments in this field have made possible the reception of streaming digital audio on, for example, handheld network communication devices. However, with the increase in network traffic, audio packets are often lost because of either congestion or excessive delay in the packet network, such as may occur in a best-effort IP network.
Under severe conditions, for example, errors resulting from burst packet loss may occur which are beyond the capability of a conventional channel-coding correction method, particularly in wireless networks such as GSM, WCDMA or BLUETOOTH. Under such conditions, sound quality may be improved by the application of an error-concealment algorithm. Error concealment is an important process used to improve the quality of service (QoS) when a compressed audio bitstream is transmitted over an error-prone channel, such as those found in mobile network communications and in digital audio broadcasts.
Perceptual audio codecs, such as MPEG-1 Layer III Audio Coding (MP3), as specified in the International Standard ISO/IEC 11172-3 entitled “Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 3: Audio,” and MPEG-2 Advanced Audio Coding (AAC), use frame-wise compression of audio signals, the resulting compressed bitstream then being transmitted over the audio packet network. With the rapid deployment of audio compression technologies, more and more audio content is stored and transmitted in compressed formats.
A critical feature of an error concealment method is the detection of beats (i.e., short transient signals) so that replacement information can be provided for missing data. Beat detection or tracking is an important initial step in computer processing of music and is useful in various multimedia applications, such as automatic classification of music, content-based retrieval, and audio track analysis in video. Systems for beat detection or tracking can be classified according to the input data type, that is, systems for musical score information such as MIDI signals, and systems for real-time applications.
Beat detection, as used herein, refers to the detection of physical beats, that is, acoustic features or other signal transients exhibiting a higher level of energy, or peak, in comparison to the adjacent audio stream. Thus, a ‘beat’ would include a drum beat, but would not include a perceptual musical beat, perhaps recognizable by a human listener, but which produces little or no sound.
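The notion of a physical beat as an energy peak relative to the adjacent audio stream can be illustrated with a minimal sketch. The sketch operates on decoded PCM samples purely for clarity and is not a compressed-domain method; the function name `detect_energy_peaks` and the parameters `frame_len`, `ratio`, and `context` are hypothetical choices made for illustration only.

```python
def detect_energy_peaks(samples, frame_len=1024, ratio=2.0, context=4):
    """Return indices of frames whose energy stands out from nearby frames.

    A frame is flagged when its energy exceeds `ratio` times the mean energy
    of up to `context` frames on either side (all values are illustrative).
    """
    n_frames = len(samples) // frame_len
    # Short-time energy: sum of squared samples in each non-overlapping frame.
    energies = [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]
    beats = []
    for i, energy in enumerate(energies):
        lo = max(0, i - context)
        hi = min(n_frames, i + context + 1)
        # Neighbouring frames, excluding the frame under test.
        neighbours = energies[lo:i] + energies[i + 1:hi]
        if not neighbours:
            continue
        local_mean = sum(neighbours) / len(neighbours)
        if energy > 0 and energy > ratio * local_mean:
            beats.append(i)
    return beats
```

Under these assumptions, a drum hit that raises the energy of one frame well above its surroundings is reported, whereas a perceptual beat that produces little or no sound leaves the frame energies unchanged and is not.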
However, most conventional beat detection or tracking systems function in a pulse-code modulated (PCM) domain. They are computationally intensive and not suitable for use with compressed domain bitstreams such as an MP3 bitstream, which has gained popularity not only in the Internet world, but also in consumer products. A compressed domain application may, for example, perform a real-time task involving beat-pattern based error concealment for streaming music over error-prone channels having burst packet losses.
The wireless channel is another source of error that can also lead to packet loss. Error concealment is usually a receiver-based error recovery method, which serves as the last resort to mitigate the degradation of audio quality when data packets are lost in audio streaming over error-prone channels such as the mobile Internet.
As can be appreciated by one skilled in the relevant art, streaming uncompressed audio over a wireless channel is an uneconomical use of a scarce resource. A compressed audio bitstream, however, is more sensitive to channel errors than an uncompressed bitstream, because most of the signal redundancy and irrelevance has been removed.
Conventional error concealment schemes employ concealment methods oriented to small segments (typically around 20 ms), including muting, packet repetition, interpolation, time-scale modification, and regeneration-based schemes. However, a fundamental limitation of packet repetition and other existing error concealment schemes is that they all operate on the assumption that audio signals are short-term stationary. Thus, if the lost or distorted portion of the audio signal includes a short transient signal, such as a drumbeat, the conventional methods will not be able to produce satisfactory results.
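Three of the conventional schemes named above (muting, packet repetition, and interpolation) can be sketched as follows. This is a simplified illustration rather than any particular decoder's implementation: frames are modelled as lists of sample values, a lost frame is represented by `None`, interpolation is reduced to a sample-wise average of the neighbouring frames, and the name `conceal` is hypothetical.

```python
def conceal(frames, method="repeat"):
    """Replace lost frames (None) using a simple conventional scheme.

    method: "mute", "repeat", or "interpolate" (illustrative names).
    """
    if method not in ("mute", "repeat", "interpolate"):
        raise ValueError("unknown concealment method: " + method)
    out = []
    for i, frame in enumerate(frames):
        if frame is not None:
            out.append(list(frame))
            continue
        # Nearest usable neighbours: the last output frame and the next
        # correctly received frame, if any.
        prev = out[-1] if out else None
        nxt = next((f for f in frames[i + 1:] if f is not None), None)
        size = len(prev or nxt or [])
        if method == "mute" or (prev is None and nxt is None):
            out.append([0.0] * size)  # substitute silence
        elif method == "repeat":
            out.append(list(prev if prev is not None else nxt))
        else:
            # Crude interpolation: average the two neighbouring frames.
            a = prev if prev is not None else nxt
            b = nxt if nxt is not None else prev
            out.append([(x + y) / 2.0 for x, y in zip(a, b)])
    return out
```

Each branch builds the replacement only from neighbouring frames, which is the short-term stationarity assumption in practice: if the lost frame carried a drumbeat that its neighbours do not contain, none of the three branches can restore it.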
What is needed is an audio data decoding and error concealment system and method that operates in the compressed domain and provides high accuracy with relatively low complexity at the receiver end.