Multimedia streaming refers to the continuous delivery of synchronized media data such as video, audio, text, and animation. The term “streaming” indicates that the data representing the various media types are provided over a network to a client computer on a real-time, as-needed basis, rather than being delivered in their entirety before playback. Thus, the client computer renders streaming data as they are received from a network server, rather than waiting for an entire “file” to be delivered.
There has been growing interest in the transmission of audio information, such as broadband multimedia audio, over packet data networks. In this technique, analog audio data are converted into digital data, and the digital data are encapsulated into packets suitable for transmission over a packet network, for example the Internet. At the receiving end, the audio data are extracted from the packets and presented to an output media device.
With the ever-increasing demand for transmission of vivid multimedia, streaming audio has become one of the important applications in emerging 3G mobile networks and the Internet. A significant impediment to reliable transmission of multimedia over packet networks is packet loss. Packets may be lost for a variety of reasons. For example, congestion of routers and gateways may lead to a packet being discarded; delays in packet transmission may cause a packet to arrive too late at the receiver to be played back in real time; or heavy loading of the workstations may result in scheduling difficulties in real-time multitasking operating systems. Moreover, impairments of communication channels, such as noise, fading and network congestion, may give rise to packet loss during transmission, causing audio quality degradation. Since it is impractical to request retransmission of lost packets in real-time streaming applications, various methods have been proposed to reconstruct the lost packets at the receiver.
These methods include Silence Substitution, Packet Repetition, Pitch Waveform Replication, and Time Scale Modification. In Silence Substitution, lost packets are simply muted. In Packet Repetition, the previous packet is used in place of the lost packet. These first two methods are primitive and cause very undesirable quality degradation, especially when the audio packet size is large. The Pitch Waveform Replication method employs a Pitch Detection Algorithm on either side of a lost packet to find a suitable signal to cover the loss. This method is found to work better than the first two; however, it is not applicable to wideband audio, where it is difficult or impossible to identify a single pitch.
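The two simplest methods above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each packet is a list of PCM samples and that a lost packet is marked as `None`; the function names `conceal_silence` and `conceal_repeat` are hypothetical and do not come from any particular codec or library.

```python
from typing import List, Optional

# Assumed representation: one packet is a list of PCM samples,
# and a lost packet is represented by None.
Packet = List[float]


def conceal_silence(packets: List[Optional[Packet]], packet_len: int) -> List[Packet]:
    """Silence Substitution: replace every lost packet with zeros (mute)."""
    return [p if p is not None else [0.0] * packet_len for p in packets]


def conceal_repeat(packets: List[Optional[Packet]], packet_len: int) -> List[Packet]:
    """Packet Repetition: replace every lost packet with the last good packet."""
    out: List[Packet] = []
    last_good: Packet = [0.0] * packet_len  # nothing received yet: fall back to silence
    for p in packets:
        if p is None:
            out.append(list(last_good))  # copy so later edits do not alias
        else:
            out.append(p)
            last_good = p
    return out


if __name__ == "__main__":
    stream = [[0.1, 0.2], None, [0.3, 0.4]]
    print(conceal_silence(stream, 2))
    print(conceal_repeat(stream, 2))
```

Both functions leave a discontinuity at the packet boundaries, which is one reason the degradation grows with packet size.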
Time-scale modification (TSM) includes time-scale compression, which speeds up the playback rate of the signal, and time-scale expansion, which slows down the playback rate of the signal. For packet loss concealment, TSM stretches the signal on one or both sides of the lost packet so that the stretched signal covers the gap. One of the important steps in TSM is to find, using correlation, the best-matched segments for the overlap-and-add operation. The existing lost packet concealment technique employing Time Scale Modification uses the same segment matching parameters for the entire frequency band. These parameters are not accurate when applied to wideband signals, giving rise to more severe quality degradation in the low frequency band.
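The segment matching and overlap-and-add steps described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: normalized cross-correlation is assumed as the similarity measure, a linear cross-fade is assumed as the overlap window, and the names `normalized_correlation`, `best_match_offset`, and `overlap_add` are hypothetical. It is not the specific matching procedure of any existing concealment technique.

```python
import math
from typing import List, Sequence


def normalized_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized cross-correlation of two equal-length segments, in [-1, 1]."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0


def best_match_offset(signal: Sequence[float], template: Sequence[float],
                      start: int, search_range: int) -> int:
    """Return the offset in [start, start + search_range] whose segment
    is most similar to the template (assumed similarity: normalized correlation)."""
    n = len(template)
    best_offset, best_score = start, float("-inf")
    for offset in range(start, start + search_range + 1):
        if offset + n > len(signal):
            break  # candidate segment would run past the end of the signal
        score = normalized_correlation(signal[offset:offset + n], template)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


def overlap_add(tail: Sequence[float], head: Sequence[float]) -> List[float]:
    """Cross-fade from `tail` to `head` with a linear window (assumed window shape)."""
    n = len(tail)
    if n != len(head):
        raise ValueError("segments must have equal length")
    out = []
    for i in range(n):
        w = (i + 0.5) / n  # fade-in weight for `head`; fade-out weight is 1 - w
        out.append((1.0 - w) * tail[i] + w * head[i])
    return out


if __name__ == "__main__":
    # A periodic test tone: the best match for a segment lies one period later.
    tone = [math.sin(2.0 * math.pi * i / 20.0) for i in range(60)]
    template = tone[5:15]
    k = best_match_offset(tone, template, start=20, search_range=15)
    print("best offset:", k)
    print(overlap_add(template, tone[k:k + len(template)]))
```

In a wideband signal, a single segment length and search range of this kind must serve all frequencies at once, which is the limitation noted above for matching parameters shared across the entire band.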
In general, these existing methods are better suited to speech communications, where the packet size is small and the bandwidth is narrow. When applied to high quality audio transmission, they normally fail to provide satisfactory results, as the packet size is larger and the frequency characteristics are more complicated.
Therefore, there is an imperative need to have a system and method for lost packet concealment so as to improve the quality of multimedia audio signals in high quality audio streaming applications. This invention satisfies this need by disclosing a Waveform Similarity Overlap-Add (WSOLA) based packet loss concealment method and system for broadband multimedia audio streaming applications. Other advantages of this invention will be apparent with reference to the detailed description.