1. Technical Field
The invention is related to receipt and playback of packet-based audio signals, and in particular, to a system and method for providing improved packet loss concealment for overlapped transform encoded signals broadcast across a packet-based network or communications channel.
2. Related Art
Conventional packet communication systems, such as the Internet or other broadcast networks, are typically lossy. In other words, not every transmitted packet can be guaranteed to be delivered error free, on time, or even in the correct sequence. Further, any delay in delivery time is usually variable. If the receiver can wait for packets to be retransmitted, correctly ordered, or corrected using some type of error correction scheme, then the fact that such networks are inherently lossy and delay prone is not an issue. However, for near real-time applications, such as, for example, voice-based communications systems across packet-based networks, the receiver cannot wait for packets to be retransmitted, correctly ordered, or corrected without causing undue, and noticeable, lag or delay in the communication.
Many conventional schemes address minor delays in packet delivery time by simply providing a temporary buffer of received packets in combination with a delayed playback of the received packets. Such schemes are often referred to as “jitter control” schemes. In general, most such schemes address delay in packet receipt by using a “jitter buffer” or the like which temporarily stores incoming packets or signal frames and provides them to a decoder with sufficient delay that one or more subsequent packets should have already been received. In other words, the jitter buffer simply keeps one or more packets in a buffer for delaying playback of the incoming signal for a period long enough to ensure that a majority of packets are actually received before they need to be played.
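By way of illustration only, the playout decision made by such a jitter buffer can be sketched as follows. The sketch assumes that packet number n is sent at time n times the frame duration, that sender and receiver share a clock, and that playout of each packet is scheduled a fixed buffer delay after its send time. The function name, the parameter names, and the 20 ms and 60 ms values are hypothetical and are not taken from any particular system.

```python
def classify_packets(arrivals_ms, frame_ms=20, buffer_ms=60):
    """Label each packet as 'played', 'late_loss', or 'lost'.

    arrivals_ms maps a sequence number to its arrival time in
    milliseconds, or to None if the packet never arrived.
    Assumption: packet `seq` is sent at seq * frame_ms and is due
    for playout buffer_ms after that send time.
    """
    status = {}
    for seq, arrival in sorted(arrivals_ms.items()):
        deadline = seq * frame_ms + buffer_ms  # scheduled playout time
        if arrival is None:
            status[seq] = "lost"        # never received
        elif arrival <= deadline:
            status[seq] = "played"      # waiting in the buffer in time
        else:
            status[seq] = "late_loss"   # received, but too late to play
    return status


if __name__ == "__main__":
    # Packet 1 arrives after its deadline; packet 3 never arrives.
    print(classify_packets({0: 30, 1: 95, 2: 70, 3: None}))
```

In this sketch, increasing `buffer_ms` converts late losses into played packets at the cost of additional playback delay.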
A sufficient increase in the length of the buffer allows virtually all packets to be received before they need to be played back. In fact, if the length of the jitter buffer is at least as large as the difference between the smallest and largest possible packet delays, then all packets could be played without any apparent gap or delay between packets. Unfortunately, as the length of the buffer increases, playback of the signal increasingly lags real-time. In a one-way audio signal, such as a music broadcast, for example, this is typically not a problem. However, in systems such as real-time or two-way conversations, temporal lag resulting from the use of such buffers becomes increasingly apparent, and undesirable, as the buffer length increases.
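The relationship between buffer length and delay spread can be checked with a short worked example. The sketch below assumes that playout of each packet is scheduled at its send time plus the smallest observed network delay plus the buffer length, so that a packet is late only when its own delay exceeds that sum. The function names and the delay values are hypothetical.

```python
def min_jitter_buffer_ms(delays_ms):
    """Smallest buffer length that yields no late losses.

    Equals the spread between the largest and smallest packet delay.
    """
    return max(delays_ms) - min(delays_ms)


def late_count(delays_ms, buffer_ms):
    """Count packets that miss playout for a given buffer length.

    Assumption: playout is scheduled at send time + smallest delay
    + buffer_ms, so a packet is late when its delay exceeds that.
    """
    base = min(delays_ms)
    return sum(1 for d in delays_ms if d > base + buffer_ms)


if __name__ == "__main__":
    delays = [40, 55, 120, 45]           # one-way delays in milliseconds
    needed = min_jitter_buffer_ms(delays)
    print(needed, late_count(delays, needed), late_count(delays, 60))
```

With these example delays, a buffer equal to the delay spread plays every packet, while a shorter 60 ms buffer turns the 120 ms packet into a late loss.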
In addition, the basic idea of using a buffer has been improved in many modern communications systems by using compression and stretching techniques for providing temporal adjustment of the playback duration of signal frames. As a result, the jitter buffer length can be adapted during speech utterances by stretching or compressing the currently playing audio signal, as necessary, for reducing the average delay without incurring as many late losses. Unfortunately, the use of temporal stretching and compression techniques for frames in an audio signal often results in audible artifacts which may be objectionable to the human listener.
Consequently, an additional conventional technique, commonly referred to as “packet loss concealment,” has been used to further improve the perceived speech quality in the presence of lost or overly delayed packets. As noted above, packet loss may occur when overly delayed packets are not received in time for playback. Typically, such overly delayed packets are referred to as “late loss” packets. Similarly, packet loss may also occur simply because the packet was never received. Either way, conventional packet loss concealment schemes typically address overly delayed and lost packets in the same manner by using some sort of packet loss concealment technique. In general, packet loss concealment techniques operate to conceal or hide the fact that a packet that should be played has not been received. In addition, packet loss concealment techniques are frequently used in combination with the aforementioned jitter control techniques.
In general, with packet loss concealment techniques, when a packet does not arrive by the scheduled time, it is declared to be a late loss, and error concealment is then used to hide that loss. Most modern schemes use some form of stretching and compression in combination with a windowing technique for merging boundaries of packets bordering missing packets declared to be late loss packets. In general, such schemes typically operate by decomposing input packets into overlapping segments of equal length. These overlapping segments are then realigned and superimposed via a conventional correlation process along with smoothing of the overlap regions to form an output segment having a degree of overlap which results in the desired output length. The result is that the composite segment is useful for hiding or concealing perceived packet delay or loss. Unfortunately, in the case of overlapped transform coders, the composite signal segments generated by conventional packet loss concealment techniques fail to fully exploit the partial information available from partially received neighboring samples (i.e., packets on either or both sides of a lost data packet).
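The realign-and-superimpose operation described above can be sketched in a generic form. The following is a minimal illustration only, not the method of any particular conventional scheme: it assumes a normalized cross-correlation search over a small range of lags and a linear cross-fade over the overlap region. All names and parameters (`best_lag`, `overlap_add`, `overlap`, `max_lag`) are hypothetical.

```python
import math


def best_lag(prev_seg, next_seg, overlap, max_lag):
    """Find the shift of next_seg that best matches the tail of prev_seg.

    Uses normalized cross-correlation between the last `overlap`
    samples of prev_seg and each candidate window of next_seg.
    """
    ref = prev_seg[-overlap:]
    ref_norm = math.sqrt(sum(x * x for x in ref))
    best, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        cand = next_seg[lag:lag + overlap]
        if len(cand) < overlap:
            break  # not enough samples left at this lag
        denom = ref_norm * math.sqrt(sum(y * y for y in cand))
        score = sum(x * y for x, y in zip(ref, cand)) / denom if denom else 0.0
        if score > best_score:
            best, best_score = lag, score
    return best


def overlap_add(prev_seg, next_seg, overlap, max_lag):
    """Merge two segments by aligning them and cross-fading the overlap."""
    lag = best_lag(prev_seg, next_seg, overlap, max_lag)
    aligned = next_seg[lag:]
    out = list(prev_seg[:-overlap])          # samples before the overlap
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)          # linear fade-in weight
        out.append((1 - w) * prev_seg[-overlap + i] + w * aligned[i])
    out.extend(aligned[overlap:])            # samples after the overlap
    return out


if __name__ == "__main__":
    prev_seg = [0, 0, 0, 0, 1, 2, 3, 4]
    next_seg = [0, 0, 1, 2, 3, 4, 5, 6]
    print(overlap_add(prev_seg, next_seg, overlap=4, max_lag=2))
```

Varying the amount of overlap changes the length of the merged output, which is how such schemes stretch or compress the signal around a missing packet.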