Wireless and voice-over-internet-protocol (VoIP) communications are frequently subject to adverse connection conditions. Signals transmitted under such conditions suffer from bit errors which result in the signal quality at the receiving end of the communication being degraded. A degraded packet of a signal comprising one or more bit errors is generally considered either to be lost or damaged. The distinction usually depends on the position of the bit error within the packet. Typically, if the header of a packet comprises a bit error, the packet is considered to be lost and the entire packet is rejected. However, if the payload of the packet comprises one or more bit errors (often called residual bit errors) but the header does not comprise a bit error, the packet is considered to be damaged.
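The lost/damaged distinction described above can be illustrated with a minimal Python sketch. It assumes that the receiver already holds two integrity indications, one for the header and one for the payload (for example from separate checksums). The names PacketStatus, classify_packet, header_ok and payload_ok are hypothetical and chosen only for illustration.

```python
from enum import Enum


class PacketStatus(Enum):
    INTACT = "intact"    # no bit errors detected
    DAMAGED = "damaged"  # header intact, payload has residual bit errors
    LOST = "lost"        # header has a bit error, so the packet is rejected


def classify_packet(header_ok: bool, payload_ok: bool) -> PacketStatus:
    """Classify a received packet from two assumed integrity indications."""
    if not header_ok:
        # A header error makes the whole packet unusable, whatever the payload.
        return PacketStatus.LOST
    if not payload_ok:
        return PacketStatus.DAMAGED
    return PacketStatus.INTACT
```

Note that a packet with errors in both the header and the payload is classified as lost, since the header check takes precedence.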
In the case of voice communications, degraded packets result in clicks and pops or other artifacts being present in the output voice signal at the receiver. These artifacts degrade the perceived speech quality of the output voice signal and may render the speech unrecognisable if the bit error rate is sufficiently high.
Broadly speaking, two approaches are taken to combat the problem of bit errors. The first approach is the use of transmitter-based recovery techniques. Such techniques include retransmission of degraded packets (for example, automatic repeat request) and adding error correction coding bits to the transmitted packets such that bit errors can be detected at the receiver and the degraded packets can be reconstructed (for example, forward error correction). Transmitter-based recovery techniques suffer from high power consumption and high bandwidth requirements. Additionally, in poor signal conditions they inherently lead to network congestion and delays. These techniques are suited to data transmissions for which the integrity of the received signal is of overriding importance. For voice communications, by contrast, the output voice signal is generally required to meet only two criteria: intelligibility and a sufficiently high perceptual quality. Exact reproduction of the transmitted signal in the output voice signal is not necessary for these criteria to be met.
The second approach taken to combating the problem of degraded packets is the use of receiver-based concealment techniques. Receiver-based concealment techniques are generally used in addition to transmitter-based recovery techniques to conceal any remaining degradation left after the transmitter-based recovery techniques have been employed. In voice and multimedia communications, interpolation-based techniques are commonly used. These techniques generate a replacement packet by interpolating parameters from the packets on one or both sides of the degraded packet. A parameter commonly chosen for interpolation is the pitch period. A waveform of the estimated pitch period or a multiple of the estimated pitch period is generated as a substitute for the degraded packet. These so-called pitch-based waveform substitution techniques are limited in that they are typically only effective when the packet loss rate is low (less than about 15%) and when handling short packets of between 4 ms and 40 ms.
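By way of illustration only, a pitch-based waveform substitution of the kind outlined above might be sketched in Python as follows. The sketch makes several assumptions that are not taken from any particular system: the pitch period is estimated by normalised autocorrelation over the samples preceding the degraded packet only, the default lag range of 20 to 160 samples corresponds to roughly 400 Hz down to 50 Hz at an assumed 8 kHz sampling rate, and the replacement is formed by repeating the most recent pitch cycle. The function names are hypothetical.

```python
import math


def estimate_pitch_period(history, min_lag, max_lag):
    """Return the lag (in samples) with the highest normalised autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        idx = range(lag, len(history))
        num = sum(history[i] * history[i - lag] for i in idx)
        e_now = sum(history[i] ** 2 for i in idx)
        e_past = sum(history[i - lag] ** 2 for i in idx)
        denom = math.sqrt(e_now * e_past)
        score = num / denom if denom > 0 else 0.0
        if score > best_score:  # strict comparison keeps the shortest lag on ties
            best_lag, best_score = lag, score
    return best_lag


def substitute_packet(history, packet_len, min_lag=20, max_lag=160):
    """Build a replacement packet by repeating the last estimated pitch cycle."""
    period = estimate_pitch_period(history, min_lag, max_lag)
    last_cycle = history[-period:]
    return [last_cycle[i % period] for i in range(packet_len)]
```

For example, calling substitute_packet(history, 160) would generate a 20 ms replacement at the assumed 8 kHz rate, provided history holds more than max_lag samples. Because the replacement is purely periodic, it tends to diverge from natural speech as the gap grows, which is consistent with the limitation to short packets noted above.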
When treating packet errors with pitch-based waveform substitution techniques, all the information in a degraded packet is discarded regardless of whether that degraded packet is lost or damaged. However, damaged packets often contain information that can be used to improve the quality of the output voice signal. Approaches utilising the information in accessible damaged packets have been suggested. The simplest approach is to output the damaged packet without treating the bit errors in the payload.
Another approach is called soft decision source decoding (SDSD). SDSD, described in "Softbit speech decoding: a new approach to error concealment" (Fingscheidt and Vary; IEEE Transactions on Speech and Audio Processing, vol. 9, March 2001), is a statistical approach to concealing errors in a bit stream. Voice signals are analyzed at the bit stream level. An attempt is made to reproduce the original source bit sequence from a corrupted bit sequence using a statistical model. The probabilities that particular bit combinations were transmitted given the bit combination received are estimated using the statistical model, information about the codec, and the channel conditions when the signal was received. Optimum values for the original source bit sequence are determined and decoded conventionally.
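The kind of probability estimate described above can be illustrated with a simplified Python sketch. It is not the method of the cited paper; it assumes a memoryless channel with a single known bit error probability, a fixed prior probability for each possible bit combination (standing in for the statistical model), and selection of the most probable combination. The names posterior, most_probable_index, priors and bit_error_prob are hypothetical.

```python
def posterior(received, priors, bit_error_prob):
    """P(sent index | received bits) for every candidate index.

    received       -- list of received bits (0 or 1), most significant first
    priors         -- prior probability of each of the 2**len(received) indices
    bit_error_prob -- assumed probability that any single bit was flipped
    """
    n_bits = len(received)
    if len(priors) != 2 ** n_bits:
        raise ValueError("priors must have one entry per bit combination")
    weights = []
    for index, prior in enumerate(priors):
        likelihood = 1.0
        for b in range(n_bits):
            sent_bit = (index >> (n_bits - 1 - b)) & 1
            if sent_bit == received[b]:
                likelihood *= 1.0 - bit_error_prob
            else:
                likelihood *= bit_error_prob
        weights.append(likelihood * prior)
    total = sum(weights)
    if total == 0:
        raise ValueError("received bits are impossible under this model")
    return [w / total for w in weights]


def most_probable_index(received, priors, bit_error_prob):
    """Index of the bit combination with the highest posterior probability."""
    post = posterior(received, priors, bit_error_prob)
    return max(range(len(post)), key=post.__getitem__)
```

With uniform priors the most probable combination is simply the one received. With strongly skewed priors and a noisy channel, a different combination may be selected, which is how a statistical model can override a corrupted bit pattern.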
U.S. Pat. No. 5,799,039 describes a further approach. The received voice signal is analyzed to determine a degree of corruption of the signal. The specific nibbles likely to be corrupted are identified. A look-up table that links predetermined replacement values to each nibble value is stored. The look-up table is suitable for use with an ADPCM (adaptive differential pulse code modulation) bit stream. One of the predetermined replacement values is selected in dependence on the degree of corruption to replace each corrupted nibble. The replaced nibbles and the remaining signal are passed to a decoder to be decoded conventionally.
Although the SDSD and look-up table approaches utilize parts of a damaged packet that are not corrupted, they are limited in that they require access to the encoded bit stream and possibly also to the codec. Such access is not readily available in codecs that are implemented in hardware. Additionally, systems implementing each of these approaches are codec dependent in that further statistical models and look-up tables would need to be designed and implemented in order to handle further codecs.
A further error concealment method has been proposed (Cheetham and Nasr; "Error concealment for voice over WLAN in converged enterprise networks"; 15th IST Mobile & Wireless Communications Summit 2006) that targets specific samples within a damaged voice packet to be concealed. A short-term linear predictor is used at the receiver to generate predicted samples within a damaged packet which are expected to be close to the transmitted samples. Such a technique has been applied to a WINDECT VoWLAN system where packets contain 20 ms of voice encoded by G.711 log-PCM. Bit errors in the voice stream are indicated by a parity check over the most significant four bits of each sample. Received samples which fail the parity check are replaced by their corresponding predicted values. Additionally, an attempt may be made to identify which of the four most significant bits is in error. The identified bit is then inverted.
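A minimal Python sketch of this sample-level concealment is given below for illustration. It is not the cited implementation and rests on several assumptions: each sample is an 8-bit codeword accompanied by one even-parity bit computed over its four most significant bits, prediction is carried out on already decoded linear sample values, and the predictor uses fixed coefficients supplied by the caller rather than coefficients adapted to the signal. All function names are hypothetical, and the optional bit inversion step is omitted.

```python
def top_nibble_parity(codeword: int) -> int:
    """Even-parity bit over the four most significant bits of an 8-bit codeword."""
    nibble = (codeword >> 4) & 0xF
    return bin(nibble).count("1") & 1


def flag_errors(codewords, parity_bits):
    """Return True for each sample whose received parity bit does not match."""
    return [top_nibble_parity(c) != p for c, p in zip(codewords, parity_bits)]


def conceal(samples, bad, coeffs):
    """Replace flagged samples with a linear prediction from earlier output.

    samples -- decoded linear sample values
    bad     -- flags from flag_errors, one per sample
    coeffs  -- predictor coefficients; coeffs[0] weights the previous sample
    """
    order = len(coeffs)
    out = []
    for n, (x, is_bad) in enumerate(zip(samples, bad)):
        if is_bad and n >= order:
            # Predict from already concealed output so errors do not propagate.
            x = sum(a * out[n - 1 - k] for k, a in enumerate(coeffs))
        # A flagged sample with too little history is passed through unchanged.
        out.append(x)
    return out
```

A single parity bit of this kind detects only an odd number of bit errors within the protected bits, so some corrupted samples would pass through unflagged.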
If used in combination with a packet degradation concealment system, this method is likely to require substantial additional memory and greater computational complexity than other proposed methods. This is problematic for low-power, resource-constrained platforms such as Bluetooth.
There is thus a need for an improved bit error concealment system that is robust, codec independent and of low computational complexity such that it is suitable for low-power platforms.