The invention concerns generally the field of encoding and decoding a signal to be transmitted over a telecommunication connection. Especially the invention concerns the procedures of changing the signal bandwidth of such a signal during the course of the telecommunication connection.
FIG. 1 illustrates the general principle of transmitting speech from a first terminal to a second terminal in a digital cellular radio network. In the first terminal 100 there is a series connection of a microphone 101, a speech encoder 102, a channel encoder 103, a modulator 104 and a radio transmitter 105. In a first base station 110 there is a series connection of a radio receiver 111, a demodulator 112, a channel decoder 113 and a line transmitter 114. From the first base station 110 to a second base station 120 there is a network connection 115. The second base station 120 comprises a series connection of a line receiver 121, a channel encoder 122, a modulator 123 and a radio transmitter 124. In a second terminal 130 there is a series connection of a radio receiver 131, a demodulator 132, a channel decoder 133, a speech decoder 134 and a loudspeaker 135.
The speech encoder 102 in the transmitting terminal 100 converts the analog speech signal that comes from the microphone 101 into a digital signal by applying a certain speech encoding scheme. The channel encoder 103 adds redundancy to the digital signal in order to enhance its robustness against corrupting effects at the radio interface. The channel decoder 113 removes the channel coding at least partly, because wired connections through the network 115 are much more reliable than radio connections and excessive channel coding would only consume transmission capacity in the network. A corresponding pair of channel encoding 122 and channel decoding 133 exists around the second radio interface. The speech decoder 134 reconverts the digital speech signal into analog form by applying a procedure that is an inverse of the above-mentioned speech encoding scheme. The principles described above are easily generalized to the transmission of arbitrary information between terminals by replacing the microphone 101 with a generic data source, the speech encoder 102 with a source encoder, the speech decoder 134 with a corresponding decoder and the loudspeaker 135 with a generic data sink.
An encoding and decoding unit is usually referred to as a codec. The specifications of conventional digital cellular radio systems like the original GSM (Global System for Mobile telecommunications) typically define speech (or source) codecs that have a constant output bit-rate and that handle a speech (or source) signal the bandwidth of which is constant. Depending on the bandwidth the conventional speech codecs have been designated as either narrowband or wideband codecs. For example the so-called RPE-LTP full-rate speech codec described in the GSM standard number GSM 06.10 is a narrowband speech codec the bandwidth of which is approximately 3.5 kHz. Its bit-rate in speech coding is 13 kbit/s and in channel coding 9.8 kbit/s which together makes 22.8 kbit/s. Exemplary wideband speech codecs are those standardized by the ITU (International Telecommunication Union) under the designations G.722-64, G.722-56 and G.722-48. Their speech coding bit-rates are 64, 56 and 48 kbit/s respectively, and their bandwidth is approximately 7 kHz.
Recent proposals for enhancements to the known arrangements in speech (or source) coding include the concept of AMR or Adaptive MultiRate coding. The idea is to keep the bit (or symbol) rate at the output of the channel encoder 103 constant but to allow the roles of the speech encoder 102 and the channel encoder 103 to change in generating the constant bit-rate. The input bandwidth of the speech encoder is constant (in GSM AMR, the same 3.5 kHz as in the basic GSM speech codec mentioned above), but if the speech encoder is allowed to use more bits per time unit, better audible quality can be achieved. Using a larger portion of the available bit-rate for speech coding is only possible on condition that the corruptive effects of noise and interference of the moment are not too bad. At the receiving end the AMR concept means that the bit (or symbol) rate at the input of the channel decoder 133 is constant, but the amount of redundancy removed in the channel decoder and correspondingly the amount of digital information per time unit available for reconstructing the original analog speech signal in the speech decoder 134 may vary.
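The division of a constant gross bit-rate between speech coding and channel coding may be illustrated with the following minimal sketch. The mode table, the quality thresholds and the name select_mode are hypothetical and chosen for illustration only; they are not taken from any AMR specification. Only the gross bit-rate of 22.8 kbit/s is borrowed from the GSM full-rate example above.

```python
# Constant gross bit-rate at the channel encoder output, in bit/s
# (22.8 kbit/s, as in the GSM full-rate example).
GROSS_BPS = 22800

# Hypothetical codec modes: (speech coding bit-rate in bit/s,
# minimum channel quality in dB required for using the mode).
# Ordered from the highest speech bit-rate to the most robust mode.
MODES = [(12000, 12.0), (8000, 8.0), (5000, 0.0)]


def select_mode(quality_db):
    """Return (speech_bps, channel_bps) for the measured channel quality."""
    for speech_bps, min_quality_db in MODES:
        if quality_db >= min_quality_db:
            return speech_bps, GROSS_BPS - speech_bps
    # Channel is worse than every threshold: use the most robust mode.
    speech_bps = MODES[-1][0]
    return speech_bps, GROSS_BPS - speech_bps
```

Whatever mode is selected, the two returned bit-rates add up to the same gross bit-rate; only their distribution changes with the channel conditions.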
At the priority date of the present patent application the known AMR speech coding principle is going to be adopted in standardizing a wideband or 7 kHz speech codec for future use within the GSM frameworks. It is possible that in the near future there will be communication devices in use which have two selectable speech (or source) bandwidths: 3.5 kHz and 7 kHz. It is also possible that even more speech (or source) bandwidths will be defined. The bandwidths can be associated with the use of completely different codecs or they may represent just certain modes of operation, known as the codec modes or just modes, of the speech encoding and decoding arrangements. The application of the AMR principle means that a future speech (or source) codec may have both a selectable bandwidth and a changing bit-rate, where the latter is associated with different levels of error protection through different distributions of the available gross bit-rate between speech (or source) coding and channel coding.
FIG. 2 illustrates in more detail the contents of the speech encoder block 102 in a transmitting mobile station and the speech decoder block 134 in a receiving mobile station in a known exemplary case where two different speech bandwidths have been defined. Here the concepts of encoding and decoding are understood in a wide sense so that e.g. A/D and D/A conversions are parts thereof. The A/D converter 201 in the encoder 102 is coupled to a switching block 202 both directly and through a downsampling block 203. The output of the switching block 202 is coupled to a speech encoder proper 204 which is capable of handling both a wideband and a narrowband input signal. The communication channel 210 between the output of the speech encoder proper 204 and the input of a corresponding speech decoder proper 220 in the speech decoder block 134 comprises generally e.g. all channel encoding/decoding and transmitting/receiving arrangements. The speech decoder proper 220 is capable of decoding both wideband and narrowband speech signals, and the output thereof is coupled to a switching block 221 both directly and through an upsampling block 222. The output of the switching block 221 is coupled to a speech synthesizer and D/A converter 223.
The A/D converter 201 in the encoder block 102 and the D/A converter 223 in the decoder block 134 both handle a sampling rate that is high enough for the widest defined speech bandwidth. The downsampling block 203 reduces the sampling rate of the sample stream produced by the A/D converter 201 to a lower level by puncturing, filtering or interpolating, and the upsampling block 222 inflates the sampling rate of the sample stream produced by the speech decoder proper 220 to a higher level by some computational means. As a response to a bandwidth change command the speech encoder 204 and decoder 220 switch to encoding and decoding procedures that correspond to the new bandwidth, and simultaneously the switching blocks 202 and 221 select either the direct couplings (in the case of wider bandwidth) or those going through the downsampling block 203 and upsampling block 222 (in the case of narrower bandwidth). Multiple bandwidths can be achieved by programming the speech encoder 204 and decoder 220 for multiple bandwidths and by providing multiple parallel downsampling blocks in the transmitting station and upsampling blocks in the receiving station (or by programming the downsampling block 203 and upsampling block 222 for multiple down/upsampling ratios).
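The abrupt path selection described above may be sketched as follows. The sketch assumes a fixed ratio of two between the sampling rates, a crude two-sample average in place of a proper anti-aliasing filter, and linear interpolation for upsampling; all function names are hypothetical, and the speech encoder proper 204 and decoder proper 220 are left out.

```python
def downsample_by_two(samples):
    """Halve the sampling rate: average each pair of samples (a crude
    low-pass filter) and keep one output sample per pair."""
    return [(samples[i] + samples[i + 1]) / 2.0
            for i in range(0, len(samples) - 1, 2)]


def upsample_by_two(samples):
    """Double the sampling rate by linear interpolation between samples.
    The last sample is repeated because it has no successor."""
    out = []
    for i, value in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else value
        out.extend([value, (value + nxt) / 2.0])
    return out


def encoder_input(samples, wideband):
    """Role of switching block 202: direct path for the wider bandwidth,
    path through the downsampling block for the narrower bandwidth."""
    return list(samples) if wideband else downsample_by_two(samples)


def decoder_output(samples, wideband):
    """Role of switching block 221: direct path for the wider bandwidth,
    path through the upsampling block for the narrower bandwidth."""
    return list(samples) if wideband else upsample_by_two(samples)
```

Because the wideband flag changes from one value to the other in a single step, the signal bandwidth also changes in a single step.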
The existing definitions of the AMR arrangements include the drawback that changing from one source encoding bandwidth to another tends to cause noticeable artefacts in the transmitted signal. For example changing between two different speech codec modes with different bandwidths causes the listening user at the receiving end to notice a strange audible effect in the speaker's voice.
As additional background to the invention we describe briefly the known Tandem Free Operation or TFO arrangement which is used to convey a connection between mobile terminals (a MS-MS-connection, where MS comes from Mobile Station) where wideband speech coding is used. For the sake of brevity we will denote a signal that carries speech encoded with wideband (narrowband) speech coding simply as wideband (narrowband) speech.
The use of two complete encoder-decoder pairs which was described in association with FIG. 1 is known as tandem operation and it is necessary especially if the network connection 115 goes through a public switched telephone network or PSTN of generally unknown nature. In a more advantageous case the terminals 100 and 130 are both mobile stations of a digital cellular radio system, and the network connection 115 is truly digital and capable of establishing transparent digital channels between certain transcoder and rate adaptor units or TRAUs that operate either within base stations or under the control of base stations.
FIG. 3 illustrates an arrangement where a first TRAU 300 is functionally associated with the first base station 110 and a second TRAU 310 is functionally associated with the second base station 120. Each TRAU 300 and 310 comprises a decoder 301, 311; an uplink TFO unit 302, 312; an encoder 303, 313; a downlink TFO unit 304, 314; and a TFO Protocol unit 305, 315. In each TRAU the decoder 301, 311 and uplink TFO unit 302, 312 are coupled in parallel to receive the uplink frames from the mobile station, and their outputs are combined through the use of a combiner 306, 316. Similarly the encoder 303, 313 and downlink TFO unit 304, 314 are coupled in parallel to receive the transmission frames from the other TRAU, and their outputs go through a selection switch 307, 317. The digital network 320 consists of IPEs (In Path Equipment), of which the IPEs 321 and 322 are shown, and is capable of establishing transparent 64 kbit/s channels in both directions between the TRAUs. The first base station 110 operates under the control of a first base station controller 330, which in turn is part of a communication domain governed by a first mobile services switching centre 340. The second base station 120 operates under the control of a second base station controller 350, which in turn is part of a communication domain governed by a second mobile services switching centre 360. There are control connections from the base station controllers 330 and 350 to respective ones of the TFO Protocol units 305 and 315.
The document “GSM 04.53 version 1.6.0 (1998-10); Digital cellular telecommunications system (Phase 2+); Inband Tandem Free Operation (TFO) of Speech Codecs; Service Description; Stage 3”, published by the ETSI (European Telecommunications Standards Institute) and incorporated herein by reference, defines an inband signalling protocol for testing for the transparency of the channels, the TFO supporting capability of both TRAUs and the identicality of speech codecs at both radio interfaces. Given that the tests succeed, the TFO Protocol units 305 and 315 establish the TFO connection by commanding the signal paths to go transparent and bypassing the decoder/encoder functions within the TRAUs 300 and 310. The TFO specifications also define a fast fall back procedure for sudden TFO interruption and provide support for resolution in codec mismatch situations and cost efficient transmission within the fixed part 320 of the network.
The first mobile station 370 which communicates with the first base station 110 comprises an encoder 371 and a decoder 372. Correspondingly the second mobile station 380 which communicates with the second base station 120 comprises a decoder 381 and an encoder 382. The TFO procedures referred to above serve to establish a virtually transparent connection from the encoder 371 of the first mobile station 370 to the decoder 381 of the second mobile station 380 and from the encoder 382 of the second mobile station 380 to the decoder 372 of the first mobile station 370.
It is an object of the invention to present a method and an arrangement for changing source bandwidths without the above-described drawbacks of the prior art arrangements. It is an additional object of the invention to present a method and an arrangement for changing source bandwidths so that the human users at the ends of a telephone connection notice essentially no audible artefacts due to bandwidth changes. Another object of the invention is to present a method and an arrangement of the above-described kind with only a reasonable level of complexity in implementation.
The objects of the invention are achieved by introducing the concept of soft bandwidth switching, where the acoustic bandwidth is gradually changed from a first level that corresponds to a first codec mode to a second level that corresponds to a second codec mode.
The method for changing the bandwidth of a speech signal in association with multiple mode encoding or decoding according to the invention is characterized in that it comprises the steps of:
receiving an instruction for changing speech signal bandwidth and
gradually changing the bandwidth of a speech signal processed in a multiple mode speech encoding or decoding arrangement as a response to said instruction for changing speech signal bandwidth.
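The two steps above may be sketched as a schedule that is computed once the instruction for changing the bandwidth has been received. The linear shape of the ramp, the number of frames in the smoothing period and the name bandwidth_schedule are assumptions made for illustration only; the method as such does not fix any of them.

```python
def bandwidth_schedule(start_hz, target_hz, num_frames):
    """Return the bandwidth to apply in each frame of the smoothing period.

    The ramp is linear and its last value equals target_hz. Both the
    linear shape and the frame count are illustrative assumptions.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    step = (target_hz - start_hz) / num_frames
    return [start_hz + step * (k + 1) for k in range(num_frames)]
```

For example, a change from 3.5 kHz to 7 kHz spread over five frames passes through 4200, 4900, 5600 and 6300 Hz before reaching 7000 Hz, instead of jumping there in a single frame.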
The invention applies also to a speech encoding arrangement comprising:
a speech signal input and
a multiple mode speech encoder for encoding speech signals coupled to the speech signal input selectably with a first encoding mode associated with a first bandwidth or a second encoding mode associated with a second bandwidth;
it is characterized in that it comprises a soft bandwidth switching block with an input coupled to the speech signal input and an output coupled to the multiple mode speech encoder, said soft bandwidth switching block being arranged to gradually change the bandwidth of a speech signal coupled to the multiple mode speech encoder as a response to an instruction for changing speech signal bandwidth.
The invention applies further to a speech decoding arrangement comprising
a speech signal input and
a multiple mode speech decoder for decoding speech signals coupled to the speech signal input selectably with a first decoding rate associated with a first bandwidth or a second decoding rate associated with a second bandwidth;
it is characterized in that it comprises a soft bandwidth switching block with an input coupled to the multiple mode speech decoder and an output, said soft bandwidth switching block being arranged to gradually change the bandwidth of a speech signal received from the multiple mode speech decoder as a response to an instruction for changing speech signal bandwidth.
Additionally the invention applies to a digital radio telephone and a transcoder and rate adaptor unit of a cellular radio system which have the characteristic feature of comprising at least one of a speech encoding arrangement or a speech decoding arrangement of the above-described kind.
In a vast majority of telephone applications the acoustic signal conveyed through a connection is speech, so instead of general acoustic bandwidth we may talk about the speech bandwidth. However, the use of the term “speech” should not be construed as a limitation to the applicability of the invention.
A natural speech signal comprises a wide range of frequency components, and reducing the speech bandwidth inevitably removes some of these components causing various amounts of distortion. In the existing systems there may occur a switching moment during active speech so that the speech bandwidth changes abruptly. This causes audible artefacts, because the amount and nature of distortion also changes abruptly. According to the invention there is introduced a smoothing period during which the speech bandwidth changes gradually. The human sensory system does not perceive gradual changes in speech distortion as easily as abrupt changes, so the smoothing period improves the auditory impression that the users get.
The invention may be applied in an encoding device, where the smoothing period is most advantageously introduced before the actual speech encoder or as a part thereof. The invention may also be applied in a decoding device, where the smoothing period is most advantageously introduced after the actual speech decoder or as a part thereof. In both cases (encoding device or decoding device) the means for introducing the smoothing period typically comprise adjustable gain units on parallel signal paths, each of which conveys a part of the acoustic spectrum. The adjustable gain units may be replaced or complemented with adjustable filters on said signal paths.
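The use of adjustable gain units on parallel signal paths may be sketched as follows. The sketch assumes exactly two paths, a low band and a high band of equal length that have already been separated by means not shown, and a linear gain ramp over the smoothing period; the names ramp_gains and soft_switch are hypothetical.

```python
def ramp_gains(num_samples, widening):
    """Gain of the high-band path for each sample of the smoothing period.

    The gain rises linearly from 0 to 1 when the bandwidth is widened
    and falls linearly from 1 to 0 when it is narrowed.
    """
    if num_samples < 2:
        raise ValueError("the smoothing period needs at least 2 samples")
    rising = [n / (num_samples - 1) for n in range(num_samples)]
    return rising if widening else rising[::-1]


def soft_switch(low_band, high_band, widening):
    """Combine the two parallel paths sample by sample. The low-band path
    passes with unity gain; the high-band path is faded in or out."""
    gains = ramp_gains(len(low_band), widening)
    return [lo + g * hi for lo, hi, g in zip(low_band, high_band, gains)]
```

With a constant low band of 1.0 and a constant high band of 2.0, a widening over five samples yields 1.0, 1.5, 2.0, 2.5 and 3.0, so the high-band contribution grows in equal steps rather than appearing all at once.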
Regarding larger speech (or acoustic) bandwidths, the additional frequency components may not always be available due to the nature and operation of the communication system where the invention is applied. Therefore the arrangement according to the invention advantageously comprises a noise generator that can be used to replace missing additional frequency components. The wideband speech (or acoustic) signal is then a weighted combination of basic frequency components, additional frequency components and noise.
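The weighted combination mentioned above may be sketched as follows. The sketch assumes white noise with a uniform distribution and weights supplied by the caller; a practical arrangement would probably shape the noise to the frequency range of the missing components. The name combine_wideband and all parameter names are hypothetical.

```python
import random


def combine_wideband(basic, additional, w_basic, w_additional, w_noise, seed=0):
    """Weighted sum of basic components, additional components and noise.

    additional may be None when the additional frequency components are
    not available; they then contribute nothing, and the noise term
    (scaled by w_noise) takes their place.
    """
    rng = random.Random(seed)  # seeded so that the sketch is repeatable
    if additional is None:
        additional = [0.0] * len(basic)
    return [w_basic * b + w_additional * a + w_noise * rng.uniform(-1.0, 1.0)
            for b, a in zip(basic, additional)]
```

Setting w_noise to zero gives the ordinary wideband signal, whereas passing None for the additional components and a non-zero w_noise fills the upper part of the spectrum with noise only.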