The present invention relates to a signalling protocol and an apparatus enabling a transmitter in a speech-transmitting digital telecommunications system to transmit predetermined messages to a receiver. In many digital telecommunications systems, it is necessary to transmit not only encoded speech and/or other information but also messages that may for example relate to the control of that particular connection or that may transfer data fully independent of the information to be transmitted. Such messages are often called signalling. To provide an illustrative description within the scope of this application, the term "speech" is used even though the information to be transmitted in the system may comprise other types of sound, music, a video signal, multimedia, etc. instead of or in addition to speech. In terms of a practical embodiment, the invention is disclosed in the context of a mobile communications system, particularly a speech channel in the GSM system. It is to be borne in mind, however, that the technology in accordance with the invention is suitable for use in many other environments as well.
FIG. 1 shows the parts of a cellular mobile communications system essential for understanding the invention. Mobile stations MS communicate with Base Transceiver Stations BTS over the air interface Um. The base stations are controlled by Base Station Controllers BSC associated with Mobile Services Switching Centres MSC. A subsystem administered by a base station controller BSC, including the base stations BTS controlled by it, is commonly called a Base Station Subsystem BSS. The interface between an exchange MSC and a base station subsystem BSS is called the A interface. The part of the A interface on the MSC side is called a Network Subsystem NSS. The interface between a base station controller BSC and a base station BTS is called the Abis interface. The mobile services switching centre MSC switches incoming and outgoing calls. It performs tasks similar to those of the exchange of a public switched telephone network PSTN. Additionally, it performs tasks characteristic of mobile telecommunications only, such as subscriber location management, in co-operation with network subscriber registers (not separately shown in FIG. 1). A Transcoder and Rate Adaptation Unit TRAU is an element of the base station subsystem BSS and may be located in association with the base station controller BSC, as shown in this figure, or in association with the mobile services switching centre, for example. The transcoders convert speech from one digital format into another, for instance convert the 64 kbit/s A-law PCM from the exchange over the A interface into encoded speech of 13 kbit/s to be sent to the base station line and vice versa. Rate adaptation for data is carried out between the rate 64 kbit/s and the rates 3.6, 6, or 12 kbit/s.
In digital telecommunications systems transmitting speech, a speech signal is usually subjected to two coding operations: speech coding and channel coding. Speech coding comprises speech encoding performed in the transmitter by a speech encoder, and speech decoding performed in the receiver by a speech decoder.
FIG. 2 illustrates various operations to be performed on the speech. The most significant steps in view of the present invention include speech encoding and decoding and channel encoding and decoding. In the GSM system, for example, channel encoding in the network is performed at the base station, whereas speech encoding is performed in a discrete transcoder unit that may be located remote from the base station and even when located at the base station is a fully separate logic unit. References Tx and Rx will be explained in connection with FIG. 4. FIG. 2 further illustrates an exemplary frame F, comprising a header H, a payload portion P, and a check portion C. The frame F also often contains bit patterns for synchronization. The header H typically comprises the identifiers of the sender and receiver of the frame, a consecutive number for the frame, or the like. The actual information is carried in the payload portion P. Parts essential to the present invention include the payload portion P and the check portion C. The check portion C is usually implemented in the form of a cyclic redundancy check (CRC) value, but it may also be a parity having one or more bits, or equivalent. Essential to the invention is mainly the fact that the system in some way defines a "good" and a "bad" frame, which may be distinguished from one another by means of an implicit or explicit information element in the frame, permitting the system to conclude whether the frame has been transferred correctly. In the present context, "implicit" means that, as is well-known, the cyclic redundancy check (CRC) value does not directly indicate whether the frame is good or bad, but the receiver calculates the CRC value from the frame and compares it with the checksum sent with the frame. If the checksums are identical, the frame is good.
An "explicit" indicator of a bad frame is for instance the Bad Frame Indicator BFI used in the fixed parts of a telephone network.
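By way of illustration only, the implicit good/bad decision described above may be sketched as follows. The sketch assumes a frame consisting of a payload portion P followed by a four-byte check portion C, and it uses the CRC-32 routine of the Python standard library in place of whatever check a given system actually defines; the names `build_frame` and `is_good_frame` are hypothetical.

```python
import zlib

CHECK_LEN = 4  # assumption: the check portion C is a 4-byte CRC-32


def build_frame(payload: bytes) -> bytes:
    """Append the check portion C (CRC-32 of the payload P) to the payload."""
    return payload + zlib.crc32(payload).to_bytes(CHECK_LEN, "big")


def is_good_frame(frame: bytes) -> bool:
    """Recompute the CRC from the payload and compare it with the received one."""
    payload, received = frame[:-CHECK_LEN], frame[-CHECK_LEN:]
    return zlib.crc32(payload).to_bytes(CHECK_LEN, "big") == received


frame = build_frame(b"speech parameters")
# Simulate a transmission error by flipping one bit of the payload.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]
```

Here `is_good_frame(frame)` holds while `is_good_frame(corrupted)` does not, since the check value no longer matches the payload.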
FIG. 3 illustrates the type of message transmission most widely known in the art. FIG. 3 shows both a transmitter 100 and a receiver 102. In this arrangement, messages and speech are transmitted on completely different channels. In the transmitter 100, a digital speech signal 104 is supplied to a speech encoder 106, which, from this signal, generates compressed speech coding bits, which are sent to the receiver on a speech channel 108. In the transmitter, a message 114 to be sent to the receiver is supplied to a message encoder 116, which generates message bits, which are then sent to the receiver on a separate message channel 118. The receiver 102 receives the speech coding bits from the speech channel 108 and supplies them to a speech decoder 110, which synthesizes the speech signal 112 to be heard. The receiver 102 receives the message bits from the separate message channel 118 and supplies them to a message decoder 120, which interprets the transmitted message 122.
A speech encoder 106 located in a transmitter 100 compresses a speech signal so that the number of bits used to represent it per unit of time is reduced. The speech encoder 106 typically processes speech as speech frames containing a certain number of speech samples. On the basis of sampled speech, the speech encoder 106 calculates speech parameters, each of which is encoded as a separate binary code word. The speech parameters produced by the RPE-LTP speech encoder used in the full-rate channel of the pan-European GSM mobile telephone system are described in ETSI GSM Recommendation 06.10. These parameters are also disclosed in Table 1 of Appendix 1. The RPE-LTP (Regular Pulse Excitation - Long Term Prediction) encoder produces 76 speech parameters from one speech frame of 20 ms (corresponding to 160 speech samples at a sampling frequency of 8 kHz). Recommendation GSM 06.10 also discloses the length of the binary code word assigned for each parameter.
Very often speech encoders also group speech parameters together, in which case each group, instead of a single speech parameter, is encoded into a separate code word. Encoding parameters in groups is called vector quantization. Modern speech encoders usually encode some speech parameters separately and some in groups (the RPE-LTP speech encoder of the example does not employ vector quantization). The RPE-LTP speech encoder of the example produces 260 speech coding bits for each speech frame of 20 ms.
The speech decoder 110 of a receiver 102 performs a reverse operation and synthesizes a speech signal 112 from the bits produced by the speech encoder. The decoder 110 receives binary code words and generates corresponding speech parameters on the basis of them. The synthesis is performed using the decoded speech parameters. The speech synthesized in the receiver is, however, not identical with the original speech compressed by the speech encoder, but has changed more or less as a result of the speech coding. The higher the degree of compression used in the speech coding, the more the quality of speech usually deteriorates in the coding process.
The RPE-LTP speech encoder, for example, compresses a speech signal to a rate of 13 000 bits per second (13 kbps). The compression is performed in such a way that it affects the intelligibility of speech as little as possible. In special cases, such as identification of tone pairs used in tone dialling, the compression may detrimentally affect or even completely obstruct the process.
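The figures given above for the full-rate channel are mutually consistent, as the following short calculation shows. It is a worked check only, using the values stated in the text (8 kHz sampling, 20 ms frames, 260 bits per frame, 64 kbit/s PCM); the variable names are illustrative.

```python
SAMPLE_RATE_HZ = 8000      # sampling frequency stated in the text
FRAME_MS = 20              # duration of one speech frame
BITS_PER_FRAME = 260       # speech coding bits per frame (RPE-LTP)
PCM_RATE_BPS = 64000       # A-law PCM rate on the A interface

samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000   # 160 samples
frames_per_second = 1000 // FRAME_MS                    # 50 frames
speech_rate_bps = BITS_PER_FRAME * frames_per_second    # 13000 bit/s
compression_ratio = PCM_RATE_BPS / speech_rate_bps      # roughly 4.9

print(samples_per_frame, frames_per_second, speech_rate_bps)
```

The transcoder thus reduces the bit rate by a factor of slightly less than five.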
The above-mentioned channel coding comprises channel encoding performed in the transmitter by a channel encoder, and channel decoding performed in the receiver by a channel decoder. The purpose of channel coding is to protect speech coding bits to be transmitted against errors occurring on the transmission channel. Channel coding may either allow transmission errors to be detected without being able to correct them or it may allow transmission errors to be corrected, provided that the number of errors is smaller than a certain maximum number, which is dependent on the channel coding method.
The channel coding method to be used is selected according to the quality of the transmission channel. In fixed transmission methods, the error probability is often very small, and there is not much need for channel coding. In wireless networks such as mobile telephone networks, however, the error probability is often extremely high, and the channel coding method employed has a significant effect on the quality of speech. In mobile telephone networks, both error-detecting and error-correcting channel coding methods are usually employed simultaneously.
In telecommunications systems transmitting speech, speech coding and channel coding are closely connected. The importance of bits produced by a speech encoder for the quality of speech usually varies such that, in some cases, an error in an important bit may cause an audible disturbance in synthesized speech, whereas several errors in less important bits may be almost imperceptible. How great the difference between the importance of speech coding bits is depends essentially on the speech coding method employed, but at least small differences can be found in most methods. When a speech transmission method is developed for a telecommunications system, channel coding is thus designed together with speech coding to allow better protection for bits most important to the quality of speech than for less important bits. In a full-rate channel of the GSM, for example, the bits produced by an RPE-LTP speech encoder have been divided into three different classes according to their importance to channel coding: the most important class is protected in channel coding with both an error-correcting and an error-detecting code; the second most important class is protected with an error-correcting code only; and the least important class is not protected in channel coding at all. Table 2 of Appendix 1 shows the classification of bits produced by an RPE-LTP encoder in two different ways: 6-parted subjective classification, and 3-parted classification used by channel coding.
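The three-part protection scheme can be made concrete with a short calculation. The class sizes (50, 132 and 78 bits), the 3 parity bits, the 4 tail bits and the rate-1/2 convolutional code used below are taken from the GSM full-rate channel coding specification (GSM 05.03) rather than from the present text, and the variable names are illustrative only.

```python
CLASS_1A_BITS = 50    # most important: error detection and error correction
CLASS_1B_BITS = 132   # second most important: error correction only
CLASS_2_BITS = 78     # least important: sent unprotected
PARITY_BITS = 3       # error-detecting parity computed over class 1a
TAIL_BITS = 4         # bits that flush the convolutional encoder
FRAMES_PER_SECOND = 50

speech_bits = CLASS_1A_BITS + CLASS_1B_BITS + CLASS_2_BITS              # 260

# Only the two protected classes pass through the rate-1/2 convolutional code.
encoder_input = CLASS_1A_BITS + PARITY_BITS + CLASS_1B_BITS + TAIL_BITS  # 189
encoder_output = 2 * encoder_input                                       # 378

channel_bits = encoder_output + CLASS_2_BITS                             # 456
gross_rate_bps = channel_bits * FRAMES_PER_SECOND                        # 22800

print(speech_bits, channel_bits, gross_rate_bps)
```

Under these figures the 260 speech coding bits of one frame occupy 456 bits on the channel, most of the added redundancy being spent on the bits that matter most to speech quality.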
Channel coding is not directly relevant to the principle of the invention. In view of speech coding, channel coding is part of the transmission channel. In view of the practical implementation, channel coding is, however, of essential significance to the transmission of messages as regards the selection of bits, as will be seen from the examples below.
The term xe2x80x9cchannelxe2x80x9d can be interpreted in many ways in the field, wherefore the meaning of the term for the present invention can be specified as follows. When messages and speech are transmitted on separate channels, the receiver can distinguish between message bits and speech coding bits irrespective of the content of the information transmitted on the channels. However, two channels are not necessarily physically separate channels. Separate channels can also be provided by dividing one physical transmission channel (e.g. a radio path or a transmission line) into a plurality of time slots and frequency ranges. When such a division is made unambiguously, the receiver can distinguish between the channels irrespective of the content of the information transmitted on them.
A problem arises when the telecommunications system is to be changed in a way that was not anticipated when the system was planned. Let us assume, for example, that more than two speech codecs are to be used in a GSM system. Signalling for this kind of selection has not been designed into the system, and if it is designed afterwards, it cannot be implemented in old equipment that is already in use. To solve this problem, it is necessary to have a signalling method that can be introduced into an existing telecommunications system without disturbing equipment already in use in which this signalling method is not implemented. Using such a method, new equipment can signal with other new equipment to agree on the use of a new codec; the signalling will not be successful with old equipment, and thus the new equipment can conclude that the old speech codec must be used on the connection. Message transmission methods previously used in the field do not usually allow messages to be added to an existing system.
It is possible to design various signalling possibilities for unpredictable cases in advance. If such a signalling possibility exists in the system, it should primarily be used. However, such reserve signalling often does not exist, or its introduction may require a time-consuming standardization process. Since there is, in any case, a limited number of reserve signalling possibilities, such signalling cannot be introduced very lightly.
An example of signalling that is designed in advance is the selection of a speech coding method. Since the speech encoder of the transmitter and the speech decoder of the receiver must use the same speech coding method, the devices must agree on the method to be used when the speech connection is being established. Such a situation will arise in the GSM system, for example, where a half-rate speech codec will soon be introduced in addition to the full-rate speech codec. In the GSM system, the problem of selecting the speech codec has been solved in advance: when the system was planned, it was already known that there would be two speech codecs, even though only one of them is implemented in the present equipment. A signalling method for selecting the speech codec was therefore designed into the system from the start. The signalling is implemented in the present equipment, and when new equipment with two speech codecs is introduced later, the new equipment can use the old speech codec when communicating with the old equipment, since the selection of a speech codec is implemented in both the old and the new equipment.
Similar message transmission is needed for example in negotiations on the use of an echo canceller. On end-to-end connections of a data transmission system, e.g. a telephone network, long propagation delays often occur, as a result of which an echo is detected, for example in the case of normal speech, when a signal is reflected from the remote end of the connection back to the transmitting party. Mainly two factors contribute to the generation of an echo: acoustic echo between the receiver and the microphone of a telephone, and electrical echo which is generated in the transmission systems of the transmission and reception directions of the connection. The problems caused by the returned echo are usually eliminated by means of an echo canceller. An echo canceller is a signal-processing device that reduces the echo by subtracting the estimated echo from the signal occurring on the connection. An echo suppressor, in turn, disconnects the signal arriving from the remote end when an echo is present.
Present-day digital mobile communications systems are provided with echo cancellers that prevent an echo returning from the public switched telephone network (PSTN) from being transmitted to the mobile subscriber. In mobile exchanges, echo cancellers of this kind are usually located in the inter-exchange trunk circuits.
An echo returning from a mobile station is usually cancelled by means of an echo canceller located in the actual mobile station. Such an echo canceller is usually based on an adaptive filter or on comparing the levels of an output signal and an input signal. There are a wide variety of mobile stations in use nowadays in which the echo cancellation does not work sufficiently well, but a relatively low-level, yet disturbing, echo is transmitted to the other party. In principle, the problem can be alleviated by developing echo elimination methods for mobile stations, but this mainly improves the situation as far as new mobile stations are concerned. It is difficult, however, to update the software or equipment of mobile stations that are already in use, because the mobile stations are already in possession of their users, and collecting them for service measures is time-consuming and costly. In a mobile communications system, there will thus always be such mobile stations whose echo elimination does not work sufficiently well, but causes a disturbing echo to the other party. If, on the other hand, the echo canceller in the mobile station is sufficiently good, it is unnecessary to perform a new echo cancellation operation in the fixed parts of the network. This could also deteriorate speech quality.
Similar negotiations may also be conducted on the use of a noise canceller. It is also to be presumed that similar needs will subsequently arise when new features are incorporated into mobile communications systems.
For the above and similar situations, the mobile communications system needs a mechanism by which the sender and receiver (e.g. a mobile station and a transcoder) can send messages to one another, for example when informing one another of their type and/or in negotiating with one another on the speech coding or echo cancellation method to be employed.
It is known to use the same physical transmission channel to transmit both speech and digital information. For example U.S. Pat. No. 4,476,559 (Berlin et al.) teaches a technique suitable for fixed networks wherein one of three transmission modes (voice, data, or a combination thereof) is selected, and an identifier for indicating the chosen transmission mode is formed. In Berlin, this identifier is called "a unique signature". This signature is multiplexed with the transmit signals for identification of the transmission mode. However, there are several reasons why the solution offered by said U.S. Patent (Berlin) is not suitable in an environment where the present invention is intended to be used. First, in accordance with said U.S. Patent (Berlin), a portion of the bandwidth is reserved at all times for indicating the chosen transmission mode, wherefore the entire bandwidth is not available for transmitting speech, not even when there are no messages to be transmitted. In a mobile communications system, and particularly at the air interface thereof, this would be an intolerable limitation. Second, the technique of said U.S. Patent (Berlin) is based on the assumption that the unique signature indicating the transmission mode can always be received faultlessly. In a transmission over the air interface, such an assumption cannot be made.
Therefore, it is an object of the present invention to provide a signalling protocol and an apparatus implementing a signalling protocol wherewith new functional features can be added to an existing mobile communications system and negotiations can be conducted on the use of these features. Equipment already installed in the system ("old" apparatus) and the users thereof should be disturbed as little as possible. The messages must be formed in such a way that they can be received with maximum reliability. The objects of the invention are achieved with a method, signalling, and an apparatus that are characterized by what is set forth in the characterizing portions of the independent claims. The preferred embodiments of the invention are set forth in the dependent claims.
The invention is based on the idea that in transmission over the air interface, some of the frames are corrupted anyway. Altering one speech frame may cause a perceptible snap in the speech. However, the listener can infer the missing information from the context. To correct transmission errors, mobile communications systems have usually implemented mechanisms for replacing bad speech frames (for example, entirely or partly with a preceding good speech frame). When this technique is used, the loss of one frame is normally not even noticed. As was stated above, a "speech frame" generally means a frame that is used in the system concerned to transmit information, such as speech, music or other sound, a video signal, or multimedia. A "bad" frame within the context of the present application means a frame wherefrom the receiver can conclude that the frame should not be treated as a normal good frame. In the case of the exemplary GSM system, a bad frame can be detected by means of the cyclic redundancy check (CRC) value.
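The replacement of bad speech frames referred to above may be sketched, purely as an illustration, in the following way. The sketch assumes the simplest variant, in which a bad frame is replaced entirely with the preceding good frame; the function name `conceal_bad_frames`, the `is_good` predicate and the `silence` fallback are hypothetical and do not reproduce the procedure of any particular system.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def conceal_bad_frames(
    frames: Sequence[Frame],
    is_good: Callable[[Frame], bool],
    silence: Frame,
) -> List[Frame]:
    """Replace each bad frame with the most recent good frame.

    If no good frame has been received yet, `silence` is used instead.
    """
    output: List[Frame] = []
    last_good: Optional[Frame] = None
    for frame in frames:
        if is_good(frame):
            last_good = frame
            output.append(frame)
        else:
            output.append(last_good if last_good is not None else silence)
    return output


# Frames are modelled here as (payload, check_passed) pairs.
received = [("A", True), ("B", False), ("C", True)]
played = conceal_bad_frames(received, lambda f: f[1], ("silence", True))
```

In this example the bad frame "B" is played out as a repetition of "A", which is why the loss of a single frame tends to go unnoticed by the listener.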
In accordance with the invention, messages are transmitted in a common channel with the information to be sent from the transmitter to the receiver in such a way that the speech frame corresponding to the message is marked as bad (for example by inserting a faulty CRC value in the frame), and the bit pattern corresponding to the message is inserted in one or more frames. Frames are "stolen" for message transmission only for very short periods of time and only for the exact duration of the message transmission, whereas at other times the entire channel is normally available for information transfer. In the present context, the concept of a "short-term message" means a message that is so short, usually having the length of one speech frame only, that it can be sent in the same channel with the information to be transmitted, without the intelligibility of the received signal being substantially impaired. In practical situations, the message transmission in accordance with the invention does not normally impair the quality of the reception at all. This is due to the fact that such messages are mainly needed at the very start of the connection only, when the sender and receiver (e.g. a mobile station and a fixed network part) negotiate on the use of a speech codec, an echo canceller and/or a noise canceller. Such negotiations can be conducted during the setup of the signalling connection but prior to the parties initiating the actual information transfer, e.g. speech. If the call is placed in a mobile communications system and the user of the mobile station moves to the service area of another base station and/or transcoder during the call, the negotiations must naturally be re-conducted during the call. Even in that case, the technique for replacing bad speech frames which is commonly used in mobile communications networks will mask the message, so that the effect on speech quality is practically non-existent.
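The transmitter side of the above arrangement may be illustrated by the following minimal sketch. It assumes a byte-oriented frame with a 33-byte payload (260 bits rounded up to whole bytes) and a CRC-32 check portion, neither of which is specified in the text, and it marks the frame as bad by inverting the correct check value; the names `speech_frame`, `message_frame` and `passes_check` are hypothetical.

```python
import zlib

PAYLOAD_LEN = 33  # assumption: 260 speech bits rounded up to whole bytes
CHECK_LEN = 4     # assumption: 4-byte CRC-32 check portion


def speech_frame(payload: bytes) -> bytes:
    """Build a normal (good) frame: payload followed by its correct CRC."""
    return payload + zlib.crc32(payload).to_bytes(CHECK_LEN, "big")


def message_frame(pattern: bytes) -> bytes:
    """Build a 'stolen' frame carrying `pattern`, deliberately marked as bad."""
    if len(pattern) > PAYLOAD_LEN:
        raise ValueError("message pattern does not fit in one frame")
    payload = pattern.ljust(PAYLOAD_LEN, b"\x00")
    # Inverting every bit of the correct CRC guarantees that the check fails.
    faulty_crc = zlib.crc32(payload) ^ 0xFFFFFFFF
    return payload + faulty_crc.to_bytes(CHECK_LEN, "big")


def passes_check(frame: bytes) -> bool:
    """Return True if the frame's check portion matches its payload."""
    payload, received = frame[:-CHECK_LEN], frame[-CHECK_LEN:]
    return zlib.crc32(payload).to_bytes(CHECK_LEN, "big") == received


normal = speech_frame(bytes(PAYLOAD_LEN))
stolen = message_frame(b"NEW-CODEC?")  # illustrative message pattern
```

A receiver without the message feature sees `stolen` merely as one more bad frame and conceals it in the usual way, while all other frames remain available for speech.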
Occasional messages, for example for selecting a speech codec and for controlling an echo canceller or a noise canceller, will be very short. In such a case, the bandwidth employed for speech is not appreciably reduced, even though redundancy is added to the messages to correct transmission errors. Redundancy may be added for example by utilizing channel coding which is also otherwise effected in the system and which in the case of the exemplary GSM system is implemented with convolutional coding. Other ways of adding redundancy are disclosed in connection with the preferred embodiments of the invention.
The advantages of the signalling method of the present invention include first of all the fact that it allows new properties to be added to existing telecommunication systems. A system may comprise both "new" equipment (in which signalling in accordance with the invention is implemented) and "old" equipment (which does not comprise the technique of the invention). When a new device communicates with another new device, messages in accordance with the invention are transmitted between the transmitter and the receiver without disturbing the speech connection. When a new device communicates with an old device, the messages transmitted by the new device are not received, but the speech connection is not disturbed either. A receiver employing the method of the invention can detect a message coded into the speech frame and interpret it without the speech connection being essentially disturbed; no further information is required for detecting the message. No special speech frame corresponding to the "unique signature" of the above-mentioned U.S. Patent is thus required in the present invention to indicate whether information on the channel is to be interpreted as speech or as a message. A receiver in which the message transmission of the invention is not implemented cannot detect a message coded among speech coding bits, but the existence of the message does not essentially disturb the speech connection. The messages are identified simply in such a way that the receiver detects a bad speech frame and examines whether it contains a bit pattern deviating from a predetermined bit pattern corresponding to the message in a few bits at most. Since the messages in accordance with the invention are transferred in a channel assigned for user traffic, the messages may have freely selected bit patterns.
Hence, there is no risk that a standardizing body in the field will reserve a bit pattern corresponding to a message in accordance with the invention for a specific use.
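The identification of messages at the receiver, as described above, may be sketched as follows. The sketch assumes that the good/bad decision has already been made, that each message corresponds to a predetermined bit pattern at the start of the payload, and that a deviation of at most two bits is tolerated; the threshold, the pattern value and the names `hamming_distance` and `detect_message` are all hypothetical.

```python
from typing import Dict, Optional

MAX_BIT_ERRORS = 2  # assumption: "a few bits at most" taken as two


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def detect_message(
    payload: bytes, frame_is_bad: bool, patterns: Dict[str, bytes]
) -> Optional[str]:
    """Return the name of the message carried by a bad frame, if any."""
    if not frame_is_bad:
        return None  # good frames are always treated as speech
    for name, pattern in patterns.items():
        if len(pattern) > len(payload):
            continue
        candidate = payload[: len(pattern)]
        if hamming_distance(candidate, pattern) <= MAX_BIT_ERRORS:
            return name
    return None  # an ordinary corrupted speech frame


# Illustrative pattern table; the bit pattern itself is freely chosen.
PATTERNS = {"USE_NEW_CODEC": bytes([0xA5, 0x5A, 0xC3, 0x3C])}

# The pattern received with one bit in error, followed by padding.
noisy = bytes([0xA5 ^ 0x08, 0x5A, 0xC3, 0x3C]) + bytes(4)
```

With this table, `detect_message(noisy, True, PATTERNS)` yields the message name despite the single bit error, whereas a bad frame whose payload resembles no pattern is left to the ordinary bad-frame replacement.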
Since the entire channel is normally available for speech transmission except for the moment of sending the message, the technique in accordance with the invention does not reduce the capacity of the speech channel. In theory, the technique of the invention slightly reduces speech quality at the time of sending the message, but experience has shown that the listener is not capable of detecting the loss of one speech frame if the bad or missing speech frame is replaced with a preceding good speech frame. On account of the advantageous selection of redundancy and bit patterns corresponding to the messages, the technique in accordance with the invention is robust against transmission disturbances.