1. Field of the Invention
The invention relates to digital communications networks such as the Integrated Services Digital Network (ISDN) and digital radio systems and, more particularly, to the alternate transmission of speech and data communications in such networks and systems.
2. Description of the Related Art
The related art described herein relates primarily to radio communication, because it is in that field that the majority of the work on the alternating transmission of speech and data has been done. However, the present invention is equally applicable to any terminals which have an associated coder or decoder that may be selectively bypassed under control of the system of the present invention.
Traditionally, both wireline and radio communication networks have been used for the transmission of speech information from one point in the network to another. Recent advances in computer and communications technologies indicate that the dominant use of both radio and wireline communication networks in the future may be for data communications, not voice. The recent proliferation of so-called "multimedia" applications and services frequently calls for the combined usage of both speech and data in a single user application. The implementation of such applications within the mobile radio network requires that subscribers be able to transmit both speech and data either simultaneously or alternately.
The increasing demand for such multimedia services within digital cellular radio systems requires fast and flexible transmission of both speech and data within the network. While many of these potential applications do not require fully simultaneous transmission of speech and data, they can function very efficiently if speech and data can alternate with one another quickly and flexibly within the system.
For example, a mobile user might desire a feature such as voice controlled automatic call routing. This service is implemented by having the subscriber phone a routing server connected within the fixed network and then instruct the server, by means of voice recognition, to reroute all incoming calls from the subscriber's regular number to a certain specified alternate number. This service requires the transmission of both data, for the control commands to the server, and voice parameters, for the spoken commands. Another possible application requiring the alternate transmission of speech and data is voice enabled e-mail, in which a user dials into a server which is connected within the fixed network and functions as an e-mail box. The user issues commands to the server via data transmissions to cause the server to display on the user's terminal a list of received e-mails, to enable the user to scroll up and down that list, and then to read a selected one of those e-mails by voice synthesis. In this application, digital control commands and speech vectors are sent alternately between the user's terminal and the server. Still other "multimedia" applications are the transfer of digital data files over the same connection while a user is talking to speech recognition software within the server, and the implementation of video conferencing over a single connection.
Many prior art references have contemplated the multiplexing of speech and data in a single communication channel. For example, in U.S. Pat. No. 4,813,040 entitled "Method and Apparatus for Transmitting Digital Data and Real-Time Digitized Voice Information Over Communication Channels" issued Mar. 14, 1989 to Futato, data are inserted into the silence periods of voice communications on a communication channel. Similarly, in PCT published application no. WO96/13916 entitled "Communications Method and Apparatus With Transmission of a Second Signal During Absence of a First One" the system transmits both a principal signal (voice) and a data signal. When the principal signal is present or contains information it is transmitted; however, when the principal signal is absent or does not contain a significant amount of information, data are transmitted through the channel. Neither of these systems contemplates solutions to the problems of alternate voice and data transmissions over a link including digital radio.
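The general idea shared by the two references above, sending data only while the voice signal is absent, can be illustrated with a minimal sketch. It is not taken from either reference: the function name `multiplex`, the slot labels, and the convention that `None` marks a silent frame are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

def multiplex(speech_frames: List[Optional[bytes]],
              data_queue: Deque[bytes]) -> List[Tuple[str, bytes]]:
    """Fill one channel slot per frame period.

    A speech frame of None stands for a silence period (an assumption of
    this sketch). Speech always has priority; queued data is sent only in
    slots where no speech is present.
    """
    channel: List[Tuple[str, bytes]] = []
    for frame in speech_frames:
        if frame is not None:
            channel.append(("speech", frame))              # voice present: send it
        elif data_queue:
            channel.append(("data", data_queue.popleft()))  # silence: send pending data
        else:
            channel.append(("idle", b""))                   # silence and nothing queued
    return channel

# Example: two silent slots carry the two queued data packets.
slots = multiplex([b"s1", None, None, b"s2", None], deque([b"d1", b"d2"]))
```

Because data moves only when speech pauses, its delay depends on the talker, which is one reason such schemes do not by themselves give the fast, flexible alternation discussed in this section.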
In digital radio systems such as TDMA digital cellular systems, digital speech content and digital data content are handled differently in the system. When the user speaks into a subscriber terminal of a digital radio system, the voice is encoded into speech parameters which are, in the full rate (FR) coding scheme of the Global System for Mobile communication (GSM), transmitted at 260 bits per 20 millisecond frame. This is a data rate of 13 K bits per second. When these encoded speech parameters reach the fixed network, they are conventionally converted by a speech decoder into digital speech samples at the normal rate of 8 K samples per second and transmitted at the rate of 64 K bits per second. In contrast, data are generally transmitted in the fixed network in accordance with somewhat different standards because of the inherently different characteristics of voice and data communications.
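The two rates quoted above follow from simple arithmetic, sketched below. The 8-bit sample width is an assumption of this sketch (it is the usual width for 64 K bits per second PCM telephony but is not stated in the text), and the constant and function names are illustrative only.

```python
# Figures quoted in the text.
FR_BITS_PER_FRAME = 260        # GSM full rate speech parameters per frame
FRAME_MILLISECONDS = 20        # frame duration
PCM_SAMPLES_PER_SECOND = 8000  # fixed-network sampling rate

# Assumption, not stated in the text: each PCM sample is one octet.
PCM_BITS_PER_SAMPLE = 8

def bits_per_second(bits_per_frame: int, frame_ms: int) -> int:
    """Convert a per-frame bit count into a bit rate (integer arithmetic)."""
    return bits_per_frame * 1000 // frame_ms

fr_rate = bits_per_second(FR_BITS_PER_FRAME, FRAME_MILLISECONDS)  # 13000
pcm_rate = PCM_SAMPLES_PER_SECOND * PCM_BITS_PER_SAMPLE           # 64000
```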
It is important for speech to undergo very few delays during transmission so that the other party receives it within a time frame which simulates normal conversation. The nature of digital speech is also such that errors in the digital representations of the speech are quite tolerable. Speech is redundant and the listener is also redundant so that communication is satisfactory and readily understandable even though a number of errors may occur in the transmission of the digital speech representations from one location to another. Data, on the other hand, is very intolerant of errors. Thus, it must be encoded with error correction coding and other techniques to ensure a high degree of accuracy in the transmission of the data from one point to another within a communication network. Delays in the transmission of data from one point to another, however, are very tolerable in the case of data circuits. It does not usually matter that the data is delayed or buffered at various points in the transmission circuitry while it is moving from one place to another within the network.
Because of these very different ways of handling speech and data in the communication network, it is infrequent that both can be efficiently transmitted in the same communication circuit. For example, in the multimedia facilities currently provided within the GSM cellular network, the speech portions of a circuit are handled by one set of infrastructure and the data portions of such circuits are handled by a different data path infrastructure. This results in a lack of synchronization between the two paths, which makes it difficult to implement services involving both. Thus, it is very difficult to combine, in a single application, the alternate transmission of speech and data between two separate nodes in the network, especially in a network which includes a digital radio link.
The current GSM standards provide a variety of different service and traffic channels such as the three speech traffic channels full rate (FR), half rate (HR) and enhanced full rate (EFR), as well as many different types of data traffic channels. While GSM recommendations exist which describe solutions for simultaneous or alternate transmission of speech and data within one call, the practical realization of multimedia services suffers in many respects, ranging from insufficient specifications to insufficient realizations and insufficient support of the proposals by network operators.
For example, the drawbacks of existing suggested solutions include the fact that existing data services are not suitable for speech transmission due to long delays, while current speech services are not transparent to data. In addition, "mode modification" between these two types of services is much too slow and cumbersome for practical implementation. Current solutions like USSD for GSM can carry slow speed data in parallel to a speech channel; however, unlike speech transmission, the USSD data is terminated in the fixed network and is not transparent. Furthermore, the delay for interactive data is greater than one second, which produces unacceptably slow responses to user commands.
Additionally, dual tone multi frequency (DTMF) commands are often used for user services such as voice mail boxes. While the latency is relatively low and these commands are sent transparently through the network, the data rate is slow and DTMF is normally only implemented for signaling from a mobile and not in the other direction toward the mobile. Furthermore, current connections between a digital mobile station and an internet protocol (IP) phone use either a data connection or a gateway which converts speech to UDP/IP. The data connection is not optimized for speech in the way that the coded speech channel of the radio link is, and the delay is even longer due to more interleaving. The use of a gateway requires powerful computing in order to handle speech coding for several concurrent connections. Thus, in summary, a solution for achieving transparent speech as well as low latency data end-to-end for digital cellular phones, IP phones, service nodes, and the like has not been found within conventional prior art techniques.
An innovation in the specifications of the GSM cellular network which was recently promulgated, and which will shortly be adopted by the European Telecommunications Standards Institute (ETSI), is that set forth in section GSM 04.53, draft version 0.1.3, entitled "Inband Tandem Free Operation (TFO) of Speech Codecs." This innovation relates to an effort to improve the quality of speech communication between two subscribers in the case of a mobile terminal to mobile terminal call within the GSM network. As mentioned above, the conventional way of handling a speech call within a digital radio network, such as the GSM network, is to initially encode the speaker's voice at the mobile terminal into digital speech parameters representing certain characteristics of the output of the microphone in the terminal. For example, some parameters describe the spectral envelope of the speech signal, other parameters describe the volume and still others characterize the fine structure of the speech material. These encoded speech parameters are then transmitted at 13 K bits per second via the radio interface to the fixed network where they are decoded into digital signals representing a voice signal sampled at the standard rate of 8 K samples per second. This signal is then transmitted through the fixed network to the terminating end of the conversation, which in the case of a mobile-to-mobile call, is another radio base station. Here the signal is again encoded from speech samples into speech parameters and transmitted over the air interface at 13 K bits per second. At the subscriber's mobile terminal the speech parameters are again decoded into an electrical representation of a voice signal for the loudspeaker in the terminal.
It is well known that each of these encoding and decoding operations is lossy in nature; that is, each time the signal is encoded and decoded a certain amount of error creeps into the signal, resulting in a degradation of the voice signal from that which was originally spoken into the microphone. The purpose of the TFO scheme is to eliminate unnecessary encodings and decodings of the voice signals in the case of a mobile-to-mobile call. That is, with TFO functionality enabled, the encoded voice parameters transmitted over the air interface from the originating mobile station at 13 K bits per second are not decoded when they are received at the fixed network. Rather, they are transmitted transparently through the fixed network as 13 K bits per second speech parameters and from there back out over the air interface to the receiving mobile terminal. There the speech parameters are decoded into speech signals and applied to the loudspeaker of the receiver's terminal. This eliminates one complete cycle of encoding and decoding while the signal passes through the fixed network and results in a considerably higher quality signal at the other end.
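The saving described above can be made concrete by counting codec operations along the speech path. The sketch below is only a bookkeeping model: the stage labels are hypothetical names chosen for illustration and do not model any actual codec.

```python
from typing import List

# Hypothetical stage labels for a mobile-to-mobile call.
CONVENTIONAL_PATH = [
    "encode at originating mobile",
    "decode at fixed network ingress",
    "encode at fixed network egress",
    "decode at receiving mobile",
]
TFO_PATH = [
    "encode at originating mobile",
    "pass through fixed network ingress",
    "pass through fixed network egress",
    "decode at receiving mobile",
]

def codec_cycles(path: List[str]) -> int:
    """Count complete encode/decode cycles along a speech path."""
    encodes = sum(1 for stage in path if stage.startswith("encode"))
    decodes = sum(1 for stage in path if stage.startswith("decode"))
    return min(encodes, decodes)

saved = codec_cycles(CONVENTIONAL_PATH) - codec_cycles(TFO_PATH)  # one cycle saved
```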
Referring to FIG. 1, there is shown a block diagram of the prior art implementation of TFO functionality within the GSM network. FIG. 1 depicts the functional entities for handling tandem free operation in a mobile station to mobile station call. A first mobile switching center (MSC1) is connected to communicate with a first base station controller (BSC1), which is in turn connected to a base transceiver station (BTS1), which is in turn connected via radio to a radio terminal (MS1). A tandem free operation-transcoder and rate adapter unit (TFO-TRAU1), which is physically part of either BTS1, BSC1 or MSC1, but is here shown separately, is interposed within both the uplink (UL) and downlink (DL) to BTS1. In the uplink, a decoder 11 is connected in parallel with a TFO transmitter (TFO-TX) 12 and their output signals are added at 13. On the downlink, an encoder 14 and a TFO receiver (TFO-RX) 15 have their outputs alternatively and selectively connectable through a switch 16.
Similarly, MSC2 is connected to BSC2, which is in turn connected to BTS2, which is in turn connected via radio to a radio terminal MS2. The second TFO-TRAU2, which is physically part of either BTS2, BSC2 or MSC2, but is here shown separately, includes in the downlink an encoder 21 connected in parallel with TFO-RX 22, the outputs of which are selectively and alternatively connectable through a switch 23. In the uplink, a (speech) decoder 24 and TFO-TX 25 have their outputs connected through a replacement unit 26, symbolized by a "+" sign. In TFO operation this unit replaces the one or two least significant bits (LSBs) of the PCM octets (the digital representation of each speech sample) with one or two bits of the TFO frame (HR or FR cases, respectively). The TRAUs are controlled by the BTSs, and the speech/data information and TRAU control signals are exchanged between the channel codec unit (CCU) in the BTS and the TRAU and are transferred in frames denoted "TRAU Frames." In tandem free operation similar frames are transported on the A interface between the TRAUs and denoted "TFO speech frames." In addition to these frames, signaling information is also transferred on the A interface using "TFO negotiation messages" which are mapped to the least significant bit of the PCM octets. As illustrated in the reference model of FIG. 1, when TFO operation is enabled, a transparent digital link is provided through the wire-bound network, in both directions, from the input of the speech decoder of one mobile to the output of the speech encoder of a second mobile. On the radio interface, the 260 bits of each 20 millisecond frame of the GSM full rate speech traffic channel are forward error encoded using an unequal error protection scheme and transmitted in packets of 456 bits within a 20 millisecond frame.
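The replacement of PCM least significant bits by TFO frame bits can be sketched as follows. This is a minimal illustration only: the function names and the bit ordering within each octet are assumptions of the sketch, and the actual TFO frame structure and mapping are those defined in GSM 04.53, which is not reproduced here.

```python
from typing import List

def embed_tfo_bits(pcm_octets: List[int], tfo_bits: List[int],
                   bits_per_octet: int) -> List[int]:
    """Overwrite the 1 (HR) or 2 (FR) LSBs of each PCM octet with TFO bits.

    Assumed ordering: within an octet, the earlier TFO bit is placed in the
    more significant of the replaced positions.
    """
    if bits_per_octet not in (1, 2):
        raise ValueError("bits_per_octet must be 1 (HR) or 2 (FR)")
    if len(tfo_bits) != len(pcm_octets) * bits_per_octet:
        raise ValueError("TFO bit count does not match the number of octets")
    mask = (1 << bits_per_octet) - 1
    out = []
    for i, octet in enumerate(pcm_octets):
        value = 0
        for bit in tfo_bits[i * bits_per_octet:(i + 1) * bits_per_octet]:
            value = (value << 1) | (bit & 1)
        out.append((octet & 0xFF & ~mask) | value)  # keep upper bits, replace LSBs
    return out

def extract_tfo_bits(pcm_octets: List[int], bits_per_octet: int) -> List[int]:
    """Recover the embedded bits, using the same assumed ordering."""
    bits = []
    for octet in pcm_octets:
        for shift in range(bits_per_octet - 1, -1, -1):
            bits.append((octet >> shift) & 1)
    return bits

# At 8 K octets per second, a 20 millisecond frame spans 160 octets, so the
# embedded capacity is 160 bits (HR, 1 LSB) or 320 bits (FR, 2 LSBs) per frame.
OCTETS_PER_FRAME = 8000 * 20 // 1000
```

Only the lowest one or two bits of each sample are disturbed, so equipment that is unaware of TFO still receives a usable, slightly noisier, PCM speech signal in the remaining bits.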
TFO-like schemes are also being proposed and implemented in other digital systems such as the U.S. D-AMPS standard pursuant to IS-54 and IS-136 and the Japanese digital standard (JDC). It would be desirable if TFO functionality in each of these digital systems could be used as part of a technique to alternately transmit speech and data within a digital radio network.