1. Field
The present disclosure generally relates to data transmission over a speech channel. More specifically, the disclosure relates to transmitting non-speech information through a speech codec (in-band) in a communication network.
2. Description of Related Art
Transmission of speech has been a mainstay in communications systems since the advent of the fixed line telephone and wireless radio. Advances in communications systems research and design have moved the industry toward digital-based systems. One benefit of a digital communication system is the ability to reduce required transmission bandwidth by implementing compression on the data to be transferred. As a result, much research and development has gone into compression techniques, especially in the area of speech coding. A common speech compression apparatus is a “vocoder,” which is also interchangeably referred to as a “speech codec” or “speech coder.” The vocoder receives digitized speech samples and produces collections of data bits known as “speech packets.” Several standardized vocoding algorithms exist in support of the different digital communication systems which require speech communication, and in fact speech support is a minimum and essential requirement in most communication systems today. The 3rd Generation Partnership Project 2 (3GPP2) is an example standardization organization which specifies the IS-95, CDMA2000 1xRTT (1x Radio Transmission Technology), CDMA2000 EV-DO (Evolution-Data Optimized), and CDMA2000 EV-DV (Evolution-Data/Voice) communication systems. The 3rd Generation Partnership Project (3GPP) is another example standardization organization which specifies the GSM (Global System for Mobile Communications), UMTS (Universal Mobile Telecommunications System), HSDPA (High-Speed Downlink Packet Access), HSUPA (High-Speed Uplink Packet Access), HSPA+ (High-Speed Packet Access Evolution), and LTE (Long Term Evolution) communication systems. VoIP (Voice over Internet Protocol) is an example protocol used in the communication systems defined in 3GPP and 3GPP2, as well as others.
Examples of vocoders employed in such communication systems and protocols include ITU-T G.729 (International Telecommunication Union), AMR (Adaptive Multi-Rate Speech Codec), and EVRC (Enhanced Variable Rate Codec Speech Service Options 3, 68, 70).
Information sharing is a primary goal of today's communication systems in support of the demand for instant and ubiquitous connectivity. Users of today's communication systems transfer speech, video, text messages, and other data to stay connected. New applications being developed tend to outpace the evolution of the networks and may require upgrades to the communication system modulation schemes and protocols. In some remote geographical areas, only speech services may be available due to a lack of infrastructure support for advanced data services in the system. Alternatively, users may choose to enable only speech services on their communications device for economic reasons. In some countries, public services support is mandated in the communication network, such as Emergency 911 (E911) or in-vehicle emergency call (eCall). In these emergency application examples, fast data transfer is a priority but is not always realistic, especially when advanced data services are not available at the user terminal. Previous techniques have provided solutions to transmit data through a speech codec, but these solutions are only able to support low data rate transfers due to the coding inefficiencies incurred when trying to encode a non-speech signal with a vocoder.
The speech compression algorithms implemented by most vocoders utilize “analysis by synthesis” techniques to model the human vocal tract with sets of parameters. The sets of parameters commonly include functions of digital filter coefficients, gains, and stored signals known as codebooks, to name a few. A search for the parameters which most closely match the input speech signal characteristics is performed at the vocoder's encoder. The parameters are then used at the vocoder's decoder to synthesize an estimate of the input speech. The parameter sets available to the vocoder to encode the signals are tuned to best model speech, which is characterized by voiced periodic segments as well as unvoiced segments that have noise-like characteristics. Signals which do not contain periodic or noise-like characteristics are not effectively encoded by the vocoder and may result in severe distortion at the decoded output in some cases. Examples of signals which do not exhibit speech characteristics include rapidly changing single frequency “tone” signals and dual-tone multi-frequency (“DTMF”) signals. Most vocoders are unable to efficiently and effectively encode such signals.
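By way of illustration only, the encoder-side parameter search described above may be sketched as follows. The sketch is a deliberately simplified assumption and not the algorithm of any standardized vocoder: the names `search_codebook`, `decode`, and `CODEBOOK` are hypothetical, the codebook holds only three short stored signals, and the synthesis step is a plain gain scaling rather than a vocal tract filter.

```python
# Simplified analysis-by-synthesis search (illustrative assumption only).
# Each codebook entry is a stored signal; the encoder tries every entry,
# synthesizes a candidate frame, and keeps the parameters with least error.

CODEBOOK = [
    [1.0, 0.0, -1.0, 0.0],
    [1.0, 1.0, -1.0, -1.0],
    [1.0, -1.0, 1.0, -1.0],
]


def search_codebook(target, codebook):
    """Return (index, gain, error) for the entry that best matches target."""
    best = None
    for index, entry in enumerate(codebook):
        energy = sum(c * c for c in entry)
        if energy == 0.0:
            continue  # an all-zero entry cannot model anything
        # Least-squares optimal gain for this entry.
        gain = sum(t * c for t, c in zip(target, entry)) / energy
        # Synthesize the candidate and measure its squared error.
        synthesized = [gain * c for c in entry]
        error = sum((t - s) ** 2 for t, s in zip(target, synthesized))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


def decode(index, gain, codebook):
    """Decoder side: rebuild the frame estimate from the two parameters."""
    return [gain * c for c in codebook[index]]


frame = [2.0, 2.0, -2.0, -2.0]
index, gain, error = search_codebook(frame, CODEBOOK)
estimate = decode(index, gain, CODEBOOK)
```

In this sketch only the index and the gain would be transmitted. A frame that resembles a stored entry is reproduced closely, while a frame unlike every entry leaves a large residual error, which mirrors the distortion described above for signals without speech characteristics.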
Transmitting data through a speech codec is commonly referred to as transmitting data “in-band,” wherein the data is incorporated into one or more speech packets output from the speech codec. Several techniques use audio tones at predetermined frequencies within the speech frequency band to represent the data. Using predetermined frequency tones to transfer data through speech codecs, especially at higher data rates, is unreliable due to the vocoders employed in the systems. The vocoders are designed to model speech signals using a limited number of parameters. The limited parameters are insufficient to effectively model the tone signals. The ability of the vocoders to model the tones is further degraded when attempting to increase the transmission data rate by changing the tones quickly. This affects the detection accuracy and results in the need to add complex schemes to minimize the data errors, which in turn further reduces the overall data rate of the communication system. Therefore, a need arises to efficiently and effectively transmit data through a speech codec in a communication network.
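By way of illustration only, a tone-based scheme of the kind described above may be sketched as follows. All specifics are assumptions chosen for the example and are not taken from any particular prior technique: an 8 kHz sample rate, 10 ms symbols, a 500 Hz tone for a 0 bit and a 2000 Hz tone for a 1 bit, and the hypothetical names `modulate`, `goertzel_power`, and `demodulate`. The sketch uses an ideal channel with no vocoder in the path, so it shows only the baseline that vocoder distortion would degrade.

```python
import math

SAMPLE_RATE = 8000        # samples per second (assumed)
SYMBOL_SAMPLES = 80       # 10 ms per bit (assumed)
FREQ_FOR_BIT = {0: 500.0, 1: 2000.0}  # hypothetical tone assignment in Hz


def modulate(bits):
    """Map each bit to a burst of its predetermined tone."""
    samples = []
    for bit in bits:
        freq = FREQ_FOR_BIT[bit]
        for n in range(SYMBOL_SAMPLES):
            samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples


def goertzel_power(block, freq):
    """Signal power near freq in block, using the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s1 = s2 = 0.0
    for x in block:
        s0 = x + coeff * s1 - s2
        s2 = s1
        s1 = s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def demodulate(samples):
    """Decide each bit by comparing the power at the two tone frequencies."""
    bits = []
    for start in range(0, len(samples) - SYMBOL_SAMPLES + 1, SYMBOL_SAMPLES):
        block = samples[start:start + SYMBOL_SAMPLES]
        p0 = goertzel_power(block, FREQ_FOR_BIT[0])
        p1 = goertzel_power(block, FREQ_FOR_BIT[1])
        bits.append(1 if p1 > p0 else 0)
    return bits


sent = [1, 0, 1, 1, 0]
received = demodulate(modulate(sent))
```

Raising the data rate in such a scheme means shortening `SYMBOL_SAMPLES`, which both changes the tones more quickly and gives the detector fewer samples per decision; this is the trade-off between data rate and detection accuracy noted above.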
Accordingly, it would be advantageous to provide an improved system for transmitting and receiving information through a speech codec in a communications network.