1. Field of the Invention
The present invention relates to the field of communications, and more particularly to a method and system for Analog Simultaneous transmission of Voice and Data (ASVD) over an analog channel such as, for example, a telephone line.
In the present specification and in the appended claims, the term "voice" is intended to include speech as well as other sound signals; the ASVD method and system according to the invention are not restricted to voice but can be used for music and other sound signals.
2. Brief Description of the Prior Art
Data in digital format are sent over analog channels using modem devices which perform the necessary modulation and demodulation operations. The International Telecommunication Union--Telecommunication sector (ITU-T) is an international standardization body which makes recommendations relating to modem specifications. Recommendation V.34 of the ITU-T describes the specifications of a modem operating at data signalling rates of up to 33 600 bits/second for use on the general switched telephone network and on leased point-to-point 2-wire telephone-type circuits. Also, recommendation G.729A of the ITU-T describes the specifications of an 8 kbits/second voice encoder using Conjugate-Structure Algebraic-Code Excited Linear Prediction (CS-ACELP). Recommendation G.729A is also known as the DSVD standard since it is the ITU-T recommendation for Digital Simultaneous transmission of Voice and Data.
An alternative to the DSVD standard is the so-called ASVD (Analog Simultaneous transmission of Voice and Data) proposal. This alternative method has been considered by the ITU-T for possible standardization with the V.34 modem recommendation under the proposed recommendation number V.34Q.
The ASVD methods proposed in the prior art are based on the well-known speech transmission approach called Residual-Excited Linear Prediction (RELP). According to this approach, the transmitted voice is obtained by filtering a so-called residual signal through a cascade of two time-varying filters called the pitch synthesis filter and the LP (Linear Prediction) synthesis filter. In ASVD, as in RELP, the coefficients of both the pitch and LP synthesis filters are digitally encoded and updated on a regular basis. The sampled residual signal is also digitally encoded in RELP; this is in contrast to ASVD wherein the residual-signal samples are added, so to speak, to the modulation. More precisely, the first and second samples of each successive pair of residual samples are added to the In-phase and In-quadrature components, respectively, of the associated modulation scheme.
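The pairing of residual samples with the In-phase and In-quadrature components can be illustrated by the following minimal sketch. It is not taken from any ASVD proposal: the function name `embed_residual`, the parameter `max_amp`, and the peak-based scaling rule are assumptions made for illustration only, and the modem symbols are simply represented as Python complex numbers (I + jQ).

```python
def embed_residual(symbols, residual, max_amp):
    """Add one pair of residual samples to each modem symbol.

    symbols  : list of complex constellation points (I + jQ)
    residual : list of real residual samples, two per symbol
    max_amp  : hypothetical bound on the added amplitude per component
    Returns the perturbed symbols and the gain that was applied.
    """
    if len(residual) != 2 * len(symbols):
        raise ValueError("need exactly two residual samples per symbol")

    # Assumed scaling rule: bring the largest residual sample to max_amp.
    peak = max((abs(r) for r in residual), default=0.0)
    gain = max_amp / peak if peak > 0 else 0.0

    out = []
    for k, s in enumerate(symbols):
        first = residual[2 * k]       # goes onto the In-phase component
        second = residual[2 * k + 1]  # goes onto the In-quadrature component
        out.append(s + complex(gain * first, gain * second))
    return out, gain


# Usage: two symbols carry four residual samples.
tx, g = embed_residual([1 + 1j, -1 - 1j], [0.5, -1.0, 0.25, 0.0], 0.2)
```

Because each symbol carries two residual samples, the residual is conveyed at twice the symbol rate in this sketch.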
According to this method, residual samples can be transmitted at twice the modem Baud rate. Note that these residual samples can be viewed as artificial channel noise. This observation entails the following two facts:
First, residual samples must be scaled so as to be confined to a safe amplitude range in order not to interfere with the proper modem operation; and
Secondly, if the channel condition is very good, the scaled but unquantized residual samples will be received with very little degradation due to the added true channel noise.
Some of the major shortcomings of the ASVD prior art methods are the following:
1. Limitation to point-to-point modem connections and inability to support other transport mechanisms such as the DSVD standard (G.729A);
2. Inability to operate at bit rates of 8 kbits/second and below resulting in very low data throughput;
3. Lack of a convenient method to provide voice security through data encryption; and
4. Reduced speech and audio bandwidth since the audio sampling rate is restricted to be twice the modem Baud rate. For instance, in the V.34 modem case, the Baud rate ranges from 2400 to 3429. It follows that, in the worst case, the speech and audio signal is sampled at 4800 samples/second; band limiting the signal below 2400 Hz is then required to prevent aliasing. This approach results in poor transmission quality, especially for fricatives and music signals.
the fractional sampling-rate conversion comprises shifting the Baud rate by a factor N/M, where N and M are integers that verify, or closely verify, the following equality: ##EQU2## where F is the voice signal sampling rate and R is the Baud rate;
the fractional sampling-rate conversion is a fractional up-sampling-rate conversion comprising expanding the sequence of residual vectors by inserting N-1 zeros between each pair of successive samples of the sequence of residual vectors, and low-pass filtering the expanded sequence with a normalized cut-off frequency of π/M radians to produce a low-pass filtered signal decimated by the factor M;
the sampled voice signal and the sequence of residual vectors are formed of samples grouped into successive frames having a fixed duration, the expanded sequence of residual vectors is low-pass filtered through a non-causal low-pass filter having a finite impulse response with 1 + 2×J×M non-zero coefficients, J being an integer, and J temporary zero samples of the forthcoming frame of the sequence of residual vectors are used to perform the non-causal low-pass filtering without causing perceptual distortion; and
the low-pass filtering step with a normalized cut-off frequency of π/M radians comprises introducing aliasing between π/N and π/M radians to fill in for a missing high-frequency band of the residual sampled voice signal with no perceptual consequence.
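The expand, low-pass filter, and decimate sequence mentioned in connection with the fractional sampling-rate conversion by N/M can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names, the Hamming-windowed sinc design, and the treatment of samples outside the available range as zeros are choices made here, not details taken from the specification. The filter is centred (non-causal), has a cut-off of π/M radians, and spans the indices -J×M to J×M.

```python
import math


def lowpass_taps(N, M, J):
    """Hamming-windowed sinc, cut-off pi/M, indices n = -J*M .. J*M.

    The (N / M) factor gives a DC gain close to N, which compensates
    for the amplitude lost when N-1 zeros are inserted per sample.
    """
    if J < 1:
        raise ValueError("J must be at least 1")
    half = J * M
    taps = []
    for n in range(-half, half + 1):
        x = n / M
        sinc = 1.0 if n == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.54 + 0.46 * math.cos(math.pi * n / half)
        taps.append((N / M) * sinc * window)
    return taps


def fractional_resample(samples, N, M, J):
    """Change the sampling rate of `samples` by the factor N/M."""
    taps = lowpass_taps(N, M, J)
    half = J * M

    # Expansion: insert N-1 zeros after each input sample.
    expanded = []
    for s in samples:
        expanded.append(s)
        expanded.extend([0.0] * (N - 1))

    # Centred filtering, evaluated only at every M-th position (decimation).
    # Samples outside the available range are treated as zeros (assumption).
    out = []
    for p in range(0, len(expanded), M):
        acc = 0.0
        for i, h in enumerate(taps):
            j = p + i - half
            if 0 <= j < len(expanded):
                acc += h * expanded[j]
        out.append(acc)
    return out


# Usage: 20 constant samples resampled by 3/2 give 30 output samples.
y = fractional_resample([1.0] * 20, 3, 2, 4)
```

Away from the edges, a constant input stays close to its original level, since the filter gain offsets the inserted zeros. The values N = 3, M = 2 and J = 4 are arbitrary and are not related to any particular Baud rate.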