The present invention relates to DTMF code detection, and more particularly, to methods and apparatuses for reducing the false detection probability on speech signals.
Dual Tone Multi-Frequency (DTMF) signaling is a standardized technique for communicating information in telecommunications systems. The DTMF code signaling method uses sixteen different codes. A DTMF code is a pulse that is the sum of two sine waves, one of which has a frequency that is selected from a low band frequency group, and the second of which has a frequency that is selected from a high band frequency group. In practice, the low band frequency group comprises the frequencies 697, 770, 852 and 941 Hz, while the high band frequency group comprises the frequencies 1209, 1336, 1477 and 1633 Hz. The pulse is either preceded or followed by a pause such that the duration of both pulse and pause exceeds specified limits.
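By way of illustration only, the two-tone structure of a DTMF pulse described above can be sketched in a few lines of Python. The frequency groups are those given in the text. The key-to-frequency layout is the conventional telephone keypad arrangement, which the text does not spell out. The function names, the 8 kHz sample rate, the 50 ms duration and the amplitude are assumptions chosen for the example.

```python
import math

# Nominal DTMF frequencies in Hz, as listed in the text.
LOW_BAND_HZ = (697, 770, 852, 941)
HIGH_BAND_HZ = (1209, 1336, 1477, 1633)

# Conventional keypad layout (not given in the text): each row selects
# the low band tone, each column selects the high band tone.
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key):
    """Return the (low, high) frequency pair in Hz for one of the 16 codes."""
    if len(key) != 1:
        raise ValueError(f"expected a single character, got {key!r}")
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col >= 0:
            return LOW_BAND_HZ[row], HIGH_BAND_HZ[col]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_pulse(key, duration_s=0.05, sample_rate_hz=8000, amplitude=0.5):
    """Return the samples of a pulse formed as the sum of two sine waves."""
    f_low, f_high = dtmf_frequencies(key)
    n_samples = int(round(duration_s * sample_rate_hz))
    return [
        amplitude * (math.sin(2 * math.pi * f_low * n / sample_rate_hz)
                     + math.sin(2 * math.pi * f_high * n / sample_rate_hz))
        for n in range(n_samples)
    ]
```

For example, `dtmf_frequencies("5")` selects 770 Hz from the low band group and 1336 Hz from the high band group.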
A DTMF detector is a component that receives a signal, and determines whether that signal is a DTMF signal. Such a device is necessary because DTMF signals are usually transmitted on communications lines that transmit other types of signals as well, such as voice signals. A well-known problem in the art of DTMF detection is providing high speech immunity, that is, keeping the probability that a speech signal will falsely be identified as a DTMF signal low.
A known method of detecting DTMF codes based on linear predictive analysis is described in "Application Note: Linear Prediction Based DTMF Detection for the WE DSP32 Digital Signal Processor Family" published by AT&T. According to this prior art method, the DTMF pulse is assumed to have been produced by the signal model shown in FIG. 1(a). White noise from two independent random sources, $\eta_{1n}$ and $\eta_{2n}$, is used to excite respective first and second filters 101, 103. The first filter 101 operates in accordance with ##EQU1## Similarly, the second filter 103 operates in accordance with ##EQU2## Each of these filters has a pair of complex conjugate poles on the unit circle of the z-plane, with angles corresponding to the low and high tone frequencies, respectively. The outputs from the first and second filters 101, 103 are summed, in combining means 105, to generate the DTMF signal. Because each of the first and second filters 101, 103 is an all-pole filter of second order, each output signal is an Auto Regressive process of order 2 (henceforth, "an AR(2) process"). See Johansson and Forsen, "Modellorienterad Signalbehandling", Institutionen for Teletransmissionsteori, KTH, September 1985.
In the theory of AR-processes, the filter coefficients of A(z) and B(z) can be estimated on the basis of autocorrelation function (acf) values of the respective low band and high band signals using the Yule-Walker normal equations. Filters are used to separate the received DTMF signal into low band and high band signals. The low and high band signals are each sampled at 4 kHz. Each of the low band and high band signals is then separately analyzed using the formulas set forth as follows:
In matrix notation, the normal equation is:
$$R\,a = r \qquad (1)$$
where R is a 2×2 matrix of acf-values:
$$R = \begin{bmatrix} r_y(1,1) & r_y(1,2) \\ r_y(2,1) & r_y(2,2) \end{bmatrix} \qquad (2)$$
where $r_y(l,m) = E[s_{n-l}\,s_{n-m}]$ and $E[\ldots]$ denotes the expectation value. In equation (2), $r_y(2,1) = r_y(1,2)$, so R is symmetric.
In equation (1), the variable a is a 2-element column vector of the AR-coefficients. For the first filter 101, the vector for A(z) is
$$a = [a_1,\ a_2]^T \qquad (3)$$
The variable r in equation (1) is a 2-element column vector of acf-values:
$$r = [r_y(0,1),\ r_y(0,2)]^T \qquad (4)$$
The acf-values may be estimated by a number of techniques. One possibility is to use the recursion formulas:
$$R = \lambda R + y_{n-1}\,y_{n-1}^T \qquad (5)$$
$$r = \lambda r + s_n\,y_{n-1} \qquad (6)$$
where $s_n$ is the present sample, $\lambda$ is the forgetting factor (a number slightly less than one), and $y_n = [s_n,\ s_{n-1}]^T$. For A(z), $s_n$ is the present sample in the low band signal.
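The recursions of equations (5) and (6) can be sketched as follows. This is a minimal illustration rather than the implementation of the cited application note. The function names, the default forgetting factor of 0.99 and the all-zero initial state are assumptions made for the example.

```python
def update_acf(R, r, s_n, y_prev, lam=0.99):
    """Apply one step of recursions (5) and (6).

    R      -- 2x2 list of lists holding the acf matrix estimate
    r      -- 2-element list holding the acf vector estimate
    s_n    -- present sample
    y_prev -- [s_{n-1}, s_{n-2}], i.e. the vector y_{n-1}
    lam    -- forgetting factor, slightly less than one (assumed default)
    Returns the updated (R, r, y_n).
    """
    # Equation (5): R = lam * R + y_{n-1} y_{n-1}^T
    R_new = [[lam * R[i][j] + y_prev[i] * y_prev[j] for j in range(2)]
             for i in range(2)]
    # Equation (6): r = lam * r + s_n * y_{n-1}
    r_new = [lam * r[i] + s_n * y_prev[i] for i in range(2)]
    # y_n = [s_n, s_{n-1}] becomes y_{n-1} for the next sample.
    y_n = [s_n, y_prev[0]]
    return R_new, r_new, y_n


def estimate_acf(samples, lam=0.99):
    """Run the recursions over a block of samples, starting from zero state."""
    R = [[0.0, 0.0], [0.0, 0.0]]
    r = [0.0, 0.0]
    y_prev = [0.0, 0.0]
    for s_n in samples:
        R, r, y_prev = update_acf(R, r, s_n, y_prev, lam)
    return R, r
```

With this indexing, the entries of `R` accumulate the products corresponding to $r_y(1,1)$, $r_y(1,2)$ and $r_y(2,2)$, and the entries of `r` accumulate those corresponding to $r_y(0,1)$ and $r_y(0,2)$.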
The roots of the filter $A(z) = 1 - a_1 z^{-1} - a_2 z^{-2}$ are given by:
$$z = \frac{a_1}{2} \pm \sqrt{\frac{a_1^2}{4} + a_2} \qquad (7)$$
For these roots to form a complex conjugate pair corresponding to a sinusoid of non-zero frequency, $a_2$ must be negative with magnitude greater than $a_1^2/4$. Therefore, equation (7) can be expressed as:
$$z = \frac{a_1}{2} \pm j\sqrt{-a_2 - \frac{a_1^2}{4}} \qquad (8)$$
The magnitude of z is:
$$|z| = \sqrt{-a_2} \qquad (9)$$
and the angle in the upper half of the z-plane is
$$\theta = \arctan\left(\frac{\sqrt{-4a_2 - a_1^2}}{a_1}\right) \qquad (10)$$
FIG. 1(b) illustrates the pole locations (denoted by "x") of the filter 1/A(z) in the z-plane.
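As a hedged illustration of the pole magnitude and pole angle just described, the following sketch computes both directly from the AR-coefficients. The function names are hypothetical. The use of `math.atan2`, which places the angle in the correct quadrant when $a_1$ is negative, is a choice made for the example rather than something stated in the text. The 4 kHz sample rate is the one stated for the band signals.

```python
import math


def pole_from_ar2(a1, a2):
    """Return (magnitude, angle_rad) of the upper-half-plane root of
    A(z) = 1 - a1*z^-1 - a2*z^-2, or None if the roots are real."""
    # Complex conjugate roots require a2 < -a1^2 / 4.
    if a2 >= -a1 * a1 / 4.0:
        return None
    magnitude = math.sqrt(-a2)
    # Real part is a1/2 and imaginary part is sqrt(-a2 - a1^2/4);
    # both are scaled by 2 here, which leaves the angle unchanged.
    angle = math.atan2(math.sqrt(-4.0 * a2 - a1 * a1), a1)
    return magnitude, angle


def angle_to_hz(angle_rad, sample_rate_hz=4000.0):
    """Convert a pole angle in radians to a frequency in Hz."""
    return angle_rad * sample_rate_hz / (2.0 * math.pi)
```

For an undamped sinusoid at angle $\theta$, the coefficients are $a_1 = 2\cos\theta$ and $a_2 = -1$, so the sketch returns a magnitude of one and the angle $\theta$.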
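Solving the normal equations for the vector a of equation (3), it is found that
$$a_1 = \frac{\bar{a}_1}{\det R} \qquad (11)$$
and
$$a_2 = \frac{\bar{a}_2}{\det R} \qquad (12)$$
where
$$\det R = r_y(1,1)\,r_y(2,2) - r_y(1,2)^2 \qquad (13)$$
is the determinant of R, and the numerators, written here as $\bar{a}_1$ and $\bar{a}_2$ to distinguish them from the normalized coefficients, are
$$\bar{a}_1 = r_y(0,1)\,r_y(2,2) - r_y(1,2)\,r_y(0,2) \qquad (14)$$
and
$$\bar{a}_2 = r_y(1,1)\,r_y(0,2) - r_y(1,2)\,r_y(0,1) \qquad (15)$$
It is these numerators, and not the normalized coefficients, that enter the division-free acceptance criteria of equations (17), (19) and (21).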
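The closed-form solution of the 2×2 normal equations can be sketched as follows, assuming R and r are held as plain Python lists. The function names are hypothetical. The sketch keeps the determinant of equation (13) and the unnormalized quantities of equations (14) and (15) separate, so that a detector can defer or avoid the division.

```python
def solve_normal_equations(R, r):
    """Return (det_R, num_a1, num_a2) for the 2x2 system R a = r.

    R is [[r_y(1,1), r_y(1,2)], [r_y(2,1), r_y(2,2)]] and
    r is [r_y(0,1), r_y(0,2)]; R is assumed symmetric.
    """
    det_R = R[0][0] * R[1][1] - R[0][1] ** 2        # equation (13)
    num_a1 = r[0] * R[1][1] - R[0][1] * r[1]        # equation (14)
    num_a2 = R[0][0] * r[1] - R[0][1] * r[0]        # equation (15)
    return det_R, num_a1, num_a2


def ar_coefficients(R, r):
    """Return the normalized AR-coefficients (a1, a2)."""
    det_R, num_a1, num_a2 = solve_normal_equations(R, r)
    if det_R == 0:
        raise ValueError("R is singular; the AR-coefficients are undefined")
    return num_a1 / det_R, num_a2 / det_R
```

As a quick check, for R = [[5, 2], [2, 1]] and r = [8, 3] the sketch gives a = (2, -1), and multiplying R by this vector reproduces r.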
For each DTMF frequency, the detector should accept $\theta$ within the range
$$\theta_{f_j L} < \theta < \theta_{f_j H} \qquad (16)$$
as valid, where $\theta_{f_j L}$ is the low frequency threshold and $\theta_{f_j H}$ is the high frequency threshold of DTMF frequency j. These threshold values are application-specific: they depend on the deviation from the nominal DTMF frequency that is to be accepted as a DTMF signal.
Applying the relationships of equations (11), (12) and (13) to equation (10), the acceptance criterion of equation (16) can be rewritten as
$$\left(\sqrt{1+\tan^2\theta_{f_j L}}\;\bar{a}_1\right)^2 + 4\det R\,\bar{a}_2 \;\le\; 0 \;\le\; \left(\sqrt{1+\tan^2\theta_{f_j H}}\;\bar{a}_1\right)^2 + 4\det R\,\bar{a}_2 \qquad (17)$$
for the low band frequencies 0 to 1000 Hz, where $\bar{a}_1$ and $\bar{a}_2$ denote the unnormalized quantities given by equations (14) and (15).
For the high band signal, the vector a is written ##EQU11## and with this change, the acceptance criterion is
$$\left(\sqrt{1+\tan^2\theta_{f_j L}}\;\bar{a}_1\right)^2 + 4\det R\,\bar{a}_2 \;\ge\; 0 \;\ge\; \left(\sqrt{1+\tan^2\theta_{f_j H}}\;\bar{a}_1\right)^2 + 4\det R\,\bar{a}_2 \qquad (19)$$
for the high band frequencies, 1000 to 2000 Hz, with $\bar{a}_1$ and $\bar{a}_2$ the unnormalized quantities of equations (14) and (15) evaluated on the high band signal. The acceptance criteria shown in equations (17) and (19) are preferred formulations; they may alternatively be formulated in other ways, such as that illustrated in the above-referenced AT&T publication.
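The division-free frequency tests of equations (17) and (19) can be sketched as follows. The function and parameter names are hypothetical. The sketch assumes that the same coefficient convention, $A(z) = 1 - a_1 z^{-1} - a_2 z^{-2}$, is used for both bands, and that the inputs are the determinant of equation (13) and the unnormalized quantities of equations (14) and (15). Because $a_1$ enters only as a square, the test does not by itself distinguish an angle $\theta$ from its mirror image $\pi - \theta$; the sketch relies on the band separation for that.

```python
import math


def hz_to_angle(freq_hz, sample_rate_hz=4000.0):
    """Convert a frequency in Hz to a pole angle in radians."""
    return 2.0 * math.pi * freq_hz / sample_rate_hz


def frequency_accepted(det_R, num_a1, num_a2, theta_low, theta_high,
                       high_band=False):
    """Evaluate criterion (17) (low band) or (19) (high band).

    det_R          -- determinant of R, equation (13)
    num_a1, num_a2 -- unnormalized quantities of equations (14) and (15)
    theta_low/high -- frequency thresholds of equation (16), in radians
    """
    common = 4.0 * det_R * num_a2
    lower = (1.0 + math.tan(theta_low) ** 2) * num_a1 ** 2 + common
    upper = (1.0 + math.tan(theta_high) ** 2) * num_a1 ** 2 + common
    if high_band:
        # tan^2 decreases with angle between 1000 and 2000 Hz,
        # so the inequalities are reversed.
        return lower >= 0.0 >= upper
    return lower <= 0.0 <= upper
```

A usage example under the same assumptions: for an ideal 770 Hz tone, with thresholds placed at 755 Hz and 785 Hz purely for illustration, the low band test accepts the frame, whereas an ideal 697 Hz tone is rejected against the same thresholds.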
If a DTMF pulse is present, the polynomials A(z) and B(z) will have roots close to the unit circle at angles corresponding to DTMF frequencies in both the low and high bands. One of the principles upon which the detector is based is that if speech is present, it is unlikely that the roots are close to the unit circle at angles corresponding to DTMF frequencies in both the low and high bands simultaneously. The fact that the roots are expected to be close to the unit circle can be utilized to enhance the speech immunity of the detector.
Using equations (9) and (12), the magnitude M can be written as
$$M = |z| = \sqrt{-\frac{\bar{a}_2}{\det R}} \qquad (20)$$
where $\bar{a}_2$ is the unnormalized quantity given by equation (15).
With a minimum acceptable squared magnitude threshold, designated $M_{thresh}^2$, the requirement for acceptance can be written:
$$\bar{a}_2 + \det R\,M_{thresh}^2 < 0 \qquad (21)$$
It is noted that the squared magnitude threshold, $M_{thresh}^2$, may be set individually for each DTMF frequency, just as the frequency thresholds are. In an alternative embodiment, it is also possible to calculate the magnitude M itself and compare it to a corresponding threshold, but doing so is computationally less efficient because it requires a division and a square root.
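The difference between the division-free magnitude test of equation (21) and the direct comparison of M against a threshold can be sketched as follows. The function names are hypothetical, and the sketch assumes that the determinant of R is positive, which is what allows the inequality to be multiplied through without changing its direction.

```python
import math


def magnitude_accepted(det_R, num_a2, m_thresh_sq):
    """Division-free test of equation (21).

    det_R       -- determinant of R, assumed positive
    num_a2      -- unnormalized quantity of equation (15)
    m_thresh_sq -- minimum acceptable squared pole magnitude
    """
    return num_a2 + det_R * m_thresh_sq < 0.0


def magnitude_accepted_direct(det_R, num_a2, m_thresh_sq):
    """Equivalent but costlier test: compute M and compare it directly."""
    m_sq = -num_a2 / det_R
    if m_sq < 0.0:
        return False  # roots are not a complex conjugate pair
    return math.sqrt(m_sq) > math.sqrt(m_thresh_sq)
```

Both functions give the same decision when the determinant is positive; the first needs only one multiplication and one addition per frame.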
The detection method divides the low and high band signals into frames, each of duration T ms. Autocorrelation function (acf) values are calculated for each frame. At the end of each frame, a linear predictive analysis is performed on the basis of the acf values.
The detector is based on results from the linear predictive analysis over a sequence of frames. To identify the pulse part, the magnitude and the frequency need to pass the acceptance criteria formulated in expressions (17), (19) and (21). To approve a DTMF code, at least P frames indicating the same DTMF pulse must be received, where P depends on the frame length T and on the minimum pulse length that is to be detected. There are also requirements on the pause part of the signaling, but these are not relevant to the following discussion.
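The frame-counting rule can be sketched as follows. The function name and the representation of the per-frame result (a key character, or `None` when the frame fails the acceptance criteria) are hypothetical. The sketch assumes that the P frames must be consecutive, which the text does not state explicitly, and it ignores the pause requirements.

```python
def detect_codes(frame_results, min_frames):
    """Return the codes approved from a sequence of per-frame results.

    frame_results -- iterable of per-frame decisions: a DTMF key such as
                     "5", or None when the frame fails the criteria
    min_frames    -- P, the number of frames indicating the same pulse
                     that is required to approve a code
    """
    detected = []
    current, run = None, 0
    for key in frame_results:
        if key is not None and key == current:
            run += 1
        else:
            # A different key or a failed frame starts a new run.
            current = key
            run = 1 if key is not None else 0
        # Report each pulse once, at the moment the run reaches P frames.
        if run == min_frames:
            detected.append(current)
    return detected
```

For example, with P = 3, the sequence `[None, "5", "5", "5", "5", None, "7", "7", None]` yields `["5"]`: the run of two "7" frames is too short to be approved.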
A system has been tested in which the input signal was separated into low and high band signals (sampled at 4 kHz) by means of digital filtering, and in which the above-described analysis was applied to each of the resultant low and high band signals. The frame length in this test system was 9 ms.
It was discovered through testing that the squared magnitude measurements over the pulse part of a DTMF signal often result in one of the frames having a squared magnitude that lies much farther from the unit circle than those of the other frames. This must be taken into account when defining the acceptance criterion for the squared magnitude, and it forces the magnitude threshold to be less exacting. Consequently, the false detection probability on speech signals is increased.