This invention relates to a method and an apparatus for detecting and discriminating between various types of electronic signals. More specifically it relates to telephone call processing and is particularly applicable to multi-frequency (MF), dual-tone multi-frequency (DTMF), call progress tone (CPT) and MF-R2 tone receivers specifically in the area of telephone networks.
Telephone service providers increasingly supply a wide variety of options and features to subscribers, such as call waiting, three-way calling, credit card calling among many others. All these services are implemented to provide customers with conveniences and facilities that were unavailable a few years back. In order to achieve communication between various units of a telephone network and with users and hence provide these services, telephone systems require various types of control signals. These signals include tones which are used to convey information to the system from the user or, alternatively, to inform the user of the current status of a telephone transaction. Tones may also be used to communicate between switches and other parts of the telephone network itself.
The correct detection of a tone is crucial to the functioning of the telephone network since the latter relies on them for operations such as dialling, billing and coin identification. Users also rely on these tones for information such as busy, dialling and dial tone. As a concrete example, automatic redial when faced with a billing signal and recognition of an incoming fax would not be possible without accurate tone recognition.
Four categories of signalling are commonly used each with its own specifications and purposes. They are multi-frequency (MF) tones, dual-tone multi-frequency tones (DTMF), international MF-R2 tones and call progress tones (CPT). There are other signalling sequences that are not mentioned here since their purposes are similar in nature to the four signalling conventions mentioned above. Two separate devices are involved in the tone communication process: the transmitter, which creates and propagates the tones, and the receiver, which receives and decodes them.
The correct detection of a tone by the receiver implies that the signal originating from a transmitter station is accurately decoded by the receiver as being the transmitted tone. Correct detection by the receiver of a digit encoded using any signalling method also requires both a valid combination of frequencies and the correct timing element. A valid combination of frequencies implies that the receiver is able to discern that only the specified frequencies are present and that these frequencies are located at, or at least within a reasonable distance of, the nominal values. Furthermore, the receiver is also able to verify within reasonable accuracy that the amplitude, twist and other characteristics of the tone conform to certain pre-determined values. The receiver should also be able to obtain within reasonable accuracy the duration of a tone in order to determine if it is valid when compared to pre-determined duration requirements such as inter-tone gaps, cadence and other temporal specifications.
Detection of tones is a problem that has been addressed in the past. For example, Bennett et al., U.S. Pat. No. 5,311,589, assigned to AT&T Bell Laboratories, describes a method to process DTMF and CPT tones using the Goertzel algorithm and a logical processing stage. The contents of this document are incorporated herein by reference.
Tone receiver systems have been developed in many parts of the world and, although it is difficult to describe a standard tone receiver architecture, some characteristics are shared between many of them. FIG. 1 illustrates a typical tone receiver system as present in the prior art. A typical tone communication system of the type depicted in FIG. 1 generally comprises a device such as a transmitter 102 which encodes and transmits pulses serially in a communication channel 104. The transmitter 102 could be a simple touch-tone telephone or be included as part of a telephone switch. A combination of the pulses transmitted constitutes a tone which typically has frequencies in the voice range (~180-3600 Hz). Hence, in a telephone network, the communication channel is the voice channel. At the other end of the channel a receiver 100 is connected. The receiver 100 monitors the communication channel, detects and decodes the signal and, in turn, transmits the decoded information to another device such as a controller 116.
The receiver 100 can be separated into five functional blocks, namely an anti-aliasing low-pass filter 105, an analog to digital (A/D) converter 106, a storage buffer unit 108, a spectral processor 110 and a logical processor 112. Although a few receivers still use the analog signal directly, the trend is clearly towards digitization because a digitized signal can be processed by a digital computer and the receiver can be implemented on an easily programmable DSP chip. Therefore, the prior system shown here uses digital signals; however, the same explanations are valid for analog receivers, which use similar analog components. In certain circumstances, the incoming signal may be digital. In these cases the A/D converter 106 would not be required. Typically, the incoming signal is digitally sampled by the analog to digital (A/D) converter 106 and assembled into frames in the storage buffer unit 108. These frames are then analysed at predetermined frequencies in order to obtain the frequency characteristics of the signal during that frame. Generally, the analysis is performed in two separate units 110, 112.
The spectral processor 110 analyses the spectral characteristics of the samples received from the storage buffer unit 108 to obtain frequency and amplitude information for each frame. This analysis is similar for all types of signals, differing perhaps by the frequencies analysed. Furthermore, this operation requires large computational power and limits the system in its processing capability. Traditionally, the spectral properties of tones have been detected by means of a bank of bandpass filters, one for each possible frequency in the tone. This is shown in FIG. 2 for the detection of a Multi-Frequency (MF) tone. The filters 200, 202, 204, 206, 208, 210 may be digital or analog, and are used to estimate the energies of narrow bands of the spectrum in order to obtain a frequency representation of the signal. The bands are centred at the frequencies of interest and their width is chosen to reflect the frequency tolerance of the receiver at each analysed frequency. In the case of MF signalling, a tone is registered as present if and only if there is sufficient energy in two spectral bands. This can be verified by means of devices comprising an energy computation and a pre-determined amplitude threshold 212, 214, 216, 218, 220, 222. Another technique commonly used to analyse the spectral characteristics of the signals involves the computation of the Discrete Fourier Transform (DFT). Typically the DFTs are computed only at the frequencies of interest and result in an estimation of energy in the frequency domain. This method is described in detail in "Discrete-Time Processing of Speech Signals" by Deller, Proakis and Hansen, Macmillan Publishing Company, New York, 1993, whose contents are hereby incorporated by reference. Energy estimates obtained at this stage are propagated to the logical processing stage 112.
The logical processing stage 112 determines, based on the information obtained from the previous stage, if a valid tone has been detected by evaluating the temporal and logical characteristics of the signal. Using the computed amplitude of each frequency, a candidate tone is determined for each frame and is often compared to previous frames for continuity. For instance, in the case of MF signalling, two and only two frequencies must be above the energy threshold. In any other circumstance either the signal should be ignored or an error should be reported. The temporal characteristic involves comparing the duration and cadence of the tone with respect to some reference template. In the case of MF and DTMF signals this temporal analysis is limited to short time duration signals, typically in the range of 10 ms to 40 ms. For example, in the case of a DTMF digit with a frame length of 10 ms, three consecutive frames are compared in order to conform to established standards and obtain a 30 ms duration for the tone. In the case of a CPT tone, a larger number of frames may be required to verify if the pattern of the given tone matches a predefined pattern with the same frequencies. A detailed description of logical processing for both CPT and DTMF tones can be found in Bennett et al., U.S. Pat. No. 5,311,589, assigned to AT&T Bell Laboratories. Once either a valid tone or no tone has been detected, the result is sent to another device such as a controller 116 that uses the decoded information. The type of computation performed in the logical processing block is generally not computationally intensive and varies considerably between different signalling protocols.
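By way of illustration only, the frame-by-frame decision described above may be sketched as follows. The function names, the threshold value and the representation of each frame as a list of six energy estimates are assumptions made for this sketch and are not taken from the cited prior art.

```python
# Illustrative sketch (assumed names and data layout): a candidate is
# accepted only when exactly two frequencies exceed the energy threshold,
# and a tone is validated only when the same candidate persists over a
# number of consecutive frames (three 10 ms frames give 30 ms).

def candidate_pair(energies, threshold):
    """Return the indices of the two frequencies above threshold, else None."""
    above = tuple(i for i, e in enumerate(energies) if e > threshold)
    return above if len(above) == 2 else None


def validate_tone(frames, threshold, min_frames=3):
    """Return the candidate seen in min_frames consecutive frames, else None."""
    run, last = 0, None
    for energies in frames:
        cand = candidate_pair(energies, threshold)
        if cand is not None and cand == last:
            run += 1                      # same candidate as previous frame
        else:
            run = 1 if cand is not None else 0
        last = cand
        if run >= min_frames:
            return cand
    return None
```

In this sketch a frame with one, three or more frequencies above the threshold simply breaks the run; whether such a frame is ignored or reported as an error is left to the caller.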
Tone receivers of the type described above have been used in the past in tone detection and recognition systems. Such systems are often integrated into telephone networks where the different modules of the communication channels are able to communicate by sending and receiving various tones. The difficulty concerning the detection of tones is two-fold.
The first problem lies in the fact that many different types of signals are currently in use or will be introduced in the future. A main deficiency of the current systems is their lack of flexibility and the difficulty of reprogramming the Discrete Fourier Transform (DFT) in order to adapt to new tones such as Special Information tones (SIT), fax calling tones, recall dialling tones amongst many others. This difficulty implies considerable efforts in the redesign of new systems to accommodate these new tones and hence considerable costs.
The other problems of detecting tones arise from the nature of the telephone network itself. In the normal course of tone detection it may occur that a speech utterance is mistaken for a tone, that a tone is not detected or that a given tone is mistaken for another. Although dual frequencies in MF, DTMF and CPT were chosen to be non-harmonically related and hence have little resemblance to the harmonic characteristic of the human voice, these tones have frequencies located in the same band as speech (~180-3600 Hz). Human speech also has many fundamental and harmonic frequencies located in that band, which causes a situation called "talk-off". Talk-off occurs when human speech, music or other sounds are mistaken for a tone in a telephone network context. This problem is compounded by noise on the lines since noise may occur in all frequency bands.
Another difficulty resides in providing precise control over the frequency spectrum. About a nominal candidate frequency Fo, which is one of the frequencies of a given tone, a small error margin is accepted to take channel distortions and other physical effects into account. The old band-pass filter technique does not allow precise control of the frequency tolerance bands unless the filter order is very high, which in turn is costly. As a result, a detected tone may be falsely classified or, alternatively, speech signals may be mistaken for an audible tone.
Another problem lies in the difficulty of providing precise control of the time duration of a signal, which is caused by the long frames required to have a high frequency resolution. In order to have a high frequency resolution in DFT computations and hence be able to distinguish smaller and smaller portions of the frequency spectrum, a long time interval or window is required. By doing so, the time resolution and hence the capacity of distinguishing between short time intervals is decreased. This is known as the frequency/time resolution trade-off.
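The trade-off can be made concrete with a short calculation. The sketch below assumes an 8 kHz sampling rate and uses the common rule of thumb that a rectangular window of N samples separates frequencies spaced about fs/N apart; the function name and the rule of thumb are assumptions of this illustration rather than statements from the prior art discussed above.

```python
# Illustrative arithmetic for the frequency/time resolution trade-off.
# Assumption: rectangular window, DFT bin spacing taken as fs / N.

def frame_resolution(n_samples, fs=8000.0):
    """Return (frame duration in ms, approximate bin spacing in Hz)."""
    duration_ms = 1000.0 * n_samples / fs
    bin_spacing_hz = fs / n_samples
    return duration_ms, bin_spacing_hz


# A 10 ms frame (80 samples) resolves about 100 Hz, whereas a 30 ms frame
# (240 samples) resolves about 33 Hz at the cost of coarser timing.
short_frame = frame_resolution(80)
long_frame = frame_resolution(240)
```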
Furthermore, band-pass filters and the traditional DFT computation are not reliable with tones distorted with Gaussian noise, impulse noise, speech signal, tone interruptions, time shifts between frequency components and those having ambiguous transition time zones between two states. This type of interference is very common in telephone networks where the transmission lines are subject to atmospheric conditions and an uncontrollable environment.
Thus, there exists a need in the industry to provide a tone detection receiver particularly well suited to telephone networks that is capable of a high degree of flexibility, accuracy and robustness, that can be adapted to the majority of current and future signalling protocols with minimal difficulty while maintaining high computational efficiency, and finally that is capable of providing precise control over the acceptance and rejection bands for both time and frequency domain parameters of the tone in order to reduce false detection events.
A principal object of the invention is to provide an improved tone receiver, particularly well suited for use in telephone networks.
Another object of the invention is to provide an improved method for performing tone detection and recognition processes, particularly well suited in the context of a telephone network.
As embodied and broadly described herein the invention provides a tone detection apparatus, said apparatus comprising:
an input for receiving a digital signal potentially containing a tone detectable by said apparatus;
DFT computation means for computing a discrete Fourier transform coefficient for at least one candidate frequency for each sub-frame in a set of successive sub-frames of the digital signal, each sub-frame containing a plurality of signal samples, said DFT computation means computing a discrete Fourier transform for a given sub-frame of said set other than the first sub-frame of said set in a phase continuity relationship with a preceding sub-frame, said DFT computation means providing a phase offset for the given sub-frame to establish said phase continuity relationship with the preceding sub-frame;
processing means utilising said discrete Fourier transform coefficient for each sub-frame in said set to determine if a predetermined tone exists in said digital signal.
In a preferred embodiment, the tone detection apparatus performs a direct computation of the Discrete Fourier Transform (DFT) on short time sub-frames and then performs both a summing operation on complete frames and a second DFT computation on segments. Both frames and segments are combinations of sub-frames. Preferably a frame is composed of two or three sub-frames while a segment is about seven sub-frames. Other size combinations are possible depending on the desired time and frequency resolution. Most preferably, the apparatus comprises:
An anti-aliasing low-pass filter to eliminate the spectral portion of the analog signal that is not in the range of interest.
An analog to digital (A/D) converter which samples the incoming signal at a rate at least equal to the Nyquist rate (8000 Hz for a telephone network) and converts it to a digital format such as PCM code.
A quadrature processing block that allows the system to substantially reduce the data stream with minimal loss of information as well as performs the first stage of spectral analysis. This processing block computes the DFT coefficients at candidate frequencies for each sub-frame of the signal.
A frame processing block which obtains magnitude information about the nominal frequencies of the incoming signal and produces a candidate tone. This is performed by summing the DFT coefficients of consecutive sub-frames computed by the quadrature-processing block.
A precision spectral processing block which obtains precise results regarding the frequency deviation of the signal with respect to nominal frequencies and permits stringent control over the accept/reject frequency bands of the receiver. This is achieved by computing a second level DFT on the basis of the first level DFTs computed by the quadrature-processing block.
A logical processing block that compares temporal characteristics of the signals with pre-determined values and determines, based on results from the quadrature and precision spectral processing blocks, if a proper tone has been detected. This stage also permits the correct identification of tones where the two frequency components are slightly shifted in time.
A cadence processing block, part of the logical processing block, which analyses the results of the logical processing block over a slightly longer period of time (generally in terms of seconds instead of msec for the other stages). It allows determining if the correct cadence or time pattern has been detected. This block is present mainly when the tone detector is used for CPT tone detection and may be absent from certain designs without detracting from the spirit of the invention.
The tone detector may further comprise:
A buffer that accumulates the digitally sampled incoming signal into sub-frames that will be used in future processing.
A plurality of buffers, one for each analysed frequency, which accumulate the result of the first processing stage into another buffer representing a sequence of consecutive frames.
A lookup table that comprises the pre-computed sine and cosine values needed for the Discrete Fourier Transform (DFT) computation. This table is used in both the quadrature processing block and the precision spectral processing block.
In a most preferred embodiment, the tone detection apparatus operates as follows. The apparatus receives an analog signal from the system. The analog signal is first low-pass filtered and then sampled in order to produce a digital signal. In the case of a telephone network application, the filtering is usually done at 4 kHz, since the frequencies of interest are in that range, and therefore the sampling rate would be 8 kHz as directed by the Nyquist theorem. The digital samples are then stored in a buffer of size N that represents a sub-frame of the signal. The size of the sub-frame determines the time resolution of the system. These samples are then transferred to the quadrature-processing block. There a DFT is computed on these sub-frames at each of the analysed candidate frequencies. A set of parameters, which are simply the values obtained by performing a DFT on a sub-frame, is generated. There is one parameter for each analysed frequency. In the case of MF signalling, six frequencies {700, 900, 1100, 1300, 1500 and 1700 Hz} must be analysed and, therefore, six parameters will be generated for each sub-frame. These parameters in turn are stored, each in a separate buffer. Once this analysis has been performed on a plurality of sub-frames and the resulting parameters have been stored in buffers, these computed values are processed by the frame processing stage and by the precision spectral processing stage.
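A minimal sketch of the quadrature processing of one sub-frame is given below for illustration. It assumes an 8 kHz sampling rate and the six MF frequencies; the 80-entry table layout (one period of a 100 Hz reference), the function name and the 40-sample sub-frame used in the example are assumptions of this sketch, not details confirmed by the description.

```python
import cmath
import math

FS = 8000                                   # sampling rate in Hz
MF_FREQS = (700, 900, 1100, 1300, 1500, 1700)

# Assumed lookup table: all six frequencies are multiples of 100 Hz, so
# every needed sine/cosine value lies on an 80-point circle (8000 / 100).
TABLE_SIZE = 80
TABLE = [cmath.exp(-2j * math.pi * k / TABLE_SIZE) for k in range(TABLE_SIZE)]


def quadrature_subframe(samples, freqs=MF_FREQS):
    """Return one complex DFT coefficient per candidate frequency."""
    coeffs = []
    for f in freqs:
        if f % 100:
            raise ValueError("this table only covers multiples of 100 Hz")
        step = f // 100                     # table stride for this frequency
        acc = 0j
        for n, x in enumerate(samples):
            acc += x * TABLE[(step * n) % TABLE_SIZE]
        coeffs.append(acc)
    return coeffs


# Example: a 5 ms sub-frame (40 samples) holding a 700 Hz + 900 Hz pair.
sub = [math.cos(2 * math.pi * 700 * n / FS) + math.cos(2 * math.pi * 900 * n / FS)
       for n in range(40)]
magnitudes = [abs(c) for c in quadrature_subframe(sub)]
```

With this particular sub-frame length the six frequencies are mutually orthogonal, so only the first two magnitudes are non-negligible; other sub-frame lengths would show some leakage between bands.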
The frame processing is executed for each of the analysed frequencies about a nominal value. The purpose of this stage is to evaluate the amplitude of the signal about the nominal frequencies and to possibly obtain a candidate tone. The analysis is done by computing, for each nominal frequency, the sum of the DFT coefficients of K consecutive sub-frames obtained from the quadrature-processing block, where K is the number of sub-frames in a frame. Because the computations in the quadrature processing stage ensure phase continuity between consecutive sub-frames, computing the sum of the DFTs over K consecutive sub-frames is equivalent to computing the DFT over one frame directly from the time samples, the frame being composed of the same time samples as the K sub-frames. The computation of a sum is clearly simpler than that of a DFT, and the DFT obtained for the frame provides higher frequency resolution than that of a single sub-frame. The frames can also be made to overlap by including one or more of the last sub-frames composing the frame at the beginning of the following frame.
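The equivalence between the sum of phase-continuous sub-frame coefficients and the DFT of the whole frame can be checked numerically. In the sketch below the phase offset is realised by indexing the complex exponential with the absolute sample position; this is one possible way of establishing phase continuity, assumed here for illustration, and the function names are hypothetical.

```python
import cmath
import math


def dft_coeff(samples, freq, fs=8000, start=0):
    """Single-frequency DFT coefficient; `start` is the absolute index of
    the first sample and supplies the phase offset for this sub-frame."""
    return sum(x * cmath.exp(-2j * math.pi * freq * (start + n) / fs)
               for n, x in enumerate(samples))


def frame_coeff_from_subframes(signal, freq, sub_len, k, fs=8000):
    """Sum the phase-continuous coefficients of K consecutive sub-frames."""
    total = 0j
    for i in range(k):
        start = i * sub_len
        total += dft_coeff(signal[start:start + sub_len], freq, fs, start=start)
    return total
```

If each sub-frame were instead transformed with its own time origin (start=0 every time), the partial results would generally carry inconsistent phases and could partly cancel when summed, which is why the phase offset matters.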
The next stage referred to as precision spectral processing is executed for each of the frequencies of the candidate tone (usually two) around a nominal value. The analysis is computed on segments that are preferably longer than frames usually composed of a few consecutive sub-frames. The purpose of this stage is to obtain power estimations for very narrow frequency bands and to determine if the frequency tolerance requirement was satisfied. Advantageously, an analysis is performed on (2L+1) local frequencies placed at regular intervals on both sides of a frequency that is closest to the nominal in a range determined by the frequency tolerance. The analysis is done by computing a weighted version of the DFT operation at each of the local frequencies and then applying magnitude and weighting operators. The weighting function is usually a time domain window such as a Hamming window centred about a frequency near the nominal. Energy values and the centre frequencies of the input signal are found at this stage which are then supplied to the logical processing stage which compares the values obtained at each of the analysed frequencies and determines if the specifications have been met for a given tone. In the affirmative, information about the tone received is sent to a controller station where the operation or connection is performed. In the event that some or all of the specifications have not been met, an error may be reported as in the case where more than two frequencies are above the energy threshold. Alternatively, the erroneous sequence may be ignored.
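One way to picture the second level DFT is sketched below. It treats the sequence of phase-continuous first level coefficients as a slowly varying signal sampled once per sub-frame, applies a Hamming window across the segment, and evaluates it at (2L+1) offsets spread over the tolerance range. The exact formula, the placement of the window and all names are assumptions of this sketch; the description does not give them in this passage.

```python
import cmath
import math


def first_level(signal, f0, sub_len, fs=8000):
    """Phase-continuous DFT coefficient at f0 for each sub-frame."""
    n_sub = len(signal) // sub_len
    return [sum(signal[m * sub_len + n]
                * cmath.exp(-2j * math.pi * f0 * (m * sub_len + n) / fs)
                for n in range(sub_len))
            for m in range(n_sub)]


def hamming(count):
    """Hamming weights for `count` points."""
    if count == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * m / (count - 1))
            for m in range(count)]


def second_level(coeffs, sub_len, tolerance_hz, L, fs=8000):
    """Return (offset in Hz, magnitude) for 2L+1 local frequencies."""
    w = hamming(len(coeffs))
    results = []
    for i in range(-L, L + 1):
        delta = tolerance_hz * i / L
        d = sum(wm * c * cmath.exp(-2j * math.pi * delta * m * sub_len / fs)
                for m, (wm, c) in enumerate(zip(w, coeffs)))
        results.append((delta, abs(d)))
    return results


def estimate_offset(results):
    """Offset (Hz) from nominal at which the magnitude is largest."""
    return max(results, key=lambda r: r[1])[0]
```

For example, with a seven sub-frame segment of 40 samples each, a tone at 720 Hz analysed about a nominal 700 Hz with a 30 Hz tolerance and L = 3 yields its largest magnitude at the +20 Hz local frequency. The approximation holds only while the offset is small compared with the reciprocal of the sub-frame duration.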
In a most preferred embodiment of this invention, the tone detection apparatus is integrated into a communication channel, such as one that could be used in a telephone network, that enables the accurate detection and decoding of Multi-Frequency (MF) tones. MF tones are mainly used to transmit calling and/or caller number information between telephone switches. An MF tone is detected if and only if two of the allowable frequencies are above a certain amplitude threshold and the durations are long enough to avoid any erroneous recognition. Each combination of two frequencies represents a pulse that in turn represents a digit. If more than two frequencies are present, as caused by a double key press, the receiver should report an error. Frequency and duration standards for MF are shown in tables 1 and 2.
In a typical interaction a transmitter from a calling office initiates a tone sequence by sending a special tone, called the KP tone, and then proceeds by sending a series of tones representing a digit sequence. Upon termination, the transmitter sends an ST tone to indicate the end of the tone sequence. The MF receiver monitors the communication channel for the KP tone. Until its reception, it ignores all signals on the channel. Once it receives the KP tone, it monitors the channel, performs an analog to digital conversion of the signal and proceeds with the spectral and temporal analysis of the signal. The spectral analysis is performed by computing the straight DFT at the six frequencies of interest {700, 900, 1100, 1300, 1500 and 1700 Hz} over a short time sub-frame. Following this, the frame processing computes the sum of the DFT coefficients over a frame (typically two or three sub-frames) and a second DFT is performed on the results of the first DFT over a segment. This allows a very high time resolution due to the short time window of the first DFT and a high frequency resolution due to the longer time window resulting from the concatenation of many sub-frames in the second DFT. Following this computation, the results of the quadrature processing stage, the precision spectral processing stage and the frame processing stage are sent to the logical processing block, which evaluates if some predetermined amplitude criteria have been met and if the temporal requirements have been satisfied. If a valid tone is detected, the corresponding digit is sent to a controller that performs a predetermined operation such as connection or billing. Once the receiver decodes the ST tone, it stops monitoring the communication channel for digit tones and monitors for the KP tone once again.
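The KP/ST framing described above behaves like a small state machine, which may be sketched as follows. The sketch operates on symbols already decoded by the logical processing block; the symbol labels, the function name and the choice to restart the sequence if a second KP arrives before ST are assumptions made for illustration.

```python
# Illustrative state machine: ignore everything until KP, collect digits
# until ST, then return to waiting for KP.

def collect_digits(symbols):
    """Return the digit strings found between KP and ST markers."""
    sequences, current, armed = [], [], False
    for s in symbols:
        if not armed:
            if s == "KP":                 # start of a tone sequence
                armed, current = True, []
        elif s == "ST":                   # end of the tone sequence
            sequences.append("".join(current))
            armed = False
        elif s == "KP":                   # assumed behaviour: restart
            current = []
        else:
            current.append(s)
    return sequences
```

A stream such as 3, KP, 5, 5, 5, ST, 7 therefore yields the single sequence "555"; the digits outside the KP/ST pair are ignored.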
In another specific embodiment of this invention, the tone detection apparatus is integrated into a communication channel, such as one that could be used in a telephone network, that enables the accurate detection and decoding of Dual Tone Multi-Frequency (DTMF) tones. DTMF tones are mainly used for communications between the user and the system and can also be used to transfer information between telephone switches. As a concrete example of DTMF signalling, customers dialling into a service provider and requesting information may confirm what is understood by pressing the numbers on a touch tone pad. DTMF tones are similar to MF tones in the sense that they are composed of two and only two frequencies. However, DTMF tones consist of one frequency from a low group and one frequency from a high group and have 16 distinct combinations. Tables 3 and 4 show the frequency and time specifications for DTMF tones. The spectral processing of a DTMF tone is similar to that of an MF tone except that the frequencies analysed are different and that there are no KP/ST tones.
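The low group/high group rule can be illustrated with the widely used DTMF keypad layout. The nominal frequencies and key assignment below are the conventional DTMF values rather than a transcription of Tables 3 and 4, and the function name and the use of a set of detected nominal frequencies as input are assumptions of this sketch.

```python
# Conventional DTMF layout: rows select the low-group frequency and
# columns select the high-group frequency (4 x 4 = 16 combinations).
LOW = (697, 770, 852, 941)
HIGH = (1209, 1336, 1477, 1633)
KEYS = ("123A", "456B", "789C", "*0#D")


def decode_dtmf(detected):
    """Map a set of detected nominal frequencies (Hz) to a key, or None
    unless exactly one low-group and one high-group frequency are present."""
    lows = [f for f in detected if f in LOW]
    highs = [f for f in detected if f in HIGH]
    if len(detected) != 2 or len(lows) != 1 or len(highs) != 1:
        return None
    return KEYS[LOW.index(lows[0])][HIGH.index(highs[0])]
```

In this sketch two frequencies from the same group, or any third frequency, cause the candidate to be rejected.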
In another specific embodiment of this invention, the tone detection apparatus is integrated into a communication channel, such as one that could be used in a telephone network, that enables the accurate detection and decoding of Call Progress Tones (CPT). Call progress tones, also called audible tones, are used to inform users and operators of the system about the progress or disposition of the telephone call they are attempting. CPT tones include the dial tone, audible ring, line busy, reorder, special service information tones (SIT), recall dial tone and many others represented as multi-frequency signals having a complex cadence. Traditionally, standards governing CPT tones have been very lax. Typically, CPT tones are differentiated more on the basis of their cadence than on stringent frequency and time accept/reject rules, in contrast to MF and DTMF tones, for which precise specifications are given. Tables 5 and 6 show the frequency and timing characteristics of the common CPT tones. The spectral processing of this type of signal is similar to that of the MF and DTMF signals except for the frequencies analysed and the absence of the KP/ST tones. The temporal processing, however, involves evaluating over a longer time period the cadence of the signal with respect to some predefined cadence.
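Cadence evaluation over a longer period may be sketched as a comparison of on/off run lengths against a template. The per-frame tone-present flags, the tolerance and the 500 ms on / 500 ms off pattern used in the example are placeholders chosen for illustration; they are not taken from Tables 5 and 6, and the function names are hypothetical.

```python
# Illustrative cadence check on a sequence of per-frame booleans
# (True when the candidate tone was present in that frame).

def run_lengths(flags, frame_ms):
    """Collapse per-frame flags into (state, duration in ms) runs."""
    runs = []
    for flag in flags:
        if runs and runs[-1][0] == flag:
            runs[-1][1] += frame_ms
        else:
            runs.append([flag, frame_ms])
    return [(state, dur) for state, dur in runs]


def matches_cadence(flags, frame_ms, on_ms, off_ms, tol_ms):
    """True if every complete on/off run is within tol_ms of the template."""
    # The first and last runs may be truncated by the observation window.
    runs = run_lengths(flags, frame_ms)[1:-1]
    if len(runs) < 2:
        return False
    for state, dur in runs:
        target = on_ms if state else off_ms
        if abs(dur - target) > tol_ms:
            return False
    return True
```

With 10 ms frames, three repetitions of 50 tone frames followed by 50 silent frames match a 500/500 ms template but not a 250/250 ms one.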
As embodied and broadly described herein the invention also provides an improvement to a tone detection apparatus that comprises:
an input for receiving a digital signal potentially containing a tone detectable by said apparatus;
first level DFT computation means for processing the digital signal to compute a plurality of discrete Fourier transform coefficients associated to a candidate frequency, each discrete Fourier transform coefficient being associated to a respective sub-frame in a set of successive sub-frames of the digital signal;
second level DFT computation means for computing at least one discrete Fourier transform coefficient associated to the set of successive sub-frames on the basis of the plurality of discrete Fourier transform coefficients computed by said first level DFT computation means.
As embodied and broadly described herein the invention also provides a method for detecting tones in a digital signal, said method comprising the steps of:
receiving a digital signal potentially containing a tone detectable by said apparatus;
computing a discrete Fourier transform coefficient for at least one candidate frequency for each sub-frame in a set of successive sub-frames of the digital signal, each sub-frame containing a plurality of signal samples, the computation of a discrete Fourier transform for a given sub-frame of said set other than the first sub-frame of said set being effected in a phase continuity relationship with a preceding sub-frame, the computation of a discrete Fourier transform for the given sub-frame including providing a phase offset to establish said phase continuity relationship with the preceding sub-frame;
utilising said discrete Fourier transform coefficient for each sub-frame in said set to determine if a predetermined tone exists in said digital signal.
As embodied and broadly described herein the invention also provides an improvement to a method for detecting tones in a digital signal, the improvement comprising the steps of:
a) receiving a digital signal potentially containing a tone;
b) processing the digital signal to compute a plurality of discrete Fourier transform coefficients associated to a candidate frequency, each discrete Fourier transform coefficient being associated to a respective sub-frame in a set of successive sub-frames of the digital signal;
c) computing at least one discrete Fourier transform coefficient associated to the set of successive sub-frames on the basis of the plurality of discrete Fourier transform coefficients computed at step b).
As embodied and broadly described herein the invention also provides a tone detection apparatus, said apparatus comprising:
an input for receiving a digital signal potentially containing a tone detectable by said apparatus;
energy determination means for assessing a cumulative energy value indicative of a total energy in said digital signal over a certain time period at a plurality of predetermined frequencies in said digital signal, each one of said predetermined frequencies corresponding to a given tone;
processing means operative if said cumulative energy value exceeds a threshold to determine at which frequency of said plurality of frequencies a tone is present.
In a most preferred embodiment, the tone detection apparatus as defined above in broad terms features a dual-stage process for detecting the presence of tones in the signal. The first stage of the process is designed to detect if a tone is likely to exist in the signal. If the likelihood of a tone presence is significant, the second stage is invoked, which performs a more detailed analysis of the signal to identify which tone is present and whether this tone is within an acceptable frequency tolerance range. During the first stage the energy of the signal is computed over a plurality of signal sub-frames in the frequency bands of interest, each frequency band corresponding to a given tone. For each sub-frame this involves computing a DFT coefficient for each frequency band. The DFT coefficients are computed in a phase-continuous fashion from one sub-frame to another, which makes it possible to estimate the energy in the frequency bands of interest over a full frame (made up of a number of sub-frames) by simply adding the DFT coefficients for the different sub-frames. This addition yields a value indicative of the combined energy in the frequency bands of interest over a period of time corresponding to the duration of the frame. If the combined energy exceeds a certain threshold, meaning that a tone is likely to exist in the signal, the second, detailed analysis stage is effected. During that second stage analysis, the frequency band where a high energy level is present is identified to determine which tone is present, and a frequency tolerance test is also performed to determine if the frequency of the existing tone is within a certain acceptance range.
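The dual-stage process may be sketched as an energy gate followed by a per-band decision. The sketch assumes an 8 kHz sampling rate, realises phase continuity by indexing the exponential with the absolute sample position, and omits the frequency tolerance test of the second stage; the function names and both threshold parameters are hypothetical.

```python
import cmath
import math


def band_coeffs(signal, freqs, sub_len, fs=8000):
    """Per-frequency sum of phase-continuous sub-frame DFT coefficients."""
    totals = []
    for f in freqs:
        total = 0j
        for start in range(0, len(signal) - sub_len + 1, sub_len):
            total += sum(signal[start + n]
                         * cmath.exp(-2j * math.pi * f * (start + n) / fs)
                         for n in range(sub_len))
        totals.append(total)
    return totals


def detect(signal, freqs, sub_len, gate, band_threshold, fs=8000):
    """Stage 1: gate on the combined energy over all bands of interest.
    Stage 2 (only if the gate is passed): list the bands with high energy."""
    energies = [abs(c) ** 2 for c in band_coeffs(signal, freqs, sub_len, fs)]
    if sum(energies) <= gate:
        return None                       # no tone likely; skip stage 2
    return [f for f, e in zip(freqs, energies) if e > band_threshold]
```

The point of the gate is that the more detailed second stage runs only on frames where a tone is likely, which keeps the average computational load low.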
As embodied and broadly described herein, the invention also provides a method for detecting tones in a digital signal, said method comprising the steps of:
receiving a digital signal potentially containing a tone detectable by said apparatus;
assessing a cumulative energy value indicative of a total energy in said digital signal over a certain time period at a plurality of predetermined frequencies in said digital signal, each one of said predetermined frequencies corresponding to a given tone;
determining at which frequency of said plurality of frequencies a tone is present when said cumulative energy value exceeds a threshold.