The present invention relates to methods and systems that provide suppression of near-end speech energy for applications including, but not limited to, improving the talkoff and talkdown performance of inband signal tone detection systems. In particular, the present invention describes a method and system providing interconnection between the tip and ring telephone line interface and subsequent communications equipment for the purposes of calibrating a selectable line bridging circuit and extracting a single, unidirectional path containing predominantly far end energy, wherein, near-end speech signals have been canceled. The method and system inherently provide access to on-hook service signals, such as calling party identification data transmissions.
Echo cancellation systems are widely used in the telephone network and in station set equipment. The traditional role of echo cancellation systems in the telephone network has been to improve the quality of a transmission channel by removing unwanted signal reflections that occur at points of impedance mismatch in the communication circuit. Echo cancellers have also been employed in station set equipment, for the most part, to enable high speed, full duplex data transmission. With the introduction of new telephone services aimed at the analog residential subscriber, echo cancellers or near-end speech cancellation systems have recently become of significant importance in subscriber station sets to improve the performance of inband tone signal detectors.
Inband tone signaling schemes using combinations of discrete frequencies have long been used in the telephone system. The primary advantage of inband tone signaling is that the same spectrum that normally carries customer speech can be used to alternately transmit signal and control information. Sharing the voiceband is essential in situations where bandwidth is limited and dedicated control channels are either too costly or pose a degradation to service. Some of the most common examples of inband tone signaling used in the telephone network today include call progress signals, such as dial tone, stutter dial tone, audible ringing, busy, reorder, call waiting, etc., and Dual Tone Multi-Frequency (DTMF) signals used predominantly for dialing.
In recent years, new telephone services, such as Calling Identity Delivery on Call Waiting (CIDCW), Call Waiting Deluxe (CWD) and advanced screen telephony platforms, such as the Analog Display Services Interface (ADSI) and the Internet or Web Phone, have been deployed and require reliable Customer Premises Equipment (CPE) tone signal detection for signals sent by a Stored Program Control Switching System (SPCSS) or a far-end server. These services and platforms, encouraged by many technological advances in semiconductors, are transforming the conventional telephone set into a sophisticated, integrated communications terminal bearing a liquid crystal display and keyboard that under microprocessor, if not digital signal processor, control can track the state of a call and react to network and far-end tone signals.
All inband tone signaling systems are premised on the belief that a tone signal can be reliably detected. For Analog Display Services Interface (ADSI) Customer Premises Equipment (CPE), reliable detection of network call progress signals is necessary for the CPE to properly track the state of the call and generate internal events that are to be processed by a downloadable service script resident in the CPE. For CIDCW and CWD CPEs, reliable detection of the CPE Alerting Signal (CAS) is necessary to engage the CPE's off-hook data transmission mode for the reception of a data burst containing the calling party's number, name, location or personal identification number. For telephone answering machines and voicemail systems, reliable detection of DTMF signals is necessary to allow the subscriber to specify editing and control actions, even during playback of voice messages.
While reuse of an inband channel provides an efficient means for network-to-station set or server-to-station set signaling, significant problems related to signal recognition may be encountered by station sets attempting to detect tone signals.
Two traditional problems with inband tone signal detection are detector talkoff and talkdown.
Talkoff occurs whenever a tone signal detector erroneously accepts signal imitations produced by speech, music or noise as valid tone signals. Studies, experimentation, and field experience have all decisively confirmed that human speech can imitate some of the spectral and temporal properties of tone signals. The combination of consonants, vowels, syllables, and accent that frequently occur in an ordinary telephone conversation can cause a tone signal detector to talkoff. Ever since the first use of inband tone signaling in the telephone network, it has been a challenge designing reliable tone signal detection systems that are non-responsive to signal imitations.
Talkdown is another significant performance characteristic of tone signal detectors. Talkdown occurs whenever a tone signal detector fails to recognize a valid tone signal because it was masked or denied validation as a tone signal because of extraneous energy present on the line. In some instances, tone signals may compete with speech, music and other background noise. The presence of these complex signals distorts valid tone signals and can impair their detection.
Talkoff and talkdown are two critical performance measures for a tone signal detector. They respectively describe the detector's ability to resist signal imitations and to recognize valid tone signals obscured by speech, music or noise. Although tone signal detection has been a prevalent art in the telephone network for decades, only recently has the need for robust talkoff and talkdown performance been simultaneously required in an application. For the most part, prior art tone signaling applications, such as DTMF dialing, have benefited from environments where detector talkdown performance could be sacrificed in favor of improving talkoff performance. With the advent of CIDCW, CWD and ADSI, simultaneous robust talkoff and talkdown performance became a necessity.
Bellcore has specified CPE or station set criteria in Bellcore documents SR-TSV-002476, entitled “Customer Premises Equipment Compatibility Considerations for the Voiceband Data Transmission Interface”, Issue 1, December 1992, and SR-3004, entitled “Testing Guidelines for Analog Type 1, 2, and 3 CPE Described in SR-INS-002726”, January 1995, that address the talkoff and talkdown performance of tone signal detectors for the CAS and call progress signals. The recommendations contained in these documents call for highly reliable tone signal detection. For example, SR-TSV-002476 recommends that a CAS detector respond to no more than 1 signal imitation in 45 hours of exposure to equal amounts of average level near-end and far-end telephone speech. The talkdown criteria that must be simultaneously achieved by this CAS tone signal detector for the average near-end talker on an average loop are the recognition of 99% of all valid CAS. The combination of these performance criteria makes CAS tone signal detectors that are compliant with SR-TSV-002476 arguably the most robust inband tone signal detectors ever deployed in the telephone network.
For tone signal detection systems used at a subscriber's location, signal imitations can come from both the near-end subscriber's voice as well as the voice of a far-end party. The near-end subscriber's voice is usually the dominant source of talkoff because the electrical speech level of the near-end subscriber is significantly stronger than that of the far-end. The speech signal of the far-end party is reduced by the loss on two loops, i.e., the far-end party's loop and the near-end subscriber's loop, and any intervening network loss before it appears at the near-end subscriber's station set. The near-end subscriber is also the dominant cause of talkdown since signals like the CAS and call progress signals are typically transmitted from the central office SPCSS while the far-end party is either muted or not yet connected.
It is characteristic of tone signal detectors to employ the concept of guard action to resist tone signal imitations and gain a degree of immunity to talkoff. Such detectors validate a tone signal only if a certain signal-to-guard ratio is satisfied for each tone signal frequency component. The signal-to-guard ratio is the ratio of the power present within a tone signal frequency band to the power present in one or several designated guard bands. The guard band is a portion of the voiceband that the tone signal detector uses to extract information about the purity of the tone signal. A single guard band can be selected for all the tone signal frequency components or a combination of several guard bands may be used.
Detectors using the guard principle usually require a large positive signal-to-guard ratio to validate incoming tone signals to minimize talkoff. A large signal-to-guard ratio demands that the energy within the signaling frequency band be relatively pure with respect to the energy in the guard band(s). Since speech is likely to produce significant energy at frequencies outside the signaling bands, this condition rejects many potential energy patterns that might talkoff a detector and, hence, improves tone signal detector talkoff performance.
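The guard test described above can be illustrated with a short sketch. It is not taken from any cited detector: the function names, the use of a single-bin DFT to estimate band power, the approximation of a guard band by a few probe frequencies, and the 6 dB default threshold are all assumptions made for illustration.

```python
import math

def band_power(samples, sample_rate, freq):
    """Mean-square power of the component of `samples` at `freq` (single-bin DFT)."""
    n = len(samples)
    w = 2.0 * math.pi * freq / sample_rate
    re = sum(s * math.cos(w * i) for i, s in enumerate(samples))
    im = sum(s * math.sin(w * i) for i, s in enumerate(samples))
    # A sinusoid of amplitude A spanning whole cycles yields A**2 / 2.
    return 2.0 * (re * re + im * im) / (n * n)

def signal_to_guard_db(samples, sample_rate, tone_freq, guard_freqs, floor=1e-12):
    """Ratio, in dB, of power at the tone frequency to power summed over the guard probes."""
    signal = band_power(samples, sample_rate, tone_freq)
    guard = sum(band_power(samples, sample_rate, f) for f in guard_freqs)
    # `floor` avoids log(0) when a band is empty; its value is an assumption.
    return 10.0 * math.log10(max(signal, floor) / max(guard, floor))

def passes_guard_test(samples, sample_rate, tone_freq, guard_freqs, min_ratio_db=6.0):
    """Accept the tone only if it is sufficiently pure relative to the guard energy."""
    ratio = signal_to_guard_db(samples, sample_rate, tone_freq, guard_freqs)
    return ratio >= min_ratio_db
```

In this sketch a clean 2130 Hz tone passes the test, while the same tone mixed with an equally strong 1000 Hz component falls to a ratio of about 0 dB and is rejected, which is the mechanism that improves talkoff resistance at the expense of talkdown.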
Although this strategy may provide good talkoff performance, talkdown performance is likely to suffer unless speech, music or noise that can mix with a tone signal is successfully attenuated or canceled. Two basic approaches have been employed by the majority of new CIDCW, CWD, and ADSI CPE to provide satisfactory tone signal detector performance. The simplest approach has been the direct, parallel connection of the tone signal detector to the tip and ring interface. Better arrangements have placed the tone signal detector behind a speech path separation device that inherently attenuates the level of near-end speech. More complex arrangements have utilized analog and digital cancellation techniques. A closer examination of several existing prior art implementations that fall within these two categories reveals their advantages, disadvantages, and the benefits of the present invention.
In the simplest approach, the tone signal detector is bridged directly across the tip and ring interface of the station set as illustrated in FIG. 1. This arrangement is advantageous primarily because of its minimal line interconnection complexity. The tone signal detector passively listens across the line. Its high impedance and parallel line connection mean that it does not interfere with other station sets on the same line or communication equipment beyond its point of presence. It further provides access to on-hook service signals, such as Calling Identity Delivery (CID). Its interconnection method is also very amenable to adjunct communication devices that do not incorporate any type of line termination circuit that may normally be used in an integrated telephone.
The primary disadvantage of the bridged tip and ring arrangement is that it presents the worst case tone signal detection environment. The tone signal detector in this arrangement is exposed to the full power of near-end speech. This creates significant difficulties for achieving robust talkoff and talkdown performance. A survey of speech levels, adjusted and converted to obtain levels at the station set, indicates that near-end telephone speech has a mean Active Speech Level (ASL) of −19 dBm with a Gaussian distribution and standard deviation of approximately 4 dB. Using the three sigma case as the upper limit, near-end speech levels at the subscriber's tip and ring interface can reach levels as high as −7 dBm ASL. Experimentation and experience have decisively shown that the talkoff and talkdown performance of a tone signal detector rapidly degrades as the level of speech increases. The rate of talkoff, or number of talkoffs per hour, tends to rise exponentially with increasing speech level. Speech levels at −7 dBm ASL are extremely loud and usually pose a substantial threat for talkoff and talkdown. Although possessing low interconnection complexity, the bridged tip and ring arrangement offers no benefit in reducing the level of near-end speech.
Near-end speech poses an even greater threat for CAS tone signal detectors. Not only are near-end speech levels loud, but the threat of talkoff is further enhanced because near-end speech is likely to be pre-emphasized by the subscriber's telephone handset. Historically, the transmitter response of the handset provides gain in the upper voiceband to counteract the effect of loop loss. Although most of the speech energy is in the lower part of the voiceband (less than 1000 Hz), psychological studies have determined that energy in the upper voiceband is necessary and critical to maintain the intelligibility of speech. As a result, telephone transmitters have been historically designed to supply an energy boost in the upper voiceband. A survey of commercially available telephone equipment indicates that an average transmitter characteristic can be approximated by a straight line with positive slope from 300 Hz to 3000 Hz over a log-frequency scale, with a response at 300 Hz down 5 dB relative to 1000 Hz and a response at 3000 Hz 5 dB higher relative to 1000 Hz. Since CAS frequencies, 2130 and 2750 Hz, are in the upper voiceband, transmitter pre-emphasis will place more speech energy in the signaling bands and create even more potential for talkoff that is not mitigated by the bridged tip and ring arrangement.
Tone signal detector talkdown is also a problem for the bridged tip and ring arrangement because near-end speech energy will often overwhelm the tone signal energy. In the case of CIDCW, for instance, the CAS is typically sent from the SPCSS at −15 dBm per tone. Attenuation due to the loop response can introduce up to 15 dB of loss in the 99 percentile case. Since near-end speech can combine with CAS, tip and ring CAS tone signal detectors will be exposed to a worst-case signal-to-speech ratio of −23 dB (−15 − 15 − (−7) dB, i.e., the transmitted tone level less the loop loss, relative to the worst-case speech level). Reliable detection of tone signals with such a poor signal-to-noise ratio is difficult, even for liberal detectors that make little attempt to reject signal imitations. With a tone signal detector employing the aforementioned guard principle, the signal-to-guard ratio qualification criteria would not be met in many instances of legitimate tone signals because near-end speech energy would significantly corrupt the signal.
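The straight-line transmitter approximation can be evaluated at the CAS frequencies with a small sketch. The function name and parameters are hypothetical; the line is drawn through the two endpoint responses quoted above, which places the response at 1000 Hz about 0.2 dB above the nominal reference rather than exactly on it.

```python
import math

def transmitter_gain_db(freq_hz, f_low=300.0, f_high=3000.0,
                        gain_low_db=-5.0, gain_high_db=5.0):
    """Gain relative to 1000 Hz on a straight line over log-frequency.

    The line passes through (f_low, gain_low_db) and (f_high, gain_high_db);
    the defaults are the surveyed endpoint values quoted in the text.
    """
    slope = (gain_high_db - gain_low_db) / math.log10(f_high / f_low)  # dB per decade
    return gain_low_db + slope * math.log10(freq_hz / f_low)

# Pre-emphasis applied to speech energy at the two CAS frequencies.
cas_boost_db = {f: transmitter_gain_db(f) for f in (2130.0, 2750.0)}
```

Under this approximation both CAS frequencies receive a boost of several dB, with 2750 Hz boosted more than 2130 Hz.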
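The level budget behind these figures can be checked with a brief sketch. The function and parameter names are hypothetical; the default values are the ones quoted in the text, and `speech_attenuation_db` is an added parameter for any attenuation of near-end speech ahead of the detector (zero for the bridged arrangement).

```python
def worst_case_speech_dbm(mean_asl_dbm=-19.0, sigma_db=4.0, n_sigma=3.0):
    """Upper-limit near-end speech level: the mean plus n standard deviations."""
    return mean_asl_dbm + n_sigma * sigma_db

def signal_to_speech_db(tone_dbm, loop_loss_db, speech_dbm, speech_attenuation_db=0.0):
    """Tone level received after loop loss, relative to the speech level at the detector."""
    received_tone_dbm = tone_dbm - loop_loss_db
    speech_at_detector_dbm = speech_dbm - speech_attenuation_db
    return received_tone_dbm - speech_at_detector_dbm

speech = worst_case_speech_dbm()                   # -19 + 3 * 4 = -7 dBm ASL
ratio = signal_to_speech_db(-15.0, 15.0, speech)   # (-15 - 15) - (-7) = -23 dB
```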
As taught in Battista et al., U.S. Pat. No. 5,519,774, entitled “Method and System for Detecting at a Selected Station an Alerting Signal in the Presence of Speech”, tone signal detectors can be designed to provide good talkoff and talkdown performance for bridged tip and ring applications. However, the meticulous adjustment of detection parameters that is necessary to achieve the proper balance of talkoff and talkdown performance in these designs is a difficult and time-consuming process. Furthermore, there is no guarantee that the final detector design will be conducive to a specific manufacturing process.
In summary, the bridged tip and ring tone signal detector arrangement is a simple, non-intrusive method to access service signals, such as inband tone signals and on-hook CID data transmission signals. However, from the standpoint of tone signal detection, it is the most difficult arrangement to achieve good talkoff and talkdown performance because it does nothing to reduce the level of near-end speech incident upon a tone signal detector. The prior art has already established that tip and ring tone signal detectors with good talkoff and talkdown performance, while achievable, are extremely difficult to design and build.
A second common arrangement employed in conjunction with tone signal detectors that provides improved talkoff and talkdown performance without modifications to a tone signal detector's algorithm is illustrated in FIG. 2. In this system, the tone signal detector is located behind a device typically referred to as a hybrid.
The hybrid is a device that converts the bi-directional path on the tip and ring interface into two separate unidirectional paths for transmit and receive. Far-end and network signals on the tip and ring interface appear on the receive path where the tone signal detector is connected. Near-end signals are ideally transferred from the transmit path behind the hybrid to the tip and ring interface.
In practice, some leakage of near-end speech energy will occur across the hybrid and appear at the input to the tone signal detector. The amount by which the near-end energy at a given frequency is attenuated by the hybrid is known as the transhybrid loss. The transhybrid loss is a function of how well the impedance of the balance network matches the impedance presented by the tip and ring interface.
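The dependence of leakage on impedance match can be illustrated with the balance return loss, a textbook idealization that is not given in the source. In this sketch the function name is hypothetical, impedances are evaluated at a single frequency, and constant circuit losses of a practical hybrid are ignored, so the result should be read as a trend rather than as the transhybrid loss of any particular design.

```python
import math

def balance_return_loss_db(z_line, z_balance):
    """Idealized balance return loss, in dB, at one frequency.

    Accepts real or complex impedances in ohms. A perfect match gives
    infinite loss; the loss falls as the two impedances diverge.
    """
    mismatch = abs(z_line - z_balance)
    if mismatch == 0:
        return math.inf
    return 20.0 * math.log10(abs(z_line + z_balance) / mismatch)
```

For example, a 600 ohm line against a 900 ohm balance network gives 20·log10(1500/300), roughly 14 dB, whereas a 650 ohm balance network against the same line gives a noticeably higher figure.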
The amount of transhybrid loss is critical to the performance of the tone signal detector in this arrangement because the transhybrid loss effects a reduction in the level of near-end speech incident upon the tone signal detector. Attenuation of the near-end speech level is useful because it dually reduces the probability of a talkoff occurrence and the probability that near-end speech will corrupt an incoming CAS. With a 6 dB transhybrid loss, for example, the level of near-end speech appearing at the tone signal detector input will be reduced from −7 to −13 dBm ASL and the signal-to-speech ratio will improve from −23 to −17 dB over the bridged tip and ring arrangement. Experimentation and experience have demonstrated that a reduction of 3 dB in near-end speech level or a similar improvement in signal-to-speech ratio dramatically improves the talkoff and talkdown performance of a tone signal detector similar to that described in Battista et al. Furthermore, a key design benefit of the hybrid arrangement is that it makes balancing the tradeoff between talkoff and talkdown performance less difficult because the dynamic swing of the tone signal detector, which is defined as the difference in dB between the worst case speech level and the worst case tone level, has been reduced.
Because transhybrid loss rapidly decreases as the match between the line impedance and the balance network diverges, a single network may not provide a suitable degree of transhybrid loss across the large majority of loop conditions. With a single balance network, for instance, the worst case transhybrid loss can range from 2 to 6 dB over the domain of all loop impedances in the U.S. network. To obtain further reduction in near-end speech level and improve the signal-to-speech ratio, the single balance network may be replaced by multiple, fixed networks or an adjustable network as illustrated in FIG. 3. This arrangement is sometimes referred to as an analog echo canceller.
Multiple balance networks or an adjustable balance network provide significant improvement in transhybrid loss over a single network system. Transhybrid losses of greater than 15 dB could usually be achieved using at least three fixed networks. Because more than one balance network is available, the architecture must also include a mechanism (not shown) to select the optimal network for the loop condition encountered.
Although favorable from the standpoint of tone signal detector performance, arrangements like those illustrated in FIGS. 2 and 3 have certain disadvantages. First, traditional hybrid architectures are well suited for integrated telephone applications where separation of the speech path is inherently needed to provide the handset receiver and transmitter functions. For devices like telephone adjuncts, these systems are less practical. Adjunct devices are usually electrically connected in series with a station set and must therefore be capable of passing basic telephone line attributes such as DC voltage, line current, AC signals, and power ringing. To that extent, it is common practice to employ the bridged tip and ring solution previously described because the tip and ring interface physically passes through the adjunct unimpeded. To adapt a hybrid arrangement like those in FIGS. 2 and 3 for an adjunct, two hybrids must be placed back-to-back so that the two wire interface is regenerated for connection to a subscriber's telephone set. Additional circuitry is needed to either regenerate DC line voltage and power ringing or provide a means to route such signals around the back-to-back hybrids arrangement. This arrangement then becomes similar to a network repeater circuit where transmission characteristics of the repeater that affect the quality of the voice channel and factors like closed loop gain must be carefully engineered to avoid unstable device operation and provide a transparent line interface. For these reasons, the traditional hybrid solution useful in integrated telephone sets is not very practical for low cost adjuncts.
Another important consideration for the hybrid systems in FIGS. 2 and 3 is the provisioning of sidetone in integrated station sets. Traditionally, a certain amount of transhybrid leakage was intentionally designed into telephone sets to allow users to hear an attenuated version of their own speech. Psychologically, this provides the subscriber with the impression that the station set is operational. As a result, transhybrid losses were adjusted to provide no more than 6 dB of loss to satisfy the human factors requirements for sidetone. For tone signal detector performance and system design, this presents a disadvantage. In order to increase the transhybrid loss of the arrangements in FIGS. 2 and 3, a secondary circuit is needed to provide an alternate path for sidetone.
There is a third disadvantage to the arrangements in FIGS. 2 and 3, especially for integrated station set applications. There are instances when the functional elements of the station set may need access to the AC signals on the tip and ring interface even though the station set is in the on-hook condition. Two such identifiable instances include support for Multiple Extension Interworking (MEI) and on-hook services such as CID.
MEI is a signaling method and protocol for communication among CPEs on a subscriber's line that enables three functions: 1) the reception of CIDCW by all compatible CPE, regardless of their individual hook state; 2) the generation of customer line signals, such as Flash, to indicate selection of a call control action; and 3) the management of CAS acknowledgment signaling interactions among multiple CIDCW, CWD, and ADSI CPE. In order to perform the MEI protocol, a CPE must be able to detect a CAS while it is on-hook. With the hybrid systems depicted in FIGS. 2 and 3, the hybrid function is generally disconnected from the line interface by the hook switch function when the subscriber's set is in the on-hook condition. Consequently, the tone signal detector, being on the receive side of the hybrid, will lose access to the tone signals on the tip and ring interface. To overcome this limitation, even further additional circuitry is required to provide an alternate signal path to the tip and ring interface while the CPE is on-hook.
Another similar disadvantage that is readily identifiable in the arrangements depicted in FIGS. 2 and 3 is the difficulty of supporting on-hook services such as CID. On-hook CID services, like Calling Number Delivery (CND), Calling Name Delivery (CNAM) and Visual Message Waiting Indicator (VMWI), deliver data using the same Frequency Shift Keying (FSK) modulation technique as off-hook CIDCW and CWD services. The desire for modular CID functional elements that perform all the necessary procedures of both the on-hook and off-hook data transmission protocols in Bellcore's document GR-30-CORE, “Voiceband Data Transmission Interface”, Issue 1, December 1994, has led to the fabrication of Application Specific Integrated Circuits (ASICs), herein referred to as CID ASICs. These devices combine the FSK demodulation and CAS tone signal detection functions onto a single device. For reasons that include providing universal applicability to adjuncts and integrated sets alike, minimizing complexity and device pin count reduction, a single device input on CID ASICs must be shared for both on-hook and off-hook CID services. With the hybrid arrangements illustrated in FIGS. 2 and 3, the reduction in circuit complexity offered by CID ASICs is partially offset by the need for external circuitry and control that provides multiple signal paths to access the tip and ring interface depending upon the hook condition of the CPE. It is a highly desired feature for a CID ASIC to allow the device to be inserted into any design without impacting or requiring specific circuitry, or imposing performance criteria on other aspects of the system architecture.
A third arrangement that also builds upon the systems depicted in FIGS. 2 and 3, yet provides significant improvements in the cancellation of near-end speech is shown in FIG. 4. In combination with a hybrid, a digital echo canceller can be employed to increase the transhybrid loss to 25 dB or more. The primary benefit of a digital echo canceller is that it practically eliminates any chance of near-end talkoff and talkdown because it highly attenuates the near-end speech echo.
In addition to those cited for the hybrid systems in FIGS. 2 and 3, the prime disadvantage of this speech cancellation system is the significant resources and interface circuitry required. Typical implementations of digital echo cancellers require an optimized microprocessor to perform the mathematical operations that remove the near-end echo, interface circuitry to digitize analog signals and memory code storage support. If the tone signal detector is implemented external to the echo canceller as illustrated in FIG. 4, an additional digital-to-analog converter is necessary. For these reasons, digital echo canceller implementations have not yet become practical for low cost adjunct and integrated telephones.
A fourth arrangement that has been attempted to cancel near-end speech using a scaled Wheatstone bridge circuit is illustrated in FIG. 5. In U.S. Pat. No. 5,796,810, filed Oct. 10, 1995, issued Aug. 18, 1998, and entitled “Apparatus For Dialing Of Called ID Block Code and Receiving Call Waiting Caller-ID-Signal”, Lim et al. disclose a Wheatstone bridge circuit as illustrated in FIG. 5. This arrangement employs the Wheatstone bridge principle where if the balance network identically matches the impedance of the loop and fixed resistors Ra and Rb are identical, the near-end speech signals arriving at the input to the differential amplifier G from the two circuit legs will be identical in magnitude and phase. The differential amplifier will subtract these signals from each other and produce a resultant signal that is input to the tone signal detector containing the residual energy of the near-end speech cancellation process. In practice, resistance Rb is scaled to a factor C greater than resistance Ra to reduce loading effects on the tip and ring interface. Likewise, the single balance network impedance is scaled by the same factor.
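The balance condition can be illustrated with a simplified model. The topology assumed here is not taken from FIG. 5: it supposes that the near-end voltage drives two voltage dividers, one formed by Ra in series with the loop impedance and one formed by Rb in series with the scaled balance network, and that the differential amplifier (taken as unity gain) senses the difference between the two midpoints. All names and component values are hypothetical.

```python
def bridge_residual(v_near, z_loop, r_a, c, r_b=None, z_balance=None):
    """Differential voltage between the two bridge midpoints for a near-end drive.

    r_b defaults to the ideal scaled value c * r_a, and z_balance defaults to
    a perfect copy of z_loop; the balance network is scaled by the same factor c.
    Impedances may be real or complex (single-frequency) values in ohms.
    """
    r_b = c * r_a if r_b is None else r_b
    z_bal_scaled = c * (z_loop if z_balance is None else z_balance)
    v_line_leg = v_near * z_loop / (r_a + z_loop)                 # midpoint of Ra / loop leg
    v_mirror_leg = v_near * z_bal_scaled / (r_b + z_bal_scaled)   # midpoint of Rb / balance leg
    return v_line_leg - v_mirror_leg

balanced = bridge_residual(1.0, 600.0, 100.0, 10.0)                     # ideal components
off_network = bridge_residual(1.0, 600.0, 100.0, 10.0, z_balance=900.0) # balance network mismatch
off_resistor = bridge_residual(1.0, 600.0, 100.0, 10.0, r_b=1050.0)     # Rb 5% above c * Ra
```

In this model the residual vanishes only when both Rb and the balance network track their ideal scaled values; either a resistor tolerance or a mismatched balance network leaves near-end energy at the amplifier input.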
Although this arrangement cancels near-end speech and provides access to the tip and ring interface regardless of station set hook state, it performs poorly in practice over the domain of loop impedances. The reason for its poor performance is two-fold. First, the fixed impedances Ra and Rb are subject to component tolerances and consequently are never identically matched. This results in an imbalance in the bridge that is amplified by the differential amplifier. Second, the single, fixed balance network employed in the circuit provides a poor match over the domain of possible loop impedances. Experimentation has demonstrated that the worst case near-end speech cancellation performance of the Wheatstone bridge arrangement in FIG. 5 is about 1 to 2 dB. Because of its inadequate performance, the Wheatstone bridge arrangement has often been ignored.
The review of the prior art has established that talkoff and talkdown performance of a tone signal detector can be significantly improved by attenuating the level of incident near-end speech. It has further established that most prior art near-end speech cancellation techniques require system architectures that remove the tone signal detector from the tip and ring interface and place it at a location that does not generally have access to line signals when the station set is on-hook without additional signal paths. One prior art cancellation method does provide access to tip and ring regardless of hook-state; however, its cancellation performance is poor.
In view of the foregoing, it is an object of the present invention to provide a method and system to cancel near-end speech energy for tone signal detectors that connect to the tip and ring interface using an improved Wheatstone bridge technique that also provides access to on-hook service signals regardless of the hook state of subsequent communications equipment. The method and system operate independently of other telephony functions and can be applied in standalone adjunct devices as well as integrated into a telephone set. The degree of near-end speech cancellation is controllable by scaling the system implementation to achieve the desired amount of near-end speech attenuation.
Specifically, the system uses a voltage or current sensing element placed in series with either the tip or ring interface lead. A scaled mirror impedance of both the sensing element and impedance presented by the tip and ring interface is then placed across the tip and ring interface to form a Wheatstone bridge. Rather than create two bidirectional paths, only a single receive path is differentially extracted from the center of the bridge for input to a tone signal detector. Attenuation of near-end speech energy is controlled by the calibration and selection of the scaled mirror impedance values which are available from either a fixed set of R, L and C networks or an adjustable network. A controlling function uses one of several methods described to select the best network either at the time that the device is connected to the line, at the start of every telephone call or by continuously adapting throughout the duration of a call.
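One way a controlling function might choose among a fixed set of networks is sketched below. The selection methods themselves are not detailed in this section, so the minimum-residual-energy criterion, the function names, and the `measure_residual` callback (standing in for whatever hardware switches in a network and samples the bridge output while only near-end energy is present) are assumptions made for illustration.

```python
def mean_square(samples):
    """Average energy per sample of the residual observed at the bridge output."""
    return sum(s * s for s in samples) / len(samples)

def select_network(network_ids, measure_residual):
    """Return (best_id, energies) for the network leaving the least residual energy.

    `measure_residual(network_id)` is a hypothetical callback that selects the
    given mirror network and returns a list of residual samples.
    """
    energies = {nid: mean_square(measure_residual(nid)) for nid in network_ids}
    best_id = min(energies, key=energies.get)
    return best_id, energies
```

The same routine could be invoked once at line connection, at the start of each call, or repeatedly during a call, corresponding to the three calibration times named above.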