Telephone toll-fraud, particularly fraud initiated via security violations of customer premises equipment, has become a significant expense to corporations throughout the world. For example, private branch exchange ("PBX") switches are often a target for thieves who steal telephone services because a PBX usually supports a "direct inward system access" ("DISA") feature. In such a system, authorized personnel can call into the PBX, log in using an access code, and execute various PBX functions, including dialing long-distance calls. Thieves usually obtain a PBX's DISA dial number and access code through direct theft, misrepresentation, or computer-based techniques.
Another common fraud activity is the unauthorized use of telephone calling cards. Thieves can illegally obtain a user's calling card number in a variety of ways, e.g., by watching a user make a call and noting the numbers dialed. Thereafter, the thief can make long-distance calls using the stolen number. It can take days or even weeks to detect such fraudulent use of a card, during which time a considerable number of illegal calls can be made.
Recent advances in voice signal processing and automatic speaker recognition systems have been used to address telephone fraud. "Voice" is defined herein as any audible utterance by a human. Typical automatic speaker recognition systems operate in one of two modes, referred to as "verification mode" and "identification mode" in A. E. Rosenberg and F. K. Soong, "Recent Advances in Automatic Speaker Recognition", Advances in Speech Signal Processing, pp. 701-738 (Marcel Dekker, Inc., 1992) (incorporated herein by reference).
In the "verification mode" of an automatic speaker recognition system, an identity claim is presented to the system along with a voice sample from an unknown speaker. A speaker verification decision results from comparing the voice sample with a stored voice model for the speaker whose identity is claimed and determining whether the difference is within a given threshold value. If the difference is within the threshold, then the identity claim is confirmed and the speaker may be given access to a restricted telephone network.
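The verification decision described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a voice sample and a stored voice model are each reduced to a fixed-length feature vector and that the "difference" is the Euclidean distance between them; the function name `verify_speaker`, the dictionary of models, and the threshold value are hypothetical and are not taken from any particular system.

```python
import math

def verify_speaker(sample, claimed_id, models, threshold):
    """Return True if the sample is within `threshold` of the claimed speaker's model.

    `sample` and each stored model are assumed to be equal-length feature
    vectors (an illustrative simplification of a real voice model).
    """
    model = models.get(claimed_id)
    if model is None:
        # No stored model for the claimed identity: the claim cannot be confirmed.
        return False
    # Euclidean distance stands in for the "difference" between sample and model.
    difference = math.dist(sample, model)
    return difference <= threshold

# Hypothetical stored model and samples.
models = {"alice": [1.0, 2.0, 3.0]}
print(verify_speaker([1.1, 2.0, 2.9], "alice", models, threshold=0.5))
print(verify_speaker([4.0, 6.0, 3.0], "alice", models, threshold=0.5))
```

In this sketch only the model for the claimed identity is consulted, which is what distinguishes verification from a search over all known speakers.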
For a speaker recognition system operating in the "identification mode," a voice sample from an unknown speaker is compared with voice models of a population of known speakers. A speaker identification decision indicates whether the voice sample matches one of the models of known speakers to within a given threshold value.
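The identification decision can be sketched in the same spirit. The example below assumes, for illustration only, that each known speaker's voice model is a fixed-length feature vector and that the closest model by Euclidean distance is the candidate match; the name `identify_speaker` and the sample values are hypothetical.

```python
import math

def identify_speaker(sample, models, threshold):
    """Return the ID of the closest known speaker, or None if no model is within `threshold`.

    `models` maps speaker IDs to feature vectors of the same length as `sample`
    (an illustrative simplification of real voice models).
    """
    best_id = None
    best_difference = math.inf
    for speaker_id, model in models.items():
        difference = math.dist(sample, model)
        if difference < best_difference:
            best_id, best_difference = speaker_id, difference
    # Accept the best match only if it falls within the threshold.
    return best_id if best_difference <= threshold else None

# Hypothetical population of known speakers.
models = {"alice": [1.0, 2.0, 3.0], "bob": [5.0, 5.0, 5.0]}
print(identify_speaker([4.9, 5.1, 5.0], models, threshold=0.5))
print(identify_speaker([9.0, 9.0, 9.0], models, threshold=0.5))
```

Unlike the verification case, no identity claim is supplied; the sample is compared against every model in the population.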
Systems for detecting and controlling telephone toll-fraud can be classified into three categories: aggressive, reactive, and passive. Aggressive systems are designed to control access to the telephone line by requiring the prospective user to present some identity claim information, such as a password, to be used for authentication before the user is granted access to the system. Such a system prompts the user to utter a spoken "password," compares it to authorized voice profiles, and grants or refuses access accordingly. This is a "text-dependent" system since it analyzes a specific word (i.e., text). Examples of such systems are "VoiceLock" from Moscom of Pittsford, N.Y. and "VVF" from Wye of Annapolis, Md. Less advanced aggressive systems use touch-tone entered passwords.
Reactive systems monitor protected telephone lines or switch use and keep a count of specified events such as the number of off-hour calls, international calls, very short calls, or very long calls. The goal is to issue warnings or alarms when appropriate threshold values are exceeded. Companies selling such reactive systems include Western Telematic of Irvine, Calif., XTEND of New York, N.Y., and Infotext of Schaumburg, Ill.
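The event counting performed by a reactive system might be sketched as follows. All specifics here are assumptions made for illustration: the call-record fields (`start_hour`, `duration_seconds`, `international`), the definition of off-hours as before 07:00 or from 19:00 onward, the cutoffs of 10 seconds and 1 hour for very short and very long calls, and the function names are hypothetical and do not describe any vendor's product.

```python
from collections import Counter

# Assumed cutoffs, chosen only for illustration.
OFF_HOURS_START = 19         # calls starting at or after 19:00 count as off-hour
OFF_HOURS_END = 7            # calls starting before 07:00 count as off-hour
VERY_SHORT_SECONDS = 10
VERY_LONG_SECONDS = 3600

def classify_call(call):
    """Return the list of monitored event names that one call record triggers."""
    events = []
    if call["start_hour"] >= OFF_HOURS_START or call["start_hour"] < OFF_HOURS_END:
        events.append("off_hour")
    if call["international"]:
        events.append("international")
    if call["duration_seconds"] < VERY_SHORT_SECONDS:
        events.append("very_short")
    if call["duration_seconds"] > VERY_LONG_SECONDS:
        events.append("very_long")
    return events

def check_alarms(calls, thresholds):
    """Count events over all calls and return the sorted names whose count exceeds its threshold."""
    counts = Counter()
    for call in calls:
        counts.update(classify_call(call))
    return sorted(name for name, limit in thresholds.items() if counts[name] > limit)

# Hypothetical call records and thresholds.
calls = [
    {"start_hour": 2, "duration_seconds": 5, "international": True},
    {"start_hour": 23, "duration_seconds": 4000, "international": False},
    {"start_hour": 10, "duration_seconds": 120, "international": True},
]
thresholds = {"off_hour": 1, "international": 2, "very_short": 5, "very_long": 5}
print(check_alarms(calls, thresholds))
```

An alarm is raised here only when a count strictly exceeds its threshold, matching the description of warnings being issued when threshold values are exceeded.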
Finally, passive systems are those which discover toll-fraud through postmortem analysis of call accounting records.