"Computer...." This simple voice command, long the exclusive domain of science fiction as a way of initiating an exchange between a person and a computer, is fast becoming the preferred method of accessing a computer. Speech recognition software that runs in real time on personal computers (PCs) is now readily available. These software programs allow the PC user to manipulate the PC, open software programs, perform functions, dictate letters and other documents, and perform any number of other tasks without keyboards, mice or other pointing/selecting devices. Instead, the software responds to the user's beck and call, performing tasks by recognizing spoken words.
These software programs have seen, and will continue to see, application on PCs. Already, users can call into their computers from remote locations and access documents, files and data. Soon the remote user will be able, through oral commands given over a telephone, data link or other type of connection, to manipulate documents, files and data or to cause the computer to communicate them to other locations via electronic transfer or facsimile transmission.
Speech recognition is also fast entering the commercial domain. In phonetics, speech is what is spoken by people; it is composed of "voiced" and "unvoiced" sounds, voiced sounds being those produced at least in part with the vocal cords. Many computerized telephone answering systems already recognize spoken words for answering and transferring incoming calls. Soon a bank depositor will be able to call into his bank and access his account through spoken commands. In other applications, consumers may use speech recognition to manage investments, make purchases, and perform any number of transactions that once required a keyboard and/or numeric pad for data entry.
The notion of recognizing and authorizing a transaction is known for hardware applications. For example, in a mobile radiotelephone communication system, the communication system queries and authenticates the hardware, i.e., the mobile station, as it attempts to access the network. Similarly, a peripheral device, such as a printer, terminal, modem or the like, is recognized by a device name and potentially a password when connected to a computer network.
Speech recognition, however, particularly as its capabilities are enhanced and adapted for facilitating financial transactions, raises serious security concerns. The user will access the system using only spoken commands, frequently over unsecure connections. The system must not only be capable of recognizing the user's speech, but it must also be capable of authenticating the user from the received speech sample before transactions can be approved. In addition, for enhanced security, the system must assess the capability of the connection and its security potential. The recognition and authentication problem is exacerbated because the source of the communication may be bandwidth limited, noisy or otherwise unsuitable for proper speech recognition. That is, the speaker may not be in a quiet environment speaking into a calibrated microphone securely linked to the system being accessed. Instead, the user may be calling from a remote location via radio, an unsecure telephone network or the Internet.
User names, passwords and personal identification numbers (PINs) provide a fairly effective level of security. It is well known, however, that secret passwords may be intercepted and used by unscrupulous individuals to access the user's accounts. It is also known that individuals have unique speech prints, similar to fingerprints, and that these speech prints may be used to positively identify a person. Speech identification technology is available today; however, it is generally limited to applications where a very clean sample of the user's speech is available. Where a user is accessing the computer system via a telephone network, the speech sample may not be of sufficient quality to effect authentication with sufficient confidence. Hence, there is a need for a system and method for securing transactions conducted by spoken commands which accounts for the access media. In addition, there is a need to enhance speech recognition technology by adapting the speech recognition system to the equipment and media employed to access the speech recognition system.
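By way of illustration only, the following minimal sketch shows one way an authentication decision might account for the access media. It is not taken from the text above: the channel names, the threshold values, the `ChannelProfile` type and the `authorize` function are all hypothetical, and the speaker-match score is assumed to be supplied by some separate speech identification component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelProfile:
    """Hypothetical security settings for one type of access medium."""
    min_score: float        # assumed minimum speaker-match score (0.0 to 1.0)
    allow_voice_only: bool  # whether a voice match alone may approve a transaction


# Illustrative values only; a real system would have to derive these empirically.
PROFILES = {
    "calibrated_microphone": ChannelProfile(min_score=0.80, allow_voice_only=True),
    "telephone": ChannelProfile(min_score=0.90, allow_voice_only=True),
    "internet": ChannelProfile(min_score=0.95, allow_voice_only=False),
}


def authorize(channel: str, match_score: float, second_factor_ok: bool = False) -> str:
    """Return "approve", "step_up" or "deny" for a spoken transaction request.

    channel          -- hypothetical label for the access medium
    match_score      -- speaker-match score from an assumed external component
    second_factor_ok -- True if an additional credential (e.g. a PIN) was verified
    """
    profile = PROFILES.get(channel)
    if profile is None:
        return "deny"      # unknown access media are never trusted
    if match_score < profile.min_score:
        return "deny"      # speech sample does not match well enough for this medium
    if profile.allow_voice_only or second_factor_ok:
        return "approve"
    return "step_up"       # voice matched, but this medium requires a second factor


if __name__ == "__main__":
    print(authorize("calibrated_microphone", 0.85))
    print(authorize("telephone", 0.85))
    print(authorize("internet", 0.97))
```

In this sketch the same match score of 0.85 is accepted from a calibrated microphone but rejected over a telephone network, reflecting the lower confidence that a degraded speech sample affords.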