1. Technical Field
The present invention pertains to systems facilitating secure network communications. In particular, the present invention pertains to an apparatus or system facilitating secure network communications for users accessing the network via voice responsive interfaces.
2. Discussion of the Related Art
Generally, computer systems are utilized to access and navigate through a communications network, such as the Internet. These computer systems each typically include an input device (e.g., keyboard, mouse, etc.) and network navigation software (e.g., a browser) to traverse the network and communicate with various network sites. In order to prevent unauthorized access to information transmitted over the network, secure communication techniques may be employed that typically utilize certificates and private keys to verify a user identity and encrypt transferred information. The certificate is issued to a user by a certificate authority that vouches for the identity of the particular user receiving the certificate. The certificate includes a public key and other identification information for the user and is stored along with a private key on a user computer system. When information is requested by the user computer system from a secure network site or web server system, a secure key exchange is negotiated between the user computer system and web server system, typically utilizing a public-key/private-key scheme. Certificates for the user computer system and web server system are initially transferred to provide each system with the other's public key for performing the key exchange. The key exchange uses shared data together with the public and private keys to allow both participants in the connection to generate a common session key, which is then used to encrypt and decrypt subsequent information transferred between the user computer system and web server system to provide a secure session. An exemplary protocol for this type of secure information transfer is the Secure Sockets Layer (SSL) protocol.
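The derivation of a common session key from shared data and each party's public and private keys can be illustrated with a minimal sketch. The example below assumes a toy Diffie-Hellman style agreement, which is only one possible key exchange and is not necessarily what a given SSL session negotiates. The parameters `P` and `G` and the function names are hypothetical, and the values are far too small for real security.

```python
import hashlib
import secrets

# Shared public data known to both sides (illustrative toy values only).
P = 2**61 - 1   # a Mersenne prime, much too small for real use
G = 3

def make_keypair():
    """Return a (private, public) pair for one participant."""
    private = secrets.randbelow(P - 3) + 2      # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public

def derive_session_key(own_private, peer_public):
    """Combine our private key with the peer's public key into a session key."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(8, "big")).digest()

# The user computer system and the web server system each create a key pair
# and exchange only the public halves.
client_private, client_public = make_keypair()
server_private, server_public = make_keypair()

client_key = derive_session_key(client_private, server_public)
server_key = derive_session_key(server_private, client_public)
```

Both sides arrive at the same 32-byte value without ever transmitting a private key, which is the property the session key relies on.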
Further, network sites may employ security measures to verify authorized users and control access to the sites and/or site information. One such technique includes utilization of account or user names and corresponding passwords to control access to a network site. This technique may be utilized for each independent site, or a single user name and password may be utilized for multiple sites. For example, the Passport protocol, available from Microsoft Corporation, permits access to multiple web sites based on a single user name and password. The protocol includes a client computer system employing a browser, a merchant server and a protocol server. The protocol server maintains authentication and profile information for a client and provides the merchant server with access to this information when permitted by the client. In operation, a client, via the client computer system, accesses a merchant site requiring client authorization. The client system is redirected by the merchant server to the protocol server where the client provides the appropriate user name and password (e.g., “logs in” to the protocol server). This interaction utilizes the Secure Sockets Layer (SSL) protocol. The protocol server redirects the client system to the merchant site and provides the client system with encrypted authentication information for that site. The authentication information is encrypted using a triple Data Encryption Standard (DES) technique having a key previously established between the merchant server and protocol server. The merchant server verifies the client based on the authentication information, and stores an encrypted file (e.g., cookie file) in the client system to enable authentication of the client by the merchant server for subsequent visits to that site (e.g., without repeating the protocol server login procedure). 
In addition, the protocol server similarly stores an encrypted file (e.g., cookie file) on the client system to enable authentication of the client by the protocol server for other sites (e.g., without repeating the login procedure). The Passport protocol further enables a client to provide personal and credit card information for selective transfer to multiple servers for purchasing products over the network.
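The redirect flow described above, in which the protocol server hands the client authentication information that only the merchant server can validate, may be sketched as follows. This is an illustration under stated assumptions, not the Passport implementation: the source describes triple DES encryption, whereas this sketch substitutes an HMAC-SHA256 tag (integrity only, no confidentiality) because it is available in the Python standard library. The key value, ticket layout, and function names are hypothetical.

```python
import hashlib
import hmac
import json
import time

# Key previously established between merchant server and protocol server
# (hypothetical value, shared out of band).
SHARED_KEY = b"hypothetical-merchant-protocol-key"

def issue_ticket(user_id, now=None):
    """Protocol server: build a ticket after the client has logged in."""
    issued = int(now if now is not None else time.time())
    payload = json.dumps({"user": user_id, "issued": issued}, sort_keys=True).encode()
    tag = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    return payload.hex() + "." + tag

def verify_ticket(ticket, max_age=300, now=None):
    """Merchant server: return the user id if the ticket is valid and fresh, else None."""
    try:
        payload_hex, tag = ticket.split(".")
        payload = bytes.fromhex(payload_hex)
    except ValueError:
        return None                     # malformed ticket
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return None                     # not produced with the shared key
    data = json.loads(payload)
    current = now if now is not None else time.time()
    if current - data["issued"] > max_age:
        return None                     # too old; the client must log in again
    return data["user"]
```

In this sketch the client merely carries the ticket between the two servers; it never needs to know the shared key.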
In addition, voice verification may be utilized in various systems and may be implemented by varying techniques to provide appropriate security. For example, co-pending U.S. patent application Ser. No. 08/960,509, entitled “Voice Authentication System” and filed Oct. 28, 1997, discloses a speaker authentication system operable in first and second modes. The first mode facilitates enrollment of users, while the second mode verifies that a person is a particular authorized user. The system includes a user interface and a verification module. The interface facilitates communication between a user and the verification module and operates in the first mode to prompt the user to utter a first set of phrases for enrolling the user. The user interface further prompts a user seeking verification in the second mode to utter a randomized second set of phrases corresponding to the first phrase set. The verification module generates voice models corresponding to the first set of speech utterances received from the user in the first mode and compares the voice models in the second mode to the randomized second set of speech utterances to verify that the user is a particular authorized user. The system may control remote computer access or access to information on network sites based on verification of user utterances.
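The two operating modes can be sketched in a few lines. The example assumes that the "randomized second set of phrases" is a random selection drawn from the enrolled phrase set; the referenced application may define the correspondence differently. Voice modeling itself is omitted, and the names `enroll` and `build_challenge` are hypothetical.

```python
import random

def enroll(phrases):
    """First mode: record the phrase set the user was prompted to utter.

    A real system would also store the voice models generated from the
    utterances; only the phrase list is kept in this sketch.
    """
    return {"phrases": list(phrases)}

def build_challenge(enrollment, count, rng=None):
    """Second mode: choose a randomized set of phrases from the enrolled set."""
    rng = rng or random.Random()
    return rng.sample(enrollment["phrases"], count)

enrollment = enroll(["blue seven", "north gate", "river stone", "paper moon", "quiet field"])
challenge = build_challenge(enrollment, 3)
```

Because the prompt changes on each attempt, a recording of an earlier verification session is unlikely to match the phrases requested later.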
U.S. Pat. No. 5,339,385 (Higgins) discloses a speaker verification system that accepts or rejects the claimed identity of an individual based on an analysis of the individual's utterances. The individual is prompted to speak test phrases selected randomly and composed of words from a small vocabulary. The system determines nearest-neighbor distances between speech frames derived from the spoken test phrases and speech frames of corresponding vocabulary words from previously stored utterances of an enrolled speaker. In addition, distances between the spoken test phrases and corresponding vocabulary words for a set of reference speakers are determined by the system. The claimed identification is accepted or rejected based on the relationship of the determined distances to a predetermined threshold.
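A nearest-neighbor comparison of this general kind can be sketched as follows. This is a simplified illustration and not the method of the cited patent: speech frames are assumed to be short numeric feature vectors, distances are Euclidean, and the decision rule (the claimed speaker must beat the best reference speaker by more than a threshold) is an assumption. All function names and data values are hypothetical.

```python
import math

def nearest_neighbor_distance(frame, stored_frames):
    """Distance from one test frame to its closest stored frame."""
    return min(math.dist(frame, stored) for stored in stored_frames)

def mean_nn_distance(test_frames, stored_frames):
    """Average nearest-neighbor distance over all test frames."""
    total = sum(nearest_neighbor_distance(f, stored_frames) for f in test_frames)
    return total / len(test_frames)

def verify(test_frames, enrolled_frames, reference_speakers, threshold):
    """Accept the claimed identity if it fits better than every reference speaker."""
    claimed = mean_nn_distance(test_frames, enrolled_frames)
    best_reference = min(mean_nn_distance(test_frames, frames)
                         for frames in reference_speakers)
    return (best_reference - claimed) > threshold

# Hypothetical two-dimensional feature vectors.
enrolled = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1)]
references = [[(5.0, 5.0), (5.2, 4.8)], [(-3.0, 2.0), (-2.8, 2.1)]]

genuine_attempt = [(1.1, 1.0), (1.0, 0.95)]
impostor_attempt = [(5.1, 4.9), (4.9, 5.1)]

print(verify(genuine_attempt, enrolled, references, threshold=1.0))   # True
print(verify(impostor_attempt, enrolled, references, threshold=1.0))  # False
```

Comparing against reference speakers as well as the enrolled speaker normalizes the score, so that a noisy channel that inflates all distances does not by itself cause a rejection.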
U.S. Pat. No. 5,414,755 (Bahler et al) discloses a method for passive voice verification in a telephone network. A telephone long distance service is provided using speaker verification to determine the validity of a user. The user claims an identity by providing an identification, typically a calling card number, to a telephone. Unrestricted, extemporaneous speech of a group of customers is digitized, analyzed and characterized as a non-parametric set of speech feature vectors. The extemporaneous speech of the user is digitized and analyzed in a similar manner. The user identity is verified by comparing a reference utterance of a known customer with utterances from one or more unknown users, one of which is the user claiming the identity of a known customer. The comparison results in a decision to accept or reject the claimed identity, where the identity to be tested is derived from the calling card number.
U.S. Pat. No. 5,806,040 (Vensko) discloses a speech controlled verification system for verifying the identity of a person using a telephone calling card, bank card or other credit card. The system connects the person to a telephone network to enter the card number. The card number is utilized to access a central database and retrieve a voice verification template corresponding to the entered card number. The system prompts the user to state one of the words, phrases and/or numbers contained in the retrieved voice verification template, and compares the stated words to the template. If the stated words match the template, the user is considered to be an authorized user and the card is validated.
U.S. Pat. No. 5,937,781 (Huang et al) discloses a voice verification system for telephone transactions. The system includes a mechanism to prompt the user to speak in a limited vocabulary, and a feature extractor that converts the limited vocabulary into a plurality of speech frames. A pre-processor is coupled to the feature extractor for processing the speech frames to produce a plurality of processed speech frames, while a frame label is assigned to each speech frame via a Viterbi decoder. The processed frames and frame labels are combined to produce a voice model that is compared to an authorized user voice model derived during a previous enrollment session. The user voice model is further compared with an alternative voice model set derived during previous enrollment sessions. The claimed identity is accepted when the user voice model more closely resembles the authorized user voice model than the alternative voice model set.
Voice technology may further be employed by network browsers to interact with users and enhance accessibility to networks. With respect to the Internet, voice responsive browsers permit users to call an Internet Service Provider (ISP) via telephone and navigate the Internet by voice commands. Web pages retrieved by a voice responsive browser generally include extended definitions to enable the voice responsive browser to process those pages. The definitions specify the audio to synthesize for transmission to a caller and the speech expected from the caller in response to a retrieved web page. Thus, the voice responsive browser provides audio to a caller describing the actions available for a web page and performs commanded actions in response to appropriate voice commands from the caller with respect to that web page (e.g., relating to web page buttons or other selections). An example of a voice browser is disclosed in U.S. Pat. No. 5,915,001 (Uppaluru).
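The role of the extended page definitions can be illustrated with a minimal sketch. The page table, its field names, and the `handle_utterance` function below are hypothetical and do not reflect the format used by any particular voice browser; speech synthesis and recognition are assumed to happen elsewhere, so the sketch works on text.

```python
# Each page carries the audio prompt to synthesize and the spoken commands
# it accepts, mapped to the page that each command navigates to.
PAGES = {
    "home": {
        "prompt": "Welcome. Say news or weather.",
        "commands": {"news": "news", "weather": "weather"},
    },
    "news": {
        "prompt": "Here are the headlines. Say home to go back.",
        "commands": {"home": "home"},
    },
    "weather": {
        "prompt": "Here is the forecast. Say home to go back.",
        "commands": {"home": "home"},
    },
}

def handle_utterance(page_id, recognized_text):
    """Return (next_page_id, text_to_speak) for a recognized caller utterance."""
    page = PAGES[page_id]
    target = page["commands"].get(recognized_text.strip().lower())
    if target is None:
        # Unrecognized command: stay on the page and repeat its prompt.
        return page_id, "Sorry, I did not understand. " + page["prompt"]
    return target, PAGES[target]["prompt"]

next_page, spoken = handle_utterance("home", "News")
```

Restricting recognition to the commands declared by the current page keeps the expected vocabulary small, which is what allows navigation by voice alone.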
The related art suffers from several disadvantages. In particular, when a telephone or similar device is utilized by a user to access a network via a voice responsive interface, the user does not have a computer system or memory for storing security information, such as a certificate and/or private key. This precludes use of the above-described techniques for secure network communications and restricts the network activities and navigational capabilities of the user. Although security information may be stored remotely, this exposes the security information to an increased risk of misappropriation, thereby allowing unauthorized users to improperly obtain security privileges to secure network sites and information.