The present invention is directed to communications systems and, more particularly, to methods and apparatus for providing voice dialing services.
Telephones, both mobile and land based, are a frequently used communications tool of modern society. While basic telephone service has remained generally unchanged in terms of its features for years, there is an ever increasing demand for new telephone services.
The demand for new telephone services is prompted by a desire to render telephones easier to use and/or to make them more efficient communication tools. The demand for new telephone services is also fueled by the desire of individual telephone companies to: distinguish the services they offer from those of their competitors; create new revenue sources; and/or expand existing revenue sources.
In order to provide enhanced telephone services, many telephone companies now implement a telephone communications network as an Advanced Intelligent Network (AIN), which has made it easier to provide a wide array of previously unavailable voice grade telephone service features. In an AIN system, telephone central offices, each of which serves as a signal switching point (SSP), detect one of a number of call processing events identified as AIN "triggers". An SSP which detects a trigger suspends processing of the call which activated the trigger, compiles a call data message and forwards that message via a common channel interoffice signaling (CCIS) link to a database system, such as a Service Control Point (SCP). The SCP may be implemented as part of an integrated service control point (ISCP). If needed, the SCP can instruct the central office (SSP) at which the AIN trigger was activated to obtain and forward additional information, e.g., information relating to the call. Once sufficient information about the call has reached the ISCP, the ISCP accesses stored call processing information or records (CPRs) to generate, from the received message data, a call control message. The call control message is then used to instruct the central office on how to process the call which activated the AIN trigger. As part of the call control message, an ISCP can instruct the central office to send the call to an outside resource, such as an intelligent peripheral (IP), using a send to outside resource (STOR) instruction. IPs are frequently coupled to SSPs to provide message announcement capabilities, voice recognition capabilities and other functionality which is not normally provided by the central office. The control message is normally communicated from the ISCP to the SSP handling the call via the CCIS link. Once the control message is received, the SSP completes the call in accordance with the instructions received in the control message.
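The trigger, query and response exchange described above can be illustrated with a minimal sketch. The class names, field names and action strings below are hypothetical, and keying the call processing records by calling number is an assumption made only for illustration; the sketch does not model CCIS signaling or any actual AIN message format.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class CallDataMessage:
    """Message an SSP compiles after detecting a trigger (hypothetical fields)."""
    trigger: str
    calling_number: str
    dialed_digits: str


@dataclass
class CallControlMessage:
    """Instruction returned to the SSP (hypothetical fields)."""
    action: str        # "route" or "send_to_outside_resource" (STOR)
    destination: str   # telephone number or intelligent peripheral identifier


def scp_handle(query: CallDataMessage,
               cprs: Dict[str, CallControlMessage]) -> CallControlMessage:
    """Look up a call processing record (CPR) for the caller.

    Assumption: CPRs are keyed by calling number. When no CPR exists,
    the call is simply routed to the dialed digits.
    """
    default = CallControlMessage("route", query.dialed_digits)
    return cprs.get(query.calling_number, default)
```

For example, a CPR that maps a subscriber's line to `CallControlMessage("send_to_outside_resource", "VD-IP-1")` would cause the suspended call to be sent to a voice dialing IP, while calls from other lines would be routed normally.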
One service which can be implemented with AIN functionality is Wide Area Centrex. Centrex takes a group of normal telephone lines and provides call processing to add business features to the otherwise standard telephone lines. For example, Centrex adds intercom capabilities to the lines of a specified business group so that a business customer can dial other stations within the same group, e.g., lines belonging to the same company, using extension numbers such as two, three, or four digit numbers, instead of the full telephone number associated with each called line. Other examples of Centrex service features include call transfer between users at different stations of a business group and a number of varieties of call forwarding. Thus, Centrex adds a bundle of business features on top of standard telephone line features without requiring special equipment, e.g., a private branch exchange (PBX), at the customer's premises. U.S. Pat. No. 5,247,571, which is hereby expressly incorporated by reference, describes in detail a Wide Area Centrex system implemented using AIN techniques.
Voice dialing is a useful service which has been implemented in some known systems by having control logic in a telephone switch connect a caller to a voice dialing IP which provides the voice dialing service. The telephone switch may couple the subscriber to the voice dialing IP as a result of the subscriber calling a telephone number corresponding to the IP or entering a code which is detected by the switch. Such known voice dialing systems are not AIN based and therefore are somewhat limited in terms of the logic and information available for controlling connections to voice dialing apparatus, e.g., voice dialing IPs. Thus, the known techniques of using logic embedded in a switch to determine when and to which voice dialing IP a caller should be connected can lead to inefficient use of voice dialing IP resources and limit the ability of known voice dialing services to be implemented as an integral part of other services, e.g., Centrex services.
Voice dialing is a particularly desirable service since it eliminates the requirement that a user of the voice dialing service remember the telephone number of the party being called. In various known voice dialing systems, speaker dependent speech recognition is used to identify spoken names. In such systems, a personal dialing directory is maintained for each subscriber of the voice dialing system. The personal dialing directory is a database which includes a speaker dependent speech recognition template for each of a plurality of names which may be spoken and a telephone number for each name. When used, the data, e.g., templates in a subscriber's directory, are retrieved; a speaker dependent speech recognition operation is performed on a spoken name provided by the subscriber; and then, assuming a name is recognized, the call is completed to the telephone number in the subscriber's personal dialing directory corresponding to the name. An IP may be used for performing the speech recognition and various other tasks associated with the known voice dialing operation. One particular system for implementing voice dialing is described in detail in U.S. Pat. No. 5,832,063.
While voice dialing systems which use speaker dependent speech recognition to identify spoken names enjoy a high degree of recognition accuracy, they have the disadvantage of requiring that a user of the system provide one or more utterances of each name for which speaker dependent speech recognition templates are to be generated. Thus, a voice signal, e.g., voice telephone connection, is normally required when adding or updating names in a personal dialing directory. In addition, the need to provide multiple utterances of each name in the personal dialing directory can prove irritating to some customers.
In an attempt to make services provided using AIN techniques easier to manage, management of AIN services, e.g., call forwarding, via a personal computer and the Internet has been suggested. U.S. Pat. No. 5,958,016, which is hereby expressly incorporated by reference, describes a system wherein a web page type interface is provided which allows a subscriber access to control and reporting functionalities of an AIN system via the Internet.
Unfortunately, the use of speaker dependent speech recognition templates, with the corresponding need for multiple speech samples to train each name in a personal dialing directory, has made Internet based management of existing voice dialing systems of the type described above difficult to implement.
While existing Centrex and voice dialing services are useful, it is desirable that such services continue to be improved and enhanced. With regard to voice dialing, it is desirable that new methods and apparatus be devised which would allow for Internet based management of voice dialing services. It is also desirable that new methods of providing voice dialing services be devised which will allow voice dialing services to be implemented as part of AIN based service packages such as Centrex. With regard to Centrex, it is desirable that Centrex service be enhanced to support voice dialing functionality as well as Internet based management of said functionality.
The present invention is directed to methods and apparatus which can be used to provide voice dialing, enhanced Centrex, and other communications services. In various embodiments, the communications services of the present invention are implemented to facilitate easy management of the services by end users via the Internet. In accordance with one feature of the present invention, voice dialing is implemented as an AIN based service. By implementing voice dialing as an AIN based service, the service can be easily integrated with Centrex and/or other AIN based services. In addition, since control logic in an ISCP is used to control access to voice dialing hardware in response to activation of one or more AIN triggers set at telephone switches, access to voice dialing hardware can be controlled to provide efficient use of the voice dialing hardware regardless of the location from which a voice dialing service subscriber calls. Furthermore, ISCP logic can translate multiple identifiers, e.g., various telephone numbers, associated with a subscriber, into a single user identifier which can be used by voice dialing hardware for subscriber information retrieval purposes.
In accordance with an exemplary voice dialing embodiment of the present invention, each subscriber is able to maintain and update, via the Internet, a voice dialing record used to provide voice dialing services. The subscriber's voice dialing record, sometimes referred to as a voice dialing directory, may include such information as a user identifier, e.g., a Centrex telephone number associated with the subscriber, a mobile telephone number associated with the subscriber, an additional telephone number used by the subscriber, information on one or more individuals or parties to be called and, optionally, information identifying a corporate voice dialing record to be used if a spoken name or nickname is not found using speech recognition models included in the subscriber's voice dialing record.
In accordance with the present invention, a voice dialing call may be placed to an individual or party by speaking either the name or a nickname of the individual or party being called. If desired, a location associated with the party or individual may also be specified in addition to the party's or individual's name. When a location is stated in conjunction with a name, the call will be placed to the party or individual at the particular specified location.
To support such voice dialing functionality, a voice dialing record is maintained for each subscriber. Information relating to each party or individual who may be called is stored within the subscriber's voice dialing record in the form of a calling entry. The subscriber is responsible for providing and updating the information included in each calling entry. A calling entry normally includes the name of the party or individual who may be called, an optional nickname for the party, and a first telephone number associated with the named party or individual. Optionally, one or more additional phone numbers may be associated in the calling entry with the named party or individual. When multiple telephone numbers are associated with a named party or individual, a telephone number identifier, e.g., the name of a location, is normally associated in the calling entry with each telephone number. Examples of telephone number identifiers which may be associated with a telephone number include: Home, Office, Office 2, Mobile, Secretary, etc. In some embodiments the subscriber is allowed to define telephone number identifiers. For example, the subscriber may associate the identifier "girl friend 2" with the telephone number of a second girl friend.
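By way of illustration only, a voice dialing record and its calling entries might be represented as in the following sketch. The class and field names are hypothetical, and matching a spoken name by case-insensitive text comparison is an assumption that stands in for the speech recognition step.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CallingEntry:
    """One party the subscriber may call by voice (hypothetical layout)."""
    name: str
    default_number: str                 # the "first" telephone number
    nickname: Optional[str] = None
    # Additional numbers keyed by telephone number identifier, e.g. "Home".
    numbers_by_identifier: Dict[str, str] = field(default_factory=dict)


@dataclass
class VoiceDialingRecord:
    """A subscriber's voice dialing record (directory)."""
    user_id: str                        # e.g. the Centrex telephone number
    entries: List[CallingEntry] = field(default_factory=list)

    def find(self, spoken: str) -> Optional[CallingEntry]:
        """Return the entry whose name or nickname matches, ignoring case."""
        key = spoken.strip().lower()
        for entry in self.entries:
            if entry.name.lower() == key:
                return entry
            if entry.nickname and entry.nickname.lower() == key:
                return entry
        return None
```

In such a layout, subscriber defined identifiers are simply additional keys in `numbers_by_identifier`.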
A subscriber's voice dialing record is stored in part of the public telephone system in accordance with the present invention, e.g., in an intelligent peripheral device, e.g., a voice dialing (VD) IP, coupled to a central office telephone switch. In addition to being coupled to the telephone network via the central office switch, the VD IP is coupled to the Internet via one or more servers. Various security measures are taken, including the use of a PIN, to ensure that unauthorized individuals, e.g., non-subscribers, are not given access, via the Internet, to subscribers' voice dialing records. Alternatively, a customized desktop application, as opposed to a Web browser, can be used to access and update a subscriber's voice dialing record information via the Internet. Given that the VD IP is coupled to both the telephone network and the Internet, two methods of accessing a subscriber's records are possible.
To provide a subscriber with the greatest flexibility with regard to maintaining and updating his/her voice dialing directory, a subscriber is provided the opportunity to update the subscriber's voice dialing directory by telephone using voice/DTMF input and, alternatively, through an Internet connection. A Web browser, such as Internet Explorer, operating on a subscriber's computer can be used to display voice dialing record information and to provide updated information, e.g., in the form of text input, to the VD IP in which the subscriber's voice dialing record is stored.
In addition to the calling record, e.g., calling entry information discussed above, which is provided by the subscriber, each subscriber's calling record normally further comprises a speech recognition model for each name and nickname included in the subscriber's record. The use of speaker dependent speech recognition models for the recognition of one or more names included in a voice dialing record is contemplated and possible. However, in order to implement a voice dialing system that is easy to update and maintain via text obtained from the Internet, speaker independent speech recognition models are used for most, if not all, names and nicknames included in a subscriber's voice dialing record. The speech recognition models for names entered via the Internet are generated from the text of the name using speaker independent modeling techniques. This generally involves performing a text to phoneme conversion operation. It also involves generating a speaker independent speech recognition model from the phonemes produced from the text of the name or nickname being processed.
Speaker independent speech recognition models produced from text work well for most names. However, in some cases where the pronunciation of a name is difficult to predict from its spelling, e.g., due to a variety of possible pronunciations or because it originates from a foreign language name, a speaker independent speech recognition model produced from text may provide recognition results which are less than satisfactory.
In order to address this potential problem, the methods and apparatus of the present invention allow for a subscriber to provide one or more speech samples of a name to be used for generating a speech recognition model for voice dialing purposes. The speech corresponding to a name may be provided as part of a telephone based voice dialing entry creation or updating process. In the case where an existing entry is being updated using a spoken version of a name, a speaker independent (or, optionally speaker dependent) speech recognition model is generated from the speech corresponding to the name. The generated speech recognition model is stored in place of a speech recognition model previously generated from text when such a text based model exists.
In the case where speech corresponding to a name is being used to create a voice dialing entry, as opposed to update an existing entry, the speech corresponding to the name is used to generate a speech recognition model. A speech recognition operation is also performed on the speech and a text version of the spoken name is generated and used to populate the (text) name portion of the calling entry corresponding to the spoken name. The text name entry is displayed to a user accessing the calling entry information via the Internet. A user can edit, via the Internet, the spelling of the name in the entry if desired without affecting the speech recognition model generated from the supplied speech.
To distinguish between speech recognition models generated from speech and those generated from text, model type information, e.g., identifying whether the model was generated from text or speech, may be stored in conjunction with the speech recognition models included in a voice dialing directory. In one particular embodiment, altering of the text version of a name normally does not affect the corresponding speech model when the model was generated from speech. However, in such an embodiment altering of the text version of a name does cause updating of the speech model when the model is generated from text.
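The update rule of this particular embodiment can be sketched as follows. The names used are hypothetical, and the `text_to_phonemes` function is a deliberately trivial placeholder (one pseudo-phoneme per letter) rather than a real text to phoneme converter; only the branching on the stored model type is intended to reflect the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ModelSource(Enum):
    """Stored model type information."""
    TEXT = "text"
    SPEECH = "speech"


@dataclass
class NameModel:
    text: str              # text version of the name shown to the subscriber
    phonemes: List[str]    # stand-in for the speech recognition model
    source: ModelSource


def text_to_phonemes(text: str) -> List[str]:
    """Placeholder only: one pseudo-phoneme per letter."""
    return [ch for ch in text.lower() if ch.isalpha()]


def rename(model: NameModel, new_text: str) -> NameModel:
    """Apply a text edit; rebuild the model only if it was generated from text."""
    if model.source is ModelSource.TEXT:
        return NameModel(new_text, text_to_phonemes(new_text), ModelSource.TEXT)
    # Model was trained from speech: keep it and change only the spelling.
    return NameModel(new_text, model.phonemes, ModelSource.SPEECH)
```

Storing the model type alongside the model is what makes the two cases distinguishable at edit time.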
As a result of the above discussed model generation processes, a subscriber's voice dialing record may include name speech recognition models generated from text as well as ones generated from speech samples.
The information in a subscriber's voice dialing record is used by the voice dialing IP to initiate calls in response to a subscriber's speech. As part of a voice dialing operation, information from a subscriber's voice dialing record is retrieved from memory using a user, e.g., subscriber, ID to identify the subscriber record. In one embodiment, the subscriber's Centrex telephone number is used as the user ID. Thus, in cases where the subscriber attempts to place a call using speech from his/her Centrex phone line, the subscriber's ID can be obtained using automatic number identification (ANI) techniques.
In order to allow a subscriber to call from any one of a plurality of telephones, various procedures for identifying the caller and providing the voice dialing IP with the subscriber's ID are supported. Accordingly, a voice dialing service subscriber using a mobile phone, a phone associated in the customer record with the subscriber, or another phone, can be coupled to the voice dialing IP where his/her voice dialing record is accessible and obtain a voice dialing service therefrom. In most cases where the subscriber initiates a voice dialing call from a phone line other than his/her Centrex line, the telephone number from which the person is calling is identified and cross-referenced with stored information associating the identified phone number, e.g., mobile number, with a Centrex telephone number. In cases where the subscriber is calling from a number for which there is no stored information associating the phone number with a Centrex phone number, the caller is requested to supply his/her Centrex telephone number. An integrated service control point (ISCP), accessed in response to activation of an AIN trigger, may be used for associating telephone numbers from which a subscriber is calling with his/her Centrex number, which is used as the user ID supplied to the voice dialing IP.
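The translation of a calling number into a user ID can be sketched as below. The function name and the use of an in-memory set and dictionary are assumptions for illustration; in the described system this association may be performed by ISCP logic. A return value of `None` represents the case in which the caller must be asked to supply his/her Centrex telephone number.

```python
from typing import Dict, Optional, Set


def resolve_user_id(calling_number: str,
                    centrex_numbers: Set[str],
                    associated_numbers: Dict[str, str]) -> Optional[str]:
    """Map the number a subscriber is calling from to a user ID.

    centrex_numbers: Centrex lines of known subscribers; for these the
        calling number obtained via ANI is itself the user ID.
    associated_numbers: other numbers (e.g. mobile) mapped to the
        Centrex number of the subscriber they belong to.
    Returns None when no association is stored.
    """
    if calling_number in centrex_numbers:
        return calling_number
    return associated_numbers.get(calling_number)
```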
The voice dialing IP receives from the subscriber speech which is to be used to place a voice dialing call. The speech may include the name or nickname of a person to be called. It can also include a location associated with the name or nickname of the person or party to be called.
Upon recognizing a name in received speech, the voice dialing IP plays the voice dialing subscriber a confirmation message such as, e.g., "Dialing John Smith" where "John Smith" is the recognized name. This provides the subscriber the opportunity to stop the call in the event a recognition error occurred, e.g., by hanging up or otherwise signaling the voice dialing IP. A voice recording of the recognized name may be played to the subscriber as part of the confirmation message, e.g., a recording of the subscriber's voice obtained at the time the subscriber trained a speech recognition model in his/her voice dialing directory using speech. However, in accordance with one feature of the present invention, audio corresponding to a recognized name is generated for confirmation message purposes from a stored text version of the name. In such an embodiment, a text to speech circuit is used to generate an audio version of a name from a stored text version when an audio version of the name is needed for a confirmation message. Since storage of text is much more efficient than the storage of voice recordings, the use of text versions of names for confirmation message purposes can be considerably more efficient from a memory perspective than the use of audio recordings.
In response to recognizing a name or nickname but not a location in speech, the voice dialing IP provides to the telephone switch to which it is coupled the default, e.g., first, telephone number in the subscriber's voice dialing record corresponding to the identified name or nickname. If a location is recognized in the received speech, in addition to a name or nickname, the telephone number corresponding to the name or nickname and the detected location is supplied to the telephone switch. After supplying the called party telephone number information to the switch and playing the subscriber an audio confirmation message including at least a portion of the recognized name, the audio connection between the voice dialing IP and the caller is terminated, freeing the voice dialing IP resources to be used to service other subscribers. The telephone number supplied by the VD IP to the telephone switch is used by the switch to place a call to the specified number.
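The number selection rule described above can be sketched as follows. The function name is hypothetical, and a calling entry is reduced here to an ordered mapping from telephone number identifier to number. Falling back to the default number when a stated location has no matching identifier is an assumption; the text above does not specify that case.

```python
from typing import Dict, Optional


def select_number(numbers: Dict[str, str],
                  location: Optional[str] = None) -> str:
    """Pick the number to hand to the switch for a recognized name.

    numbers: identifier -> telephone number, in stored order; the first
        item is treated as the default.
    location: recognized location, or None if only a name was recognized.
    """
    if not numbers:
        raise ValueError("calling entry has no telephone numbers")
    if location is not None:
        wanted = location.strip().lower()
        for identifier, number in numbers.items():
            if identifier.lower() == wanted:
                return number
    # No location recognized (or, by assumption, no match): use the default.
    return next(iter(numbers.values()))
```

For example, with an entry holding Office, Home and Mobile numbers in that order, a spoken name alone would yield the Office number, while the name followed by "home" would yield the Home number.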
In accordance with one embodiment of the present invention, the results of a voice dialing operation are monitored to detect various potential call outcomes. That is, a determination is made as to whether or not a busy signal is encountered, the called party does not answer, or if the call is successfully completed to the destination number (e.g. whether the called party answers the call).
If a no answer or busy signal condition is detected, a service control point (SCP) associated with the voice dialing subscriber is contacted for call processing instructions. In various embodiments, the SCP is implemented as an integrated service control point (ISCP). The contacted SCP then causes the subscriber to be reconnected to the VD IP. The VD IP informs the caller via an audio message of the call outcome and offers the caller the opportunity to place another voice dialing call if desired.
Thus, the present invention provides a voice dialing subscriber the opportunity to place multiple sequential voice dialing calls without having to hang up when a busy signal or no answer condition is encountered. Because the voice dialing IP is disconnected from the caller between voice dialing call attempts, voice dialing and speech recognition resources are used in an efficient manner and are not tied up during the time required to detect the outcome of a call.
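The handling of monitored call outcomes can be sketched as a simple decision function. The enumeration, function name, result keys and announcement wording are all hypothetical; the sketch only reflects that busy and no answer conditions lead to reconnection to the VD IP with an audio message, while an answered call does not.

```python
from enum import Enum
from typing import Dict, Optional, Union


class CallOutcome(Enum):
    ANSWERED = "answered"
    BUSY = "busy"
    NO_ANSWER = "no answer"


def handle_outcome(outcome: CallOutcome,
                   called_name: str) -> Dict[str, Union[bool, Optional[str]]]:
    """Decide whether the caller is reconnected to the VD IP.

    Returns a dict with:
      reconnect_to_vd_ip: True for busy or no answer conditions
      announcement: message for the VD IP to play, or None
    """
    if outcome is CallOutcome.ANSWERED:
        # Call completed; the VD IP stays out of the call.
        return {"reconnect_to_vd_ip": False, "announcement": None}
    if outcome is CallOutcome.BUSY:
        status = f"The line for {called_name} is busy."
    else:
        status = f"{called_name} did not answer."
    return {
        "reconnect_to_vd_ip": True,
        "announcement": status + " Say another name to place a new call.",
    }
```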
Various additional features and advantages of the present invention will be apparent from the detailed description which follows.