The relatively recent development of new and expanded telecommunication services has provided subscribers increased flexibility in the selection and use of the various features that have become available. These services are amenable to being tailored to specific requirements of the subscriber.
So-called "flash hook services," such as Call Waiting, Tone Block, 3-Way Calling, Call Transfer and Consultation Hold, are implemented using appropriate switch hook depression by the user. Other services, such as Return Call, Answer Call, Repeat Call, Priority Call, Call Trace, Per Call Blocking, Intercom Extra, Home Intercom, Speed Calling and Call Block, require various combinations of keyed DTMF inputs by the subscriber. Such inputs may, for example, be codes that combine a special key such as the * key with a preset number sequence, or may involve dialing a system telephone number followed by further keyed input for purposes of identification or choice of options.
Flash hook services, while offering a wide range of communication customization, are particularly complex. Traditionally, depression of the switch hook disconnects a call. With the newer flash hook services, however, momentary switch hook depression produces different results, such as connecting new callers. Without prompts and feedback from the system, inexperienced users tend to lose track of where they are in the sequence while adding, transferring or dropping a caller in a multiple call, flash hook operation. Such users have little confidence that they can complete the service and often expect to lose the connection to the other party.
In addition to such disadvantages, the user of flash hook operations must be cognizant of the correct DTMF key combinations for each of the various services, the appropriate sequences for inputting key combinations in various complex services, and the appropriate responses from the communications network that either signal the next step in the process or verify completion of the process. A burden is placed on the subscriber to remember the appropriate activation and deactivation codes for the various subscribed services. Flash hook operations can be not only complex but time consuming.
These drawbacks extend to preprogrammable functions included in a subscriber's telephone equipment as well as those provided by the telecommunications system. For example, a user may set up speed dial features when the equipment is first obtained but abandon them later because the entry process is complex and the instructions are no longer at hand. As a result, speed dial keys may not be fully populated or may include obsolete entries.
More recently, Common Channel Signaling has been utilized advantageously by the Advanced Intelligent Network (AIN) of the public switched telephone system to predefine services according to the subscriber's requirements and to implement such services for applicable calls. A description of an AIN implementation may be found, for example, in U.S. Pat. No. 5,247,571 to Kay et al. Each central office of a network of interconnected central offices is connected to a number of local telephone lines constituting a specified group. Call routing is carried out in accord with data stored in the AIN database and with customer specified parameters, such as calling/called party number, time-of-day, day of the week, authorization codes, etc. After the central office switching system detects an off-hook, it determines whether or not the call originates from a subscribing line. If not, the system receives dialed digits and executes normal call processing routines. If the call is from a subscriber line, the originating office receives dialed digits, suspends the call and sends a query message to the Integrated Service Control Point (ISCP) through the Signaling Transfer Points (STPs). This query message, in Transaction Capabilities Applications Protocol (TCAP) format, identifies the calling station and the digits dialed as well as other pertinent information. Based on the calling party's address, the ISCP retrieves from its database a table of trunk group routing information. The ISCP formulates a response message, again in TCAP format, including the routing information, and transmits the response message back to the originating central office via the STP(s). The system then executes normal call processing routines for completing the call using the routing information provided by the ISCP.
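The call handling sequence just described can be illustrated with a minimal sketch. It is not taken from the Kay et al. patent: the class and function names (TcapQuery, TcapResponse, Iscp, process_call), the "default" table entry and the trunk group labels are all hypothetical, the STP relay is omitted, and the messages are plain Python objects rather than real TCAP encodings.

```python
from dataclasses import dataclass


@dataclass
class TcapQuery:
    """Hypothetical stand-in for the TCAP query sent by the originating office."""
    calling_number: str
    dialed_digits: str


@dataclass
class TcapResponse:
    """Hypothetical stand-in for the TCAP response carrying routing information."""
    trunk_group: str


class Iscp:
    """Toy ISCP: maps a calling number to a table of trunk group routes."""

    def __init__(self, routing_tables):
        # routing_tables: calling number -> {dialed digits -> trunk group}
        # The "default" key is an assumption made for this sketch.
        self.routing_tables = routing_tables

    def handle_query(self, query):
        # Retrieve the routing table based on the calling party's address.
        table = self.routing_tables[query.calling_number]
        trunk = table.get(query.dialed_digits, table["default"])
        return TcapResponse(trunk_group=trunk)


def process_call(calling_number, dialed_digits, subscribers, iscp):
    """Return (processing mode, trunk group) for one off-hook and dial event."""
    if calling_number not in subscribers:
        # Non-subscribing line: normal call processing, no query is sent.
        return ("normal", None)
    # Subscriber line: suspend the call and query the ISCP.
    response = iscp.handle_query(TcapQuery(calling_number, dialed_digits))
    # Resume call processing with the routing information received.
    return ("ain", response.trunk_group)
```

In this sketch a call from a line outside the subscriber set never reaches the ISCP, which mirrors the branch taken after off-hook detection in the description above.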
The use of AIN reduces the number of DTMF entries that a subscriber must input, because much of the information needed for providing the service has been stored in the AIN database. For those services that require a significant amount of caller input, interactive voice menus are used to prompt callers in a user friendly manner. Nevertheless, inherent drawbacks exist in situations in which the caller must provide DTMF input. The subscriber often finds it difficult to remember the proper DTMF representations of the required input, such as a multiplicity of telephone numbers and codes, and may be inconvenienced by the time and steps necessary to follow a menu driven procedure in order to complete the desired service.
The use of speech recognition is an attractive approach to alleviate such annoyances. As the development of commercially available speech recognition systems has progressed, voice responsive features have been provided in telephone services. Prior examples of telephone devices that are responsive to caller voice input to dial a call to a corresponding destination are U.S. Pat. No. 4,928,302, issued to Kaneuchi et al., and U.S. Pat. No. 4,961,211, issued to Marui et al. The Marui et al. device is a mobile telephone apparatus that makes an outgoing call in response to the caller speaking a number that corresponds to the destination telephone number. The telephone number is read out from stored telephone numbers and is then dialed. When a number has been identified, it is synthesized and displayed so that the user can determine if it is the correct number. In the Kaneuchi et al. device, standard patterns are associated with registered telephone numbers.
U.S. Pat. No. 5,165,095, issued to Borcherding, and U.S. Pat. No. 5,369,685, issued to Kero, disclose voice activated dialing systems in which remote databases are referenced. In the Borcherding arrangement, a local database contains speaker independent voice recognition templates for various command functions, and a remote database stores speaker dependent templates. The latter templates represent phrases associated with destination telephone numbers. When a caller speaks a dial command, the local database is accessed and its speaker independent templates are compared with the spoken command so that dialing instructions can be recognized and executed. The caller is identified, and the speaker dependent templates for the identified caller are downloaded from the remote database. A spoken destination identifier is then compared with the speaker dependent templates and, when a match is found, the destination telephone number is dialed.
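The two-stage flow attributed to Borcherding above can be sketched as follows. This is only an illustration under stated assumptions: templates are modeled as short numeric feature vectors, matching is a nearest-neighbor Euclidean distance with an arbitrary threshold, and the names best_match and voice_dial are hypothetical. The patent's actual recognition method is not described in this text and is not reproduced here.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(utterance, templates, threshold=1.0):
    """Return the label of the closest template within threshold, else None."""
    best_label, best_dist = None, threshold
    for label, template in templates.items():
        d = distance(utterance, template)
        if d <= best_dist:
            best_label, best_dist = label, d
    return best_label


def voice_dial(command_utt, dest_utt, caller_id, local_db, remote_db):
    """Return the destination number to dial, or None if recognition fails.

    local_db:  command label -> speaker independent template
    remote_db: caller id -> {destination number -> speaker dependent template}
    """
    # Stage 1: recognize the command against the local, speaker independent set.
    if best_match(command_utt, local_db) != "dial":
        return None
    # Stage 2: "download" the identified caller's templates from the remote store.
    caller_templates = remote_db.get(caller_id, {})
    # Compare the spoken destination identifier with those templates.
    return best_match(dest_utt, caller_templates)
```

Keying the speaker dependent templates by destination number is a simplification for the sketch; it lets a successful match return the number directly.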
In the Kero arrangement, a voice activated telephone directory and call placement system accessible over a telecommunications network allows a caller to store a personalized telephone directory and to retrieve selected directory listings therefrom by speaking a series of voice entries. A plurality of subdirectories are formed to complete the listings. A caller-spoken entry received over the network is compared with a previously stored voice template of the caller speaking the name of a subdirectory that is included as part of the caller's personalized telephone directory. If a match with a subdirectory name template is made, a subsequent caller-spoken entry received over the network is compared with voice templates of the listings in that subdirectory. If a match is found, the system retrieves the destination telephone number associated with the directory listing and the call may then be completed. Each subdirectory may include subordinate levels of subsidiary directories, each having a plurality of listings.
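The hierarchical lookup described for the Kero arrangement can be pictured with a small sketch. It assumes a nested dictionary as the personalized directory and uses normalized text equality as a stand-in for voice template matching; the function names recognize and lookup and the sample entries are hypothetical and do not come from the patent.

```python
def recognize(spoken, candidates):
    """Stand-in for template matching: case-insensitive text equality.

    Returns the matching candidate key, or None when nothing matches.
    """
    key = spoken.strip().lower()
    return key if key in candidates else None


def lookup(directory, spoken_entries):
    """Walk the directory one spoken entry at a time.

    directory: nested dict whose leaves are destination numbers (strings).
    Returns the destination number, or None if any entry fails to match
    or the series of entries does not end on a listing.
    """
    node = directory
    for spoken in spoken_entries:
        if not isinstance(node, dict):
            return None  # more spoken entries than directory levels
        match = recognize(spoken, node)
        if match is None:
            return None  # no template matched at this level
        node = node[match]
    # Only a listing (leaf) yields a number; a subdirectory does not.
    return node if isinstance(node, str) else None
```

For example, with a directory such as {"family": {"mom": "5550101"}}, the series of entries "Family", "Mom" would resolve to that listing's number, while "Family" alone would not complete a call.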
Speech responsive dialing systems such as those of the prior art exemplified above have inherent limitations. The large storage required for templates of either speaker dependent recognition or speaker independent recognition vocabularies is a restrictive factor as the number of users and the vocabulary size increase.
Development of speaker independent templates involves, for each vocabulary word, the input from many diverse speakers in order to provide reliably accurate recognition. Such templates occupy a large volume of storage. As recognition must accommodate speakers of different accents, inflections, and pronunciation, the size of the word vocabulary must be limited to avoid confusion among similar words. A small number of words may be recognized with confidence, while a large number would give an unacceptably erratic response. In addition, provision must be made in the system to distinguish between use by different callers of the same word, for example "mom," for different destinations.
Speaker dependent recognition requires developing templates for each user. While these templates individually would occupy less storage volume than speaker independent templates for corresponding words, templates must be trained and stored for each word to be used by each user. Users in the same household who would use the same vocabulary word for the same destination number nevertheless would be required to go through a template training process. Moreover, in order to access the appropriate templates, provision must be made in the system for identifying the particular user.