In telephony applications, text-to-speech (TTS) systems may be utilized in the production of speech output as part of an automatic dialog system. Typically during a call session, TTS systems first transcribe the words communicated by a caller through a speech recognition engine. A natural language understanding (NLU) unit in communication with the speech recognition engine is used to uncover the meanings behind the caller's words. These meanings may then be interpreted to determine the caller's requested information. This requested information may be retrieved from a database by a dialog manager. The retrieved information is passed to a natural language generation (NLG) block which forms a message for responding to the caller. The message is then spoken by a speech synthesis system to the caller.
A TTS system may be utilized in many current real world applications as a part of an automatic dialog system. For example, a caller to an air travel system may communicate with a TTS system to receive air travel information, such as reservations, confirmations, schedules, etc., in the form of TTS generated speech. To date, the quality of TTS systems has been at such a level that it has been clear to the caller that communication was taking place with an automated system or machine. As TTS systems improve, however, callers may become more likely to believe that they are communicating with a human, or callers may have some doubt as to whether a response during communication came from an automated system. Therefore, due to such confusion concerns, it would be beneficial for callers to be informed about whether they are requesting and receiving information from a machine or a human operator.
Using the technology presently available in TTS systems, the only way to convey information regarding the nature of the communication is to explicitly identify the machine as such during the conversation, preferably at the beginning. For example, the TTS system may provide a message such as “welcome to the automated answering assistant,” or “this is not a human.” While these messages may be enough to avoid confusion in some situations, the caller may not pay attention to the message, forget about the message later in the call, or not understand a more subtle message.