1. Field of the Invention
The present invention relates to the field of automated speech systems and, more particularly, to switching between modalities in a speech application environment extended for interactive text exchanges.
2. Description of the Related Art
Interactive Voice Response (IVR) systems are often used to provide automated customer service via a voice channel of a communication network. IVR systems permit routine customer requests to be quickly, efficiently, and automatically handled. When a request is non-routine or when a caller has difficulty with the IVR system, a transfer can be made from the IVR system to a customer service representative. Even when human interactions are needed, the IVR system can obtain necessary preliminary information, such as an account number and a reason for a call, which can ensure that callers are routed to an appropriate human agent and that human-to-human interactive time is minimized. Successful use of IVR systems allows call centers to be minimally manned while customers are provided a high level of service with relatively short periods spent in waiting queues.
IVR systems, especially robust ones having natural language understanding (NLU) capabilities and/or large context-free grammars, represent a huge financial and technological investment. This investment includes costs for purchasing and maintaining IVR infrastructure hardware, IVR infrastructure software, and voice applications executing upon this infrastructure. An additional and significant recurring cost can relate to maintaining a sufficient number of voice-quality channels to handle anticipated call volume. Further, each of these channels consumes an available port of a voice server, which has a limited number of costly ports. Each channel also consumes a quantity of bandwidth needed for establishing a voice-quality channel between a caller and the IVR system.
One innovative solution for extending an IVR infrastructure to permit text-based interactive services is detailed in co-pending patent application Ser. No. 11/612,996 entitled “Using an Automated Speech Application Environment to Automatically Provide Text-Based Interactive Services.” More specifically, the co-pending application teaches that a chat robot object, referred to as a Chatbot, can dynamically convert text received from a text exchange client to input consumable by a voice server and can dynamically convert output from the voice server to text appropriately formatted for the client. From a perspective of the voice server, the text-based interactions with the text exchange client are handled in the same manner and with the same hardware/software that is used to handle voice-based interactions. The co-pending solution allows for a possibility of switching between modalities, without interrupting a pre-existing communication session, which is the subject matter of this application.
It should be appreciated that conventional solutions for providing voice and text exchange services implement each service in a separate and distinct server. Each of these servers would include server-specific applications tailored for a particular modality. For example, a VoiceXML-based application controlling voice-based interactions can execute on a speech server and a different XML-based application controlling text-based interactions can execute on a text exchange server.
Any attempt to shift from a text session to a voice session or vice-versa would require two distinct servers, applications, and communication sessions to be synchronized with each other. For example, if a voice session were to be switched to a text session, a new text session would have to be initiated between a user and a text exchange server. The text exchange server would have to initiate an instance of a text exchange application for the session. Then, state information concerning the voice session would have to be relayed to the text exchange server and/or the text exchange application. Finally, the speech application executing in the speech server would need to be exited and the original voice session between the speech server and a user terminated.
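By way of illustration only, the conventional switching sequence described above can be sketched as follows. The sketch is a simplified model and not an implementation of any actual speech server or text exchange server; the names Session, Server, open_session, close_session, and switch_voice_to_text are hypothetical and are introduced solely for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical record of one communication session."""
    user: str
    modality: str
    state: dict = field(default_factory=dict)
    active: bool = True


class Server:
    """Hypothetical single-modality server (speech or text exchange)."""

    def __init__(self, modality: str) -> None:
        self.modality = modality
        self.sessions: dict[str, Session] = {}

    def open_session(self, user: str) -> Session:
        # Opening a session stands in for initiating the session and
        # starting a modality-specific application instance for it.
        session = Session(user=user, modality=self.modality)
        self.sessions[user] = session
        return session

    def close_session(self, user: str) -> None:
        # Closing stands in for exiting the application and
        # terminating the session with the user.
        session = self.sessions.pop(user)
        session.active = False


def switch_voice_to_text(user: str, speech_server: Server,
                         text_server: Server) -> Session:
    """Model the conventional voice-to-text switch across two servers."""
    voice_session = speech_server.sessions[user]
    # Initiate a new text session and its text exchange application.
    text_session = text_server.open_session(user)
    # Relay state information concerning the voice session.
    text_session.state = dict(voice_session.state)
    # Exit the speech application and terminate the voice session.
    speech_server.close_session(user)
    return text_session


if __name__ == "__main__":
    speech = Server("voice")
    text = Server("text")
    voice = speech.open_session("caller-1")
    voice.state["reason_for_call"] = "billing"
    switched = switch_voice_to_text("caller-1", speech, text)
    print(switched.modality, switched.state, voice.active)
```

Even in this reduced form, the switch touches two servers and two sessions, and the relayed state must be kept consistent while both sessions exist.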
No known system or set of systems provides a dynamic modality switching capability within a single communication session that would permit switching from a text exchange modality to a voice modality and vice-versa. Further, no known teachings exist concerning even the desirability of dynamically switching between a text exchange modality and a voice modality during an automated communication session, possibly due to complications assumed to be inherent in such a capability.