In today's information age, a user may use one or more devices to retrieve desired information. For example, a user may use a cellular phone, personal digital assistant, laptop, etc., to retrieve information from a content provider over one or more networks. These devices may include multimodal applications that receive and transmit information through one or more channels (i.e., paths on which signals may flow). For example, an application may transmit voice on one channel, text on another channel, and images on yet another channel. Different requests may also travel on different channels. Alternatively, an application may transmit both voice and data on the same channel.
Regardless of the number of channels, these multimodal applications illustrate the need for coordinated input from one or more devices. For example, a user may input part of a single request through a first modality, and the rest through a second modality. Without a system to recognize that these request fragments contribute to a single request (i.e., when the input is not coordinated), the request cannot be serviced correctly. Therefore, regardless of whether the fragments received are fragments of a request, fragments of a response, fragments from a “front-end,” or fragments from a “back-end,” coordinating the fragments into a single context allows the data composed from the fragments to be serviced properly.
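As an illustration only, the coordination described above might be sketched as follows. The sketch assumes that each fragment carries a shared context identifier, its position within the request, and the total number of fragments; the names `Fragment`, `FragmentCoordinator`, `context_id`, `index`, and `total` are hypothetical and are not taken from any particular system.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Fragment:
    context_id: str  # hypothetical key tying fragments to one request
    index: int       # position of this fragment within the request
    total: int       # number of fragments the request was split into
    modality: str    # e.g. "voice" or "text"
    payload: str


@dataclass
class FragmentCoordinator:
    # Fragments received so far, grouped by context and keyed by position.
    _pending: Dict[str, Dict[int, Fragment]] = field(default_factory=dict)

    def receive(self, fragment: Fragment) -> Optional[str]:
        """Store a fragment; return the assembled request once complete."""
        parts = self._pending.setdefault(fragment.context_id, {})
        parts[fragment.index] = fragment
        if len(parts) < fragment.total:
            return None  # still waiting on other modalities or channels
        del self._pending[fragment.context_id]
        # Join payloads in positional order, whatever the arrival order.
        return " ".join(parts[i].payload for i in sorted(parts))


coordinator = FragmentCoordinator()
first = coordinator.receive(Fragment("req-1", 0, 2, "voice", "show flights to"))
second = coordinator.receive(Fragment("req-1", 1, 2, "text", "Boston"))
```

In this sketch the first call returns `None` because the request is still incomplete, and the second call returns the single coordinated request composed from the voice and text fragments.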