An emerging area of technology involving terminal devices, such as handheld devices, mobile phones, laptops, PDAs, internet appliances, desktop computers, or other suitable devices, is the application of multi-modal interaction for access to information and services. Typically resident on the terminal device is at least one browser, wherein the browser is a program which allows the user to enter fetch requests, receive fetched information, and navigate through content servers via internal, e.g. intranet, or external, e.g. internet, connections, and which presents information to the user. The browser may be a graphical browser, voice browser, JAVA® based application, software program application, or any other suitable browser as recognized by one of ordinary skill in the art.
Multi-modal technology allows a user to access information, such as voice, data, encryption, video, audio or other information, and services such as email, weather updates, bank transactions, and news through one or more browsers. More specifically, the user may submit an information fetch request in one or more modalities, such as speaking a fetch request into a microphone, and the user may then receive the fetched information in either the first mode or a second mode, such as viewing the information on a display screen. Within the terminal device, the browser works in a manner similar to a standard web browser or program application, such as NETSCAPE NAVIGATOR®, resident on a computer connected to a network. The browser receives an information fetch request, typically in the form of a uniform resource identifier (URI), a bookmark, touch entry, key-entry, voice command, etc. The browser then translates the information fetch request and sends this request to the appropriate content server, such as a commercially available content server, for example a weather database, via the internet, an intranet or any other suitable network. The information is then provided back to the browser, typically encoded in a mark-up language for the browser to decode, such as hypertext mark-up language (HTML), wireless mark-up language (WML), extensible mark-up language (XML), Voice Extensible Mark-up Language (VoiceXML), Extensible HyperText Markup Language (XHTML), or other such mark-up languages.
In multi-modal communication, each browser may directly fetch the requested information from the content server, such that each browser may access the same content server at the same time for the same requested information in order to keep the browsers synchronized. This increases the number of “hits” on the content server, reduces available system bandwidth, and can increase costs and decrease the efficiency of the multi-modal system. Therefore, it may be more efficient to cache the requested information at an intermediate memory location, such that the content server is accessed once and the other browsers then access the intermediate memory location.
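The caching arrangement described above can be illustrated with a minimal sketch. It assumes a hypothetical `CachingIntermediary` class standing in for the intermediate memory location, a stand-in `fake_content_server` function in place of a real content server, and an illustrative URI; none of these names come from the original text.

```python
from typing import Callable, Dict


class CachingIntermediary:
    """Hypothetical intermediate memory location shared by several browsers."""

    def __init__(self, fetch_from_content_server: Callable[[str], str]) -> None:
        self._fetch = fetch_from_content_server
        self._cache: Dict[str, str] = {}
        self.content_server_hits = 0  # counts accesses to the content server

    def get(self, uri: str) -> str:
        # Only the first request for a URI reaches the content server;
        # later requests are served from the cached copy.
        if uri not in self._cache:
            self.content_server_hits += 1
            self._cache[uri] = self._fetch(uri)
        return self._cache[uri]


def fake_content_server(uri: str) -> str:
    """Stand-in for a real content server (assumption for illustration)."""
    return f"<html>content for {uri}</html>"


intermediary = CachingIntermediary(fake_content_server)

# A graphical browser and a voice browser request the same information.
graphical_view = intermediary.get("http://example.com/weather")
voice_view = intermediary.get("http://example.com/weather")
```

In this sketch both browsers receive the same content while the content server is accessed only once, which is the efficiency gain the paragraph describes. Cache expiry and invalidation are deliberately left out.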
Typically, a computer resident on a network fetches requested information through a proxy server, commonly known as a firewall server. A proxy server is a computer having a proxy, that is, an application running on a gateway that relays packets of information between a trusted client, such as the networked computer, and an untrusted host, such as the third party content server. The proxy server may act as the intermediate memory location for the multi-modal system.
Generally, a browser has a static proxy address that is independent of a particular session. When the browser is first installed on a terminal, computer or other device, a browser proxy address is assigned and manually inserted therein, via a graphical user interface (GUI). Moreover, the proxy address may be manually changed by a user via a GUI, after installation. Typically, the proxy address refers to a specific proxy server, such as a firewall server, allowing a user to safely access information from the various content servers. Therefore, whenever a browser receives a URI request, that request is transmitted through the static proxy server.
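The static proxy behavior described above may be sketched as follows. The `Browser` class, its `set_proxy` and `fetch` methods, and the proxy address and URIs are all hypothetical names chosen for illustration; the sketch only models which proxy each request is routed through and performs no network access.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Browser:
    """Hypothetical browser whose proxy address is fixed outside any session."""

    proxy_address: str  # assigned at installation time
    routed: List[Tuple[str, str]] = field(default_factory=list)

    def set_proxy(self, address: str) -> None:
        # Models the manual change a user could make through a GUI.
        self.proxy_address = address

    def fetch(self, uri: str) -> Tuple[str, str]:
        # Every request goes through whatever proxy is currently configured;
        # nothing about the session or the user's location is consulted.
        route = (self.proxy_address, uri)
        self.routed.append(route)
        return route


browser = Browser(proxy_address="firewall.example.net:8080")
browser.fetch("http://example.com/weather")
browser.fetch("http://example.com/news")

# Collect the distinct proxies used across all requests.
proxies_used = {proxy for proxy, _ in browser.routed}
```

Because the address changes only when `set_proxy` is called by hand, every request in the sketch is relayed through the same proxy server.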
Concurrent with the emergence of multi-modal technology, concerns arise regarding different types of browsers (e.g. graphical, voice, etc.) seeking information from a variety of different content servers. If a first browser, such as a graphical browser, in the terminal device retrieves a specific set of information, it is important to synchronize a second browser, such as a voice browser on a network device, with the first browser by informing it of the first browser's fetch request and successful retrieval. If the different browsers are not synchronized properly, a user may encounter problems when switching between browsers or when using multiple browsers to input commands or fetch requests.
One proposed solution is a multi-modal synchronization coordinator, which provides synchronization for multiple browsers in a multi-modal system. Even with the browsers synchronized within the multi-modal system, a problem still arises due to the browsers' generally static multi-modal proxy address. In addition to the possible bandwidth problems discussed above, the user is required to send information fetch requests through a static multi-modal proxy server, regardless of the user's location. For example, if the browser on a mobile phone has a statically assigned proxy server that is located in Chicago, but the mobile phone is being used in Atlanta, then the information fetch request from the browser has to be sent through the proxy server located in Chicago and the fetched information then routed back to Atlanta. This added round trip may reduce system efficiency.
As such, there exists a need for an improved multi-modal proxy device and method.