Given the dramatic increase in the availability of various types and quantities of information and a sharp decrease in time and/or availability of traditional facilities to access such information, individuals currently desire to be able to access, act on, and/or transform any information from any device at any time. In the case of the Internet, for instance, large quantities and varieties of information are available. Traditionally, however, the Internet mostly supported only devices that access information using a HyperText Markup Language (HTML) browser on top of a HyperText Transfer Protocol (HTTP) network, which in turn is provided on top of TCP/IP (Transmission Control Protocol/Internet Protocol).
Solutions to this problem centered around rewriting application programs used to access such information so that the information could be accessed in other ways. One solution led to the development of the Wireless Application Protocol (WAP), see, http://www.mobilewap.com. WAP is equivalent to HTTP for a wireless network. A Wireless Markup Language (WML) was developed which is equivalent to HTML for a wireless network. Thus, similar to how HTML is used on top of HTTP, WML is used on top of WAP. WAP and WML allow a user to access the Internet over a cellular phone with constrained screen rendering and limited bandwidth connection capabilities. CHTML is another example of an ML (markup language) addressing this space.
More recently, a mechanism was developed for bringing the Web programming model (also known as fat client programming model) to voice access and, in particular, to telephone access and Interactive Voice Response (IVR) systems. Such a mechanism is typically known as a speech browser (or voice browser). Such a speech browser is described in the above-referenced U.S. provisional patent application identified as U.S. Ser. No. 60/102,957. The speech browser may use a speech based variation of the Extensible Markup Language (XML) known as VoiceXML, see, e.g., http://www.voicexml.org. The speech browser can also operate on top of the WAP protocol and in conjunction with exchanges of WML data.
However, such an approach poses certain problems for application programmers if they want to offer multi-channel support, that is, access to web browsers (HTML browsers), phones (voice browsers) and wireless browsers (WML browsers) or multi-modal/conversational browsers, as defined in the aforementioned disclosures. First, with this approach, the application programmer must deal with at least three different languages when developing an application, e.g., HTML, WML and VoiceXML. That is, since a user may access Internet based information via a speech browser over a conventional telephone, over a wireless connection using a WAP browser, or using a conventional web browser, HTML, WML and VoiceXML must all be employed when writing the application. This is known to be quite burdensome to the application developer. Secondly, with this approach, there is no suitable way to synchronize multi-modal applications, for example, applications that provide for both visual and speech based user interaction with the browser or browsers employed to access the application.
Applications have traditionally been developed such that both content (i.e., information or other data) and presentation (i.e., manner in which the content is presented to the user) were mixed. However, in an attempt to simplify application programming, an effort was made to separate content from presentation. This led to the development of the Extensible Stylesheet Language (XSL), which operates in conjunction with XML such that content associated with an application is stored in XML and the transformations necessary to present the content on a specific device are handled by XSL, see, http://www.w3.org/Style/XSL. Such an approach has been adopted by the W3C (World Wide Web Consortium). This approach is typically used to adapt the presentation to the characteristics of the main browsers (e.g., different versions of Microsoft Internet Explorer, Netscape Communicator/Navigator, other less popular browsers, etc.). Some have tried to extend this use to other modalities/channels (e.g., a wireless browser supporting a format like WML on top of embedded devices such as a wireless phone or PDA). This last approach has never been very successful or convenient, and in any case it requires multiple authoring of the XSL pages. Moreover, the XSL approach has the disadvantage of being both application and device/channel dependent. That is, XSL rules are dependent on the application and device for which the content is to be transcribed. Thus, if an application is to be accessed from a new device, new XSL transformations must be written for that device.
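By way of illustration only, the device dependence described above may be sketched as follows. The sketch does not use XSL itself; plain Python functions stand in for per-device stylesheets, and the menu content, the function names (to_html, to_wml, render) and the RULES table are all hypothetical names assumed for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical application content, authored once as XML.
CONTENT = """
<menu title="Main menu">
  <item href="news">News</item>
  <item href="mail">Mail</item>
</menu>
"""

def to_html(menu):
    """Stand-in for a stylesheet that targets an HTML browser."""
    title = menu.get("title")
    parts = []
    for item in menu.findall("item"):
        href = item.get("href")
        parts.append(f'<li><a href="{href}.html">{item.text}</a></li>')
    items = "".join(parts)
    return f"<html><body><h1>{title}</h1><ul>{items}</ul></body></html>"

def to_wml(menu):
    """Stand-in for a stylesheet that targets a WML browser."""
    title = menu.get("title")
    parts = []
    for item in menu.findall("item"):
        href = item.get("href")
        parts.append(f'<a href="{href}.wml">{item.text}</a><br/>')
    items = "".join(parts)
    return f'<wml><card title="{title}"><p>{items}</p></card></wml>'

# One hand-written rule set per device/channel (assumed for illustration).
RULES = {"html": to_html, "wml": to_wml}

def render(xml_text, device):
    """Present the same XML content using the rules written for one device."""
    menu = ET.fromstring(xml_text)
    if device not in RULES:
        # A new device cannot be served until someone authors rules for it.
        raise LookupError(f"no transformation rules written for device {device!r}")
    return RULES[device](menu)
```

In this sketch, a call such as render(CONTENT, "voicexml") fails until a third rule set is authored, which mirrors the multiple-authoring burden noted above.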
Other attempts to overcome some of these problems have been made. There have been attempts to provide an XML model based on user intention (complex and generally task oriented intentions). User intentions may be modeled with complex components that cannot be, or are very difficult to be, rendered on devices with small screens or with speech. These complex components, not being decomposed into smaller atomic components, also cannot be tightly synchronized across modalities. Tags independent of the device are offered which are rendered by different browsers. Also, some extensions to speech interactive voice response (IVR) systems have been proposed. However, among other deficiencies, these attempts do not model dialog, and transcoding from modality to modality is generally an impossible task.
In these approaches, user intentions are modeled with complex components that describe complex interactions. However, they are typically application-specific. That is, they depend on, characterize, or directly involve business logic concepts and elements. Therefore, in the same way that XSL rules (and XSL style sheets) are today fundamentally a function of the application or application domain (i.e., the nature of the XML attribute involved), the XSL rules used to transform pages written with these languages are also fundamentally a function of the application or application domain. They must be rewritten for each new application. This characterizes the limitation of these approaches. These approaches do not help to offer access to content independent of the access modality. Indeed, these approaches only allow access to content related to a given application or application domain. Any other case requires rewriting the transformation rules. Thus, there is a need to free transformation rules from the backend application and to make them depend only on characteristics/modalities supported by the access device or channel.
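The application dependence of such transformation rules may be sketched, purely as an illustration, as follows. A plain Python function again stands in for an XSL rule set; the bookstore and flight vocabularies, and the name bookstore_to_wml, are hypothetical and assumed only for the example.

```python
import xml.etree.ElementTree as ET

def bookstore_to_wml(root):
    """Stand-in for rules tied to a hypothetical bookstore vocabulary.

    The rule only knows the element names <book>, <title> and <price>,
    so it is a function of the application domain, not just of the device.
    """
    lines = []
    for book in root.findall("book"):
        lines.append(f"{book.findtext('title')}: {book.findtext('price')}")
    return "<wml><card><p>" + "<br/>".join(lines) + "</p></card></wml>"

# Content from the application the rule was written for (assumed).
BOOKS = "<catalog><book><title>XML Basics</title><price>20</price></book></catalog>"

# Content from a different application domain (assumed).
FLIGHTS = "<schedule><flight><from>JFK</from><to>SFO</to></flight></schedule>"

books_card = bookstore_to_wml(ET.fromstring(BOOKS))
flights_card = bookstore_to_wml(ET.fromstring(FLIGHTS))
```

Under these assumptions, books_card contains the book listing, whereas flights_card is an empty card: the same target device is addressed, yet the rule conveys none of the second application's content until it is rewritten for that application.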
Note that in some cases, support of multiple channels has been achieved by using cascades of stylesheets and treating the resulting XML stream as serialized internal APIs (Application Programming Interfaces). Again, this requires multiple authoring.
In addition, the above approaches result in very complex intention models whose components have no appropriate corresponding rendering in modalities like WML. It is apparent that these models were designed to offer the capability to customize the graphical user interface (GUI) presentation to the requirements of different types of display (i.e., essentially within variations of the same channel and modality) or browsers. As a result, none of these approaches appropriately models and treats speech or multi-modal user interfaces.
As already mentioned, conventional transcoding (XSL rules used to present the XML content, and a change of XSL style sheet to go from one modality to another) has been considered to support different access modalities. This means that for a given XML content, by changing the XSL rules, the system can produce an HTML page, a WML page, or even a VoiceXML page, etc. Actually, this is what is being used today to support the different web browsers on the market, e.g., Netscape Communicator, Microsoft Internet Explorer, Sun Microsystems HotJava, Spyglass browser, Open Source Amaya browser/editor, etc. Unfortunately, this is possible only if:
(i) The XSL rules are application or application domain specific (i.e., the nature of the XML attribute); and
(ii) Transcoding is between two languages, for example HTML to WML, and the original content has been built in HTML while following very strict rules of authoring. Indeed, this is enforceable only within a given company, for a given web site. Even in those cases, it is hardly implementable, in general, because information needed to provide the corresponding components in other modalities is missing across markup languages or modalities (e.g., an HTML form or menu does not provide the information required to render it automatically by voice), and because dialog navigation flows differ between modalities.
Accordingly, there is a need for an application programming language and information browsing mechanisms associated therewith which overcome these and other shortcomings attributed to existing languages and browsers.