With the spread of the Internet, Web browsing, in which documents described in HTML (HyperText Markup Language) and held on servers connected to the Internet are displayed by a browser on a personal computer, continues to grow.
In an HTML document, descriptions of the document structure and descriptions of the expression style are mixed for historical reasons. CSS (Cascading Style Sheets), a format that separates the expression style from the structure, is also in widespread use.
Even when CSS (expression style) is separated from HTML (document structure + expression style), the document structure of HTML is itself designed with the expression style in mind. For this reason, an approach is also spreading in which a document is described in XML (Extensible Markup Language), which expresses only the tree structure of the document contents, together with XSL (Extensible Stylesheet Language), which converts that tree structure into a form having an expression style.
JavaScript, VBScript, and the like are available as script languages that describe operations on HTML; a script can be written directly in an HTML file or included as an external file.
FIGS. 32 and 33 show examples of documents described using XML and XSL, and FIGS. 34 to 37 respectively show examples of an HTML document and CSS file generated based on these documents, an example of a JavaScript file, and a display example of a browser.
Note that this browser is installed on a general-purpose computer such as a personal computer. When the browser is launched in response to a user operation using an input device such as a mouse or keyboard, its browser window is displayed on a display.
When the user presses a “reverse” button 2501 on a browser window 2500 in FIG. 37, the onClick handler described in FIG. 34 is invoked, and the background and text colors in the browser window 2500 are reversed by the function reverseColor() described in JavaScript in FIG. 36.
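As a rough illustration of the behavior just described, the following is a minimal sketch and not the contents of FIGS. 34 and 36. It assumes that reverseColor() simply exchanges the background color and the text color. The `style` parameter, the initial colors, and the button markup shown in the comment are assumptions introduced only for illustration.

```javascript
// Hypothetical sketch: exchange the background and text colors of a style object.
// In a browser, `style` would typically be document.body.style.
function reverseColor(style) {
  const previousBackground = style.backgroundColor;
  style.backgroundColor = style.color;
  style.color = previousBackground;
  return style;
}

// Assumed button markup (not taken from FIG. 34):
// <input type="button" value="reverse" onClick="reverseColor(document.body.style)">

// Stand-alone usage with a plain object in place of a DOM style object.
const pageStyle = { backgroundColor: "white", color: "black" };
reverseColor(pageStyle);
```

Passing the style object as an argument keeps the sketch runnable outside a browser; pressing the button a second time would restore the original colors.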
As described above, when various stylesheets such as CSS and XSL are prepared and switched as needed, a single XML document that indicates only the tree structure of the document contents can be presented in different ways according to the intended purpose. In addition, a script language can describe the actions to be taken upon, e.g., the pressing of a button.
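The idea of switching stylesheets over one structure-only document can be sketched as follows. This is an illustrative assumption rather than the XML and XSL of FIGS. 32 and 33: a plain JavaScript object stands in for the XML tree, and the `stylesheets` table and the `render` helper are hypothetical stand-ins for stylesheets and their application.

```javascript
// Hypothetical stand-in for an XML document: structure only, no style.
const doc = { title: "Greeting", paragraphs: ["Hello.", "Goodbye."] };

// Hypothetical "stylesheets": each converts the same tree into one expression style.
const stylesheets = {
  html: (d) =>
    "<h1>" + d.title + "</h1>" + d.paragraphs.map((p) => "<p>" + p + "</p>").join(""),
  text: (d) => [d.title.toUpperCase(), ...d.paragraphs].join("\n"),
};

// Select a stylesheet by purpose and apply it to the unchanged document.
function render(d, purpose) {
  const stylesheet = stylesheets[purpose];
  if (!stylesheet) {
    throw new Error("No stylesheet for purpose: " + purpose);
  }
  return stylesheet(d);
}

const asHtml = render(doc, "html");
const asText = render(doc, "text");
```

The document object is never modified; only the selected conversion changes, which mirrors the separation of structure and expression style described above.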
Meanwhile, the performance of not only personal computers but also mobile terminals that users carry every day, such as portable phones, PHS (Personal Handyphone System) terminals, and PDAs (Personal Digital Assistants), has improved, and high-end mobile terminals have processing performance equivalent to that of a personal computer of one generation earlier.
Such high-end mobile terminals have the following features.
(1) Mobile terminals can connect to a host computer via a public network or wireless LAN and perform data communication.
(2) Most mobile terminals have audio input/output devices (microphone, loudspeaker, and the like).
However, a high-end mobile terminal has a small GUI (Graphical User Interface) window, i.e., a poor ability to display GUI information. In addition to high-end terminals, many non-high-end mobile terminals are commercially available, and some of these cannot display GUI information at all.
Given these circumstances of mobile terminals, it is important to implement a multimodal interface that allows some or all operations and responses to be carried out by speech.
With regard to handling multimodal documents, some high-end mobile terminals can perform speech recognition and speech synthesis. The remaining mobile terminals, however, either cannot perform speech recognition and speech synthesis or can perform them only at a poor level.
Nevertheless, technical advances and size reductions in hardware may enable models that have so far been unable to perform audio processing to do so. Most mobile terminals have a small display screen for the sake of portability. However, with the advent of high-definition display screens and lightweight hardware, even a mobile terminal can display many kinds of information on its display screen.
Furthermore, even with a mobile terminal having both audio and GUI modalities, the user may want to use speech only in an environment in which operating the GUI is difficult, or to use the GUI only in an environment in which he or she does not want to use speech input/output.