In recent years, as network technologies and Internet infrastructure have become widespread, computer users have come to display web pages visually on a computer screen via the Internet and to acquire various kinds of information from them. In existing web page creation methods, the primary consideration is that the web page provides the user with information visually.
However, with respect to information processability, information acquirability, and the quickness thereof on a personal computer, the existing web page, which is intended to provide a graphical user interface, has been regarded as having various difficulties. For example, the web page is typically displayed as a graphical user interface on the screen of the computer. In this case, a user who cannot recognize the web page visually (hereinafter, in the present invention, referred to as a non-visual access user) may have insufficient access to the graphical user interface displayed as the web page, or no access to it at all. This results in the disadvantage that the non-visual access user cannot acquire important content, for example, the main content to be provided through the web page.
As described above, from the viewpoint of the non-visual access user, it is difficult to say that the existing web page is sufficiently accessible. The reasons include, for example: that the non-visual access user cannot access the content directly by using pointers and icons, which are displayed as shapes such as arrows and whose positions are controlled by pointing means such as a mouse, a stylus pen, keyboard operations, or a joystick; that the way the non-visual access user recognizes space, from two dimensions down to one dimension, is completely different from that of a visual access user; and that even if the important main content is highlighted, the highlighting cannot be recognized by the non-visual access user.
In order to remedy the above described disadvantage, even partly, a voice response system has conventionally been proposed which renders a structured document, such as text, HTML (HyperText Markup Language), DHTML (Dynamic HyperText Markup Language), SGML (Standard Generalized Markup Language), or XML (Extensible Markup Language), as synthetic voice via a voice synthesis system, and provides it to the non-visual access user by means of a speaker. However, navigation by voice alone has had the disadvantage that it cannot ensure acquirability of the main content, because it takes a long time for the non-visual access user to reach the main content among the contents displayed on the web page, or because the user may ultimately fail to reach the required content.
In addition, for the above described purpose, a voice browsing system, the voice response system, and the like use VoiceXML and the like to create web pages whose content is suited to each system, and thereby provide services. However, since only limited information is provided in such services, these systems cannot make the vast amount of information on the web effectively available to the non-visual access user.
In the conventionally proposed voice browsing system, the non-visual access user accesses the web page with the same browser as an ordinary user (Internet Explorer® or Netscape Navigator®), by means of a voice browser or a screen reader installed at the user site. The conventional voice browsing system has provided the user with voice navigation by extracting only the text information that can be output as voice and reading aloud a file, for example an HTML file, sequentially from its beginning.
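The sequential reading described above can be illustrated by the following minimal sketch, which is not taken from any actual voice browser or screen reader. It assumes Python's standard `html.parser` module; the names `SequentialTextExtractor`, `reading_order`, and `SAMPLE_PAGE`, as well as the sample page itself, are hypothetical and introduced only for illustration. The sketch extracts only the text that could be output as voice, in document order from the beginning of the file, and discards layout and emphasis.

```python
from html.parser import HTMLParser


class SequentialTextExtractor(HTMLParser):
    """Collects speakable text in document order, ignoring layout and emphasis.

    Hypothetical illustration of the conventional approach; not an actual
    voice browser or screen reader implementation.
    """

    SKIPPED = {"script", "style"}  # element content that is never spoken

    def __init__(self):
        super().__init__()
        self.chunks = []       # text fragments in the order they would be read
        self._skip_depth = 0   # greater than 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace and line breaks
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def reading_order(html):
    """Return the text fragments of `html` in the order they would be spoken."""
    parser = SequentialTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical sample page: navigation links precede the highlighted main content.
SAMPLE_PAGE = """
<html><head><title>News</title><style>b { color: red }</style></head>
<body>
  <ul><li><a href="/">Home</a></li><li><a href="/sports">Sports</a></li></ul>
  <h1><b>Main story</b></h1>
  <p>Body of the main story.</p>
</body></html>
"""

if __name__ == "__main__":
    for position, chunk in enumerate(reading_order(SAMPLE_PAGE), start=1):
        print(position, chunk)
```

In this sketch, the visually highlighted main content is reached only after the page title and every navigation link have been read, and the highlighting itself is lost, which corresponds to the disadvantage of voice-only navigation described above.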