1. Field of the Invention
The present invention generally relates to data processing, and more specifically, to methods of programmatically reading web page content.
2. Description of the Related Art
Computer networks were developed to allow multiple computers to communicate with each other. In general, a network can include a combination of hardware and software that cooperates to facilitate the desired communications. One example of a computer network is the Internet, a sophisticated worldwide network of computer system resources.
Many networks, such as the Internet, are designed for use with a network browser to enable navigation between network addresses. A browser is an application program or facility that normally resides on a user's workstation and is invoked when the user decides to access network addresses. A prior art Internet browser program typically accesses a given network address according to an addressing format known as a uniform resource locator (URL). When a user selects a particular URL, the browser retrieves a web page associated with that URL. FIG. 1 illustrates an embodiment of a typical web page 100. Once the web page is downloaded and rendered on a display screen, the user can read the content 110 displayed on that web page.
Many web pages, however, contain “clutter,” such as links 120 to other pages, menus 130, and/or advertisements 140 at the top of the page. A user who is uninterested in viewing the “clutter” can simply skim past it and navigate directly to the content 110 of the page by using the scroll bar 150, a mouse (not shown), or a keyboard (not shown).
However, sight-impaired users have difficulty navigating to the area of interest, e.g., the content 110, due to their inability to view the web page. In general, sight-impaired users browse the web using a web page reader, such as Home Page Reader (HPR) by International Business Machines Corporation of Armonk, N.Y. HPR uses text-to-speech processing to read the content of a web page aloud to the sight-impaired user through a set of speakers. HPR provides the sight-impaired user with some tools for navigating through the page, such as a “skip to the next paragraph” function and a “skip to the next sentence” function. However, none of these tools allows the sight-impaired user to avoid hearing the “clutter” on the page and go directly to the content of the page that is of interest to the user.
Several consortia of web page designers have made efforts to assist sight-impaired users in dealing with this situation. One accepted convention is to place a hidden hyperlink near the top of the page that states, “skip to main topic.” This feature is helpful, but the sight-impaired user is still at the mercy of each web page designer to incorporate this feature into the web page.
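By way of illustration only, the hidden-hyperlink convention can be sketched in Python as follows. This is a minimal sketch and not part of any web page reader described above; the class name SkipLinkFinder, the function name find_skip_targets, and the assumption that the link text begins with "skip to" and points to an in-page anchor are all hypothetical choices made for the example.

```python
from html.parser import HTMLParser


class SkipLinkFinder(HTMLParser):
    """Collects in-page anchors whose link text starts with 'skip to'.

    Hypothetical helper: the matching rule is an assumption, since
    designers are free to word a skip link however they like.
    """

    def __init__(self):
        super().__init__()
        self._href = None        # href of the <a> currently open, if any
        self._text = []          # text fragments seen inside that <a>
        self.skip_targets = []   # fragment identifiers such as "main"

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Normalize whitespace and case before matching the label.
            label = " ".join("".join(self._text).split()).lower()
            if label.startswith("skip to") and self._href.startswith("#"):
                self.skip_targets.append(self._href[1:])
            self._href = None


def find_skip_targets(html):
    """Return the anchor names that skip links in `html` point to."""
    finder = SkipLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.skip_targets


page = (
    '<a href="#main" class="hidden">Skip to main topic</a>'
    '<a href="/ads">Advertisement</a>'
    '<h1 id="main">Headline</h1>'
)
targets = find_skip_targets(page)
```

For a page whose designer omitted the link, find_skip_targets returns an empty list, which reflects the limitation noted above: the convention helps only where each designer has chosen to follow it.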
Moreover, section 508 of the Rehabilitation Act Amendments of 1998 requires all United States federal agencies to make their information technology accessible to their employees and customers with disabilities. That is, all new IT equipment and services purchased by federal agencies must be accessible. This rule applies to all electronic equipment used in federal agencies (not just workstations). The law also gives federal employees and members of the public the right to sue if a government agency does not provide access to information and data comparable to that available to people without disabilities. All state agencies that receive federal funds under the Assistive Technology Act of 1998 are also required to comply with section 508 requirements.
Therefore, there exists a need for improved methods and apparatus for reading web page content to sight-impaired users.