To enable persons with disabilities to obtain an accessible online experience, websites must meet the requirements of accessibility standards such as WCAG 2.0. The Web Content Accessibility Guidelines (WCAG) 2.0 cover a wide range of recommendations for making web content more accessible. Following these guidelines makes content accessible to a wider range of people with disabilities, including blindness and low vision, deafness and hearing loss, learning disabilities, cognitive limitations, limited movement, speech disabilities, photosensitivity, and combinations of these. Websites must help people with all kinds of disabilities access public information services. It has been observed that websites are nowadays created in such a manner that users with disabilities can access them without other assistance. To facilitate web accessibility for users with disabilities, it is essential to assess the web accessibility of a website before the website is deployed on a server. Since a web page of a website contains a plurality of web elements of distinct behavior, one of the essential steps in assessing web accessibility is to correctly identify each web element present on the web page.
In addition to the basic web elements, there are other web elements, hereinafter referred to as complex web elements, present on the web page which cannot be identified based on the structure and the semantics of the HTML code. Examples of the complex web elements may include, but are not limited to, a menubar, a treeview, and a captcha. It is to be understood that the complex web elements cannot be directly inferred from the HTML or the HTML DOM because the complex web elements do not have a defined structure or semantics in the HTML code. Hence, traditional methods interpret the complex web elements as basic web elements and assess them in the same manner as the basic web elements.
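The limitation can be illustrated with a minimal sketch. It assumes a hypothetical tag-to-type table (`BASIC_TYPES`) and an invented markup fragment; it is not the method of any particular assessment tool. A menubar built from ordinary list and link tags is reported only as a list, list items, and links, because nothing in the tag names identifies the composite widget.

```python
from html.parser import HTMLParser

# Hypothetical mapping from HTML tag names to basic web element types.
BASIC_TYPES = {
    "a": "link",
    "img": "image",
    "input": "form field",
    "button": "button",
    "ul": "list",
    "li": "list item",
}


class BasicElementCollector(HTMLParser):
    """Records the basic element type of every start tag it recognizes."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in BASIC_TYPES:
            self.found.append(BASIC_TYPES[tag])


# Invented example: a menubar rendered with plain list and link markup.
menubar_markup = (
    '<ul class="nav">'
    '<li><a href="/file">File</a></li>'
    '<li><a href="/edit">Edit</a></li>'
    "</ul>"
)

collector = BasicElementCollector()
collector.feed(menubar_markup)
identified = collector.found
print(identified)
```

The collected types contain no entry for a menubar, so a purely tag-based assessment would check this widget only against the rules for lists and links.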
Owing to recent regulatory trends, mainly amendments to laws in several countries regarding accessibility compliance, accessibility work has gained considerable momentum across the globe. A lot of work is going on to integrate accessibility into live IT products, solutions and websites. Typical activities in such an assignment are assessing accessibility compliance and identifying defects, followed by issue remediation, accessibility validation and a final check.
The number of screens or pages in an IT product or website varies from a few hundred to a few thousand. Accessibility assessment is primarily manual work (about 70%); on average, assessing a single page takes a few hours of effort (ranging from 3 to 7 hours). Performing two rounds of testing for such a large number of pages requires a considerable amount of effort and, in turn, cost.
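To give a sense of scale, the figures above can be combined into a rough estimate. The sketch below is only illustrative: the function name `assessment_hours` and the page counts of 300 and 3000 are assumptions standing in for "a few hundred" and "a few thousand", while the 3 to 7 hours per page and the two testing rounds come from the text.

```python
def assessment_hours(pages, hours_per_page, rounds=2):
    """Total manual effort in hours for assessing every page in every round."""
    return pages * hours_per_page * rounds


# Assumed lower bound: 300 pages at 3 hours each, two rounds.
low_estimate = assessment_hours(pages=300, hours_per_page=3)

# Assumed upper bound: 3000 pages at 7 hours each, two rounds.
high_estimate = assessment_hours(pages=3000, hours_per_page=7)

print(f"Estimated effort: {low_estimate} to {high_estimate} hours")
```

Under these assumptions the effort ranges from 1800 to 42000 person-hours, which is why exhaustive page-by-page assessment is rarely practical.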
To mitigate this challenge, the IT industry widely uses the best practice of ‘sample assessment’: during the first cycle of testing, a few sample screens or pages are selected that together give close to 100% coverage of the accessibility guidelines applicable to a project. This helps reduce effort by up to 60% in the first testing cycle. However, finding the sample pages among thousands of pages requires human judgment and manual intervention, so identifying these pages demands extensive manual effort and results in high costs. Additionally, on some occasions, there is a high possibility that a human will not be able to identify all the unique pages accurately, given the complexity and constraints.
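The goal of sample assessment, a small set of pages that together cover nearly all applicable guidelines, can be sketched as a greedy coverage selection. This is an illustration of the objective only, not a method described here: it assumes that the set of guidelines applicable to each page is already known, which is precisely the part that currently requires human judgment. The function `select_sample_pages`, the page names, and the page-to-criterion mapping are hypothetical; the identifiers are WCAG 2.0 success criterion numbers.

```python
def select_sample_pages(page_guidelines, target_coverage=1.0):
    """Greedily pick pages until the target share of guidelines is covered.

    page_guidelines maps a page name to the set of guideline identifiers
    that apply to that page. Returns (selected pages, covered guidelines).
    """
    all_guidelines = set().union(*page_guidelines.values())
    needed = target_coverage * len(all_guidelines)
    covered = set()
    sample = []
    remaining = dict(page_guidelines)

    while len(covered) < needed and remaining:
        # Choose the page that adds the most uncovered guidelines;
        # sorting makes ties resolve alphabetically.
        best = max(sorted(remaining), key=lambda p: len(remaining[p] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # no remaining page adds coverage
        sample.append(best)
        covered |= gain
        del remaining[best]

    return sample, covered


# Hypothetical mapping of pages to applicable success criteria.
pages = {
    "home": {"1.1.1", "1.4.3", "2.4.1"},
    "login": {"3.3.1", "3.3.2", "1.4.3"},
    "search": {"2.4.1", "3.3.2"},
    "video": {"1.2.2", "1.1.1"},
}

sample, covered = select_sample_pages(pages)
print(sample, len(covered))
```

In this invented example, three of the four pages suffice to cover all six criteria. Greedy selection does not guarantee the smallest possible sample, but it is a common and simple heuristic for coverage problems of this kind.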
One existing method uses web page classification based on document structure. It proposes classifying pages into three broad categories: information, research and personal. The aim is to estimate the type of data available on a website. Another method samples pages uniformly from the World Wide Web. It proposes two algorithms for generating a uniformly random sample set of web pages from the World Wide Web. The aim is to sample and index pages for use by search engines. Several attempts have been made to categorize web pages, with varying degrees of success. None of these methods has been convincing enough to be used for assessing the accessibility of websites.