In recent years, mobile communication and computing devices using touch-sensitive displays, such as the ‘iPhone’™ from Apple Inc., have become commonplace. Users are typically able to directly manipulate graphically depicted interactive elements on the user interface display by placing one or more fingertips in contact with the screen and making gestures such as tapping, sliding and pinching. Touchscreens typically comprise transparent, capacitance-sensing layers and, using well-known techniques, can sense the position of multiple simultaneous points of contact between a user's fingers and the display surface. In terms of interaction with graphical interface elements, users can simulate typing on a displayed keyboard, select icons to open applications, select text fields for subsequent textual input and scroll through lists or other contents. With many such devices, users may even scroll an entire ‘home screen’ or ‘desktop’ that displays an array of icons, each representing an application to launch or a feature to invoke.
Touchscreen devices like the iPhone rely mainly upon the visual display and touchscreen to support user interaction and tend to provide few physical buttons or other input mechanisms that a user could locate and actuate by tactile sense. This minimization of mechanical buttons makes for a heavily software-driven and graphically oriented user interface that supports fingertip gestures. While this lends versatility to the device, there is a practical limit on the number of gestures that are intuitive, easily remembered and readily distinguishable. As the possible gestures are dedicated to specific interactions, the available gesture mappings are quickly exhausted. This is especially true when accessibility tools are layered on top of normally used touchscreen paradigms. Furthermore, where nearly every user interaction must take place via the touchscreen, a user who wants to freely alter some functional attribute of the device or an application must navigate through a menu hierarchy to reach a particular setting and is thus impeded from making momentary or dynamic changes to certain settings.
Vision-impaired users of such touchscreen devices may be unable to see user interface elements displayed on the screen, such as simulated keyboard keys, icons, buttons and the like. However, some accommodations have been introduced, such as Apple's ‘VoiceOver’ accessibility feature, so that sound effects or synthesized speech inform a vision-impaired user of content or controls that correspond to the position of the user's finger as they touch the screen. To support this, application developers add descriptive textual labels in their application's interfaces so that, ideally, each visual page or control element also has a corresponding textual description that can be announced to a user by speech synthesis. Consequently, even without seeing the displayed elements, a user can probe the display and elicit audible responses until finding a desired function or control or content.
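By way of illustration only, the association between a touched position and its descriptive label may be sketched as a simple hit test over labeled regions. The sketch below is written in Python under stated assumptions: the names `Element` and `describe_at` are hypothetical, the rectangular bounds are an assumed simplification, and nothing here represents the actual VoiceOver implementation or any Apple programming interface.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Element:
    """A displayed element with a developer-supplied textual label (hypothetical type)."""
    label: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        # Assumption: each element occupies an axis-aligned rectangle.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def describe_at(elements: Sequence[Element], px: float, py: float) -> Optional[str]:
    """Return the label to announce for the touched position, or None if no element is there."""
    # Assumption: later elements in the sequence are drawn on top of earlier ones.
    for element in reversed(elements):
        if element.contains(px, py):
            return element.label
    return None


if __name__ == "__main__":
    screen = [
        Element("Mail", 0, 0, 100, 100),
        Element("Safari", 100, 0, 100, 100),
    ]
    # The returned text would be passed to a speech synthesizer.
    print(describe_at(screen, 150, 10))
```

In such an arrangement, an element lacking a label yields nothing useful to announce, which is why the approach depends upon developers labeling each element.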
In addition to having applications provide descriptive labels for the displayed elements, additional measures have been instituted to distinguish between a single-point touching gesture used by a blind user to explore the display and a similar single-touch that would normally signify intent to launch an application or act upon a control element, such as a displayed pushbutton control. As an example of this disambiguation, Apple's VoiceOver accessibility mode notably shifts the interpretation of touchscreen gestures.
When the ‘VoiceOver mode’ is active, the user's single-fingered input is assumed to be an attempt only to probe the environment and elicit descriptive sounds. Without this provision, a blind user's attempt to merely explore the displayed icons could not be distinguished from an intent to invoke an application or otherwise act upon touchscreen-actuated visual elements.
In order for a user, during VoiceOver mode, to actually take action upon an element in the same way a single-tap gesture would normally work, the user must instead perform a ‘double-tap’, or in some instances a ‘triple-tap’. In other words, the user typically performs a preparatory exploration of the interface by touching the screen in various locations and hearing descriptive sounds for elements displayed under their fingertips. As various elements are contacted, a VoiceOver ‘cursor’ is shifted around to highlight the currently or most recently contacted element for which a sound was elicited. Once the VoiceOver cursor has been used to select a user interface element, the user may subsequently execute a double-tap gesture anywhere on the screen to activate the selected control. The double-tap gesture anywhere on the screen will perform the same action that a single-tap directed at the selected element would have performed if VoiceOver mode were not active. Thus, with VoiceOver mode active, single-touch gestures are effectively intercepted and used for exploratory interaction, whereas double-tap gestures are, in effect, converted to single-tap gestures as if executed upon the element that is currently highlighted.
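The reinterpretation of single-touch and double-tap gestures described above may be sketched, purely for illustration, as a small state machine that tracks a cursor. The following Python sketch assumes hypothetical names (`Control`, `ScreenReaderInterpreter`, `announce`) and rectangular control bounds; it is not the actual VoiceOver implementation, and it omits the triple-tap case.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Control:
    """A displayed control with a label and an action (hypothetical type)."""
    label: str
    x: float
    y: float
    width: float
    height: float
    action: Callable[[], None]

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


class ScreenReaderInterpreter:
    """Routes touches differently depending on whether the accessibility mode is active."""

    def __init__(self, controls: List[Control], announce: Callable[[str], None]):
        self.controls = controls
        self.announce = announce          # e.g. a speech synthesis callback
        self.cursor: Optional[Control] = None
        self.screen_reader_active = True

    def _hit(self, px: float, py: float) -> Optional[Control]:
        for control in self.controls:
            if control.contains(px, py):
                return control
        return None

    def single_touch(self, px: float, py: float) -> None:
        target = self._hit(px, py)
        if target is None:
            return
        if self.screen_reader_active:
            # Exploration only: move the cursor and describe, never activate.
            self.cursor = target
            self.announce(target.label)
        else:
            # Normal mode: a single tap acts directly on the touched control.
            target.action()

    def double_tap(self, px: float, py: float) -> None:
        # Normal-mode double-tap handling is outside the scope of this sketch.
        if self.screen_reader_active and self.cursor is not None:
            # The tap position is ignored; the highlighted control is activated.
            self.cursor.action()
```

In this sketch the position passed to `double_tap` is deliberately unused while the mode is active, mirroring the description above in which a double-tap anywhere on the screen acts upon the currently highlighted element.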
The VoiceOver mode also entails modification of other user inputs via the touchscreen. A scrolling action, typically performed by a single finger sliding in a scroll direction, also has to be disambiguated from the motion of simply sliding around to find displayed features without activating them. Accordingly, while in VoiceOver mode, scrolling is only engaged when three fingers come into contact with the screen. (Two-fingered gestures are already assigned to controlling page-wise reading.)
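The finger-count disambiguation just described may be summarized, as a minimal sketch, by a function that maps the number of sliding fingers and the current mode to an intended interaction. The names `Intent` and `classify_drag` are hypothetical, and the treatment of multi-finger slides outside VoiceOver mode as unassigned is an assumption of this sketch rather than a statement about any actual device.

```python
from enum import Enum


class Intent(Enum):
    EXPLORE = "explore"            # probe the display and elicit descriptions
    PAGE_READING = "page_reading"  # control page-wise reading
    SCROLL = "scroll"
    UNASSIGNED = "unassigned"      # not covered by this sketch


def classify_drag(finger_count: int, screen_reader_active: bool) -> Intent:
    """Map a sliding gesture to an intent based on finger count and mode."""
    if finger_count < 1:
        raise ValueError("a sliding gesture needs at least one finger")
    if not screen_reader_active:
        # Normal mode: a single sliding finger scrolls.
        return Intent.SCROLL if finger_count == 1 else Intent.UNASSIGNED
    # Accessibility mode: the single-finger slide is taken over by exploration,
    # so scrolling is displaced to a three-finger slide.
    mapping = {
        1: Intent.EXPLORE,
        2: Intent.PAGE_READING,
        3: Intent.SCROLL,
    }
    return mapping.get(finger_count, Intent.UNASSIGNED)
```

The sketch makes the cost of layering visible: each gesture reassigned for exploration pushes an existing function onto a gesture requiring more fingers, consuming the limited set of distinguishable gestures.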
While the above measures improve basic accessibility of touchscreen interfaces to blind or low-vision users, further improvements may be realized in terms of agility, ease of use and efficient use of available gesture combinations, or in facilitating a common experience shared by both sighted and vision-impaired users.