In recent years, mobile communication and computing devices using touch-sensitive displays, such as the iPhone™ and iPad™ products from Apple Inc., have become commonplace. Users are typically able to directly manipulate graphically-depicted interactive elements on the user interface display by placing one or more fingertips in contact with the screen and making gestures such as tapping, sliding and pinching. Touchscreens typically comprise transparent, capacitance-sensing layers and, using well-known techniques, can sense the position of multiple simultaneous points of contact between a user's fingers and the display surface. In terms of interaction with graphical interface elements, users can simulate typing on a displayed keyboard, select icons to open applications, select text fields for subsequent textual input and scroll through lists or other contents. With many such devices, users may even scroll an entire ‘home screen’ or ‘desktop’ that displays an array of icons that each represent an application to launch or a feature to invoke.
Touchscreen devices like the iPhone and other so-called ‘smartphones’ rely mainly upon the visual display and touchscreen to support user interaction and consequently provide few physical buttons or other input mechanisms that a user could locate and actuate by tactile sense. This minimization of mechanical buttons makes the user interface heavily software-driven and graphically-oriented. In some cases, however, because only a finite number of gestures are intuitive, easily remembered and readily discernible, and because these gestures are dedicated to specific interactions, the available gesture mappings quickly become exhausted. This is especially true when accessibility tools are layered on top of normally used touchscreen paradigms. Furthermore, where nearly every user interaction must take place via the touchscreen, a user who wants to freely alter some functional attribute of the device or an application must navigate through a menu hierarchy to reach a particular setting and is thus impeded from making momentary or dynamic changes to certain settings.
Blind users of such touchscreen devices are unable to see user interface elements displayed on the screen, such as simulated keyboard keys, icons, buttons and the like. However, some accommodations have been introduced, such as Apple's ‘VoiceOver’ accessibility feature, so that sound effects or synthesized speech inform a blind user of content or controls that correspond to the position of the user's finger as they touch the screen. To support this, application developers add descriptive textual labels in their application's interfaces so that, ideally, each visual page or control element also has a corresponding textual description that can be announced to a user by speech synthesis. Even without seeing the display, a user can probe the display and elicit audible responses until finding a desired function or control or content.
In addition to applications providing descriptive labels for the displayed elements, additional measures have been instituted to discriminate a single-point touching gesture used by a blind user to explore the display from a similar single-touch that would normally signify intent to launch an application or act upon a control element, such as a displayed pushbutton control. As an example of this disambiguation, Apple's VoiceOver accessibility mode notably shifts the interpretation of touchscreen gestures.
Normally, in the case where a sighted user wishes to launch an application, the user locates a corresponding icon on the home screen, selected based on the icon's visual appearance suggesting its function, and then simply taps the icon once with their fingertip. The ‘tap’ gesture is easy to directly target with one's finger given the size and spacing of the icons.
When the ‘VoiceOver mode’ is active, however, the user's single-fingered input is assumed to be an attempt to probe the environment and elicit descriptive sounds. Without this provision, a blind user's attempt to merely explore the displayed icons could not be distinguished from an intent to invoke an application or otherwise act upon touchscreen-actuated visual elements. In order for a user, during VoiceOver mode, to actually take action upon an element in the same way a single-tap gesture would normally work, the user must instead perform a ‘double-tap’. To be more specific, the user typically performs a preparatory exploration of the interface by touching the screen in various locations and hearing descriptive sounds for elements displayed under their fingertips. As various elements are contacted, a VoiceOver ‘cursor’ is shifted around to highlight the currently or most recently contacted element for which a sound was elicited.
Once the VoiceOver cursor has been used to select a user interface element, the user may subsequently execute a double-tap gesture anywhere on the screen to activate the selected control. The double-tap gesture anywhere on the screen will perform the same action that a single-tap directed at the selected element would have performed if VoiceOver mode were not active. Thus, with VoiceOver mode active, single-touch gestures effectively become intercepted and used for exploratory interaction whereas double-tap gestures are, in effect, converted to single-tap gestures as if executed upon the element that is currently highlighted.
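By way of illustration only, the reinterpretation of single-touch and double-tap gestures described above may be sketched as follows. This is a simplified model and not Apple's implementation; the names `Element`, `Screen`, `single_touch` and `double_tap`, the rectangular hit-testing, and the returned action strings are all hypothetical and chosen solely for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Element:
    """A displayed element with a textual label and a bounding rectangle (hypothetical)."""
    label: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


class Screen:
    """Toy model of gesture interpretation with and without an accessibility mode."""

    def __init__(self, elements: List[Element], accessibility_mode: bool = False):
        self.elements = list(elements)
        self.accessibility_mode = accessibility_mode
        self.cursor: Optional[Element] = None  # most recently explored element

    def _hit_test(self, px: float, py: float) -> Optional[Element]:
        for element in self.elements:
            if element.contains(px, py):
                return element
        return None

    def single_touch(self, px: float, py: float) -> Tuple[str, Optional[str]]:
        target = self._hit_test(px, py)
        if target is None:
            return ("ignore", None)
        if not self.accessibility_mode:
            # Normal mode: a single tap acts directly on the touched element.
            return ("activate", target.label)
        # Accessibility mode: the touch is exploratory; move the cursor and
        # announce the label instead of activating the element.
        self.cursor = target
        return ("announce", target.label)

    def double_tap(self, px: float, py: float) -> Tuple[str, Optional[str]]:
        if self.accessibility_mode and self.cursor is not None:
            # The tap position is disregarded; the highlighted element is activated.
            return ("activate", self.cursor.label)
        # Double-tap handling outside accessibility mode is beyond this sketch.
        return ("ignore", None)
```

In this sketch, touching a hypothetical ‘Safari’ icon while the accessibility mode is active merely announces its label and moves the cursor; a subsequent double-tap at any screen position then activates that icon.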
The VoiceOver mode also entails modification of other user inputs via the touchscreen. A scrolling action, typically performed by a single finger sliding in a scroll direction, also has to be disambiguated from the motion of simply sliding around to find displayed features without activating them. Accordingly, while in VoiceOver mode, scrolling is only engaged when three fingers come into contact with the screen. (As will be described below, two-fingered gestures are used to control page-wise reading.)
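The dependence of a sliding gesture's meaning on the number of fingers in contact may likewise be sketched, under stated assumptions, as a simple dispatch. The function name `interpret_drag` and the returned action strings are hypothetical, and finger counts not discussed above are simply reported as unassigned.

```python
def interpret_drag(finger_count: int, accessibility_mode: bool) -> str:
    """Map a sliding gesture to an action name (illustrative sketch only)."""
    if finger_count < 1:
        raise ValueError("a drag requires at least one finger in contact")
    if not accessibility_mode:
        # Normal mode: a single sliding finger scrolls the content.
        return "scroll" if finger_count == 1 else "unassigned"
    # Accessibility mode: one finger explores, two fingers control
    # page-wise reading, and three fingers scroll.
    mapping = {1: "explore", 2: "reading_control", 3: "scroll"}
    return mapping.get(finger_count, "unassigned")
```

Under this mapping, the same one-finger slide that scrolls in normal operation yields only exploration while the accessibility mode is active, which is why scrolling is moved to a three-finger gesture.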
While the above measures provide basic accessibility of touchscreen interfaces to blind or low-vision users, further improvements may be realized in terms of agility, prudent assignment of gestures to functions and ease of use, as well as in facilitating a common experience to be shared among both sighted and blind users.