The inventors are aware that the graphical user interface (GUI) is well established as a paradigm for human interaction with computers and other devices. It requires visually communicating a selected part of the computer's changing state to the human user, while tracking certain aspects of the user's bodily motion and interpreting them as control inputs to the computer.
The limitations on the speed, accuracy and amplitude of human movement have been widely studied using a constraint called Fitts' Law. In the field of HCI it puts a lower limit on the time required by a human to complete a movement over a known distance A to a target of known size W. The central claim of Fitts' Law is that the ratio A/W is the sole determinant of an index of difficulty, which in turn predicts the minimum movement time.
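By way of illustration only, the relation described above may be sketched as follows. The sketch assumes the widely used Shannon formulation of the index of difficulty, ID = log2(A/W + 1), which is one of several formulations in the literature, and a linear movement-time model MT = a + b * ID. The function names and the constants a and b are hypothetical placeholders; in practice they are fitted empirically for a given user and device.

```python
import math


def index_of_difficulty(amplitude: float, width: float) -> float:
    """Index of difficulty in bits, Shannon formulation: log2(A/W + 1)."""
    if amplitude < 0 or width <= 0:
        raise ValueError("amplitude must be >= 0 and width must be > 0")
    return math.log2(amplitude / width + 1.0)


def movement_time(amplitude: float, width: float,
                  a: float = 0.1, b: float = 0.15) -> float:
    """Predicted movement time in seconds, MT = a + b * ID.

    The defaults for a and b are illustrative assumptions only.
    """
    return a + b * index_of_difficulty(amplitude, width)


if __name__ == "__main__":
    # Only the ratio A/W matters: scaling both by the same factor
    # leaves the index of difficulty unchanged.
    print(index_of_difficulty(70.0, 10.0))
    print(index_of_difficulty(140.0, 20.0))
    print(movement_time(70.0, 10.0))
```

The two calls to index_of_difficulty use the same ratio A/W and therefore yield the same value, which is the sense in which the ratio is the sole determinant.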
Even in the simple tasks considered by Fitts (reciprocal tapping, disc transfer and pin transfer), an additional geometrical constraint has to be assumed for the tasks to even make sense: W has to be smaller than A/2, otherwise no movement is necessary, since the target has already been reached.
Human capacity is thus not the only constraint on this type of problem, especially when a large number of targets are presented for user selection, as is done in the typical GUI. With many objects of interest competing to be selected, the geometrical constraints proliferate and grow in importance. They stem from the desire to avoid ambiguity, and therefore to disallow the overlap of objects. In the simple case of equally sized objects, the common size W has to be smaller than half the distance between the pair of nearest neighbours in the space, echoing Fitts' geometrical constraint.
The pointing advantage of any object may be increased by either increasing its size, or bringing it closer to the current position of the user's pointer. However, these two strategies clash when the pointing advantage of many objects has to be increased, due to the above-mentioned geometrical constraints. If the objects are tiled in space without overlap, any increase in size has to be compensated for by moving the objects further apart.
This inverse relationship between object sizes and inter-object distances is quite general, and a trade-off needs to be made between rate of navigation and ease of acquisition. The space available to accommodate GUI objects is always limited, either by the hardware used or by the maximum extent of human vision and movement.
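The trade-off may be illustrated with a minimal sketch, assuming equally sized objects tiled edge to edge along one dimension of a space of fixed extent. Under that assumption the centre-to-centre spacing equals the object size, so enlarging all objects leaves the ratio of distance to size for the k-th neighbour unchanged, while fewer objects fit in the available space. The function names are hypothetical.

```python
def objects_that_fit(extent: float, size: float) -> int:
    """Number of equally sized, non-overlapping objects that fit in the extent."""
    if size <= 0:
        raise ValueError("size must be > 0")
    return int(extent // size)


def distance_to_size_ratio(k: int, size: float) -> float:
    """A/W for the k-th neighbour when objects are tiled edge to edge.

    The centre-to-centre distance to the k-th neighbour is k * size,
    so the ratio is k regardless of the object size.
    """
    distance = k * size
    return distance / size


if __name__ == "__main__":
    for size in (50.0, 100.0):
        print(size, objects_that_fit(1000.0, size), distance_to_size_ratio(3, size))
```

Doubling the size halves the number of objects that can be accommodated, without making the third neighbour any easier to reach in terms of the ratio A/W.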
The Functional Physical Subspaces of Human-Computer Interaction (HCI)
The visual communication during HCI takes place via an optical projection from a display space, functionally created by the computer display device's ability to generate light, into a visual space, functionally created by the ability of the user's eye to see. The light-sensitive retina is a surface, and the projected image of the visual space on it is two-dimensional. The two spatial coordinates of a point in visual space thus detectable by one eye may be interpreted as angles of incidence, leaving distance to be determined by other means. Each eye has its own visual space, although the visual spaces of one person's two eyes usually overlap substantially.
Sports arenas, shop windows and theatrical stages are inherently three-dimensional display spaces, but like the retina, computer display devices are based on surfaces. Computer synthesis of a three-dimensional display space may be achieved by using two different but coordinated display surfaces, one aimed at each of the two eyes. The mental synthesis of a three-dimensional cognitive visual space is somehow accomplished by the human brain. Visual objects are mostly perceived to be “out there” in visual space and not connected to the eye.
The tracking of human movement during HCI takes place in the intersection between a motor space, functionally created by the user's ability to move physical things, and a control space, functionally created by the computer input device's tracking range or freedom to be moved. These two spaces are primarily three-dimensional, although the action is usually transferred across a curved contact or control surface bounding the input device. Also, the device is often purposely constrained in such a way that the control space is reduced to a limited extent in one or two dimensions only. Action is mostly perceived to take place “right here” in motor space.
Each of the four private spaces mentioned above is a bounded and changing subspace of the same public physical space shared by the user, the computer, and things in their environment. The private spaces may overlap to varying degrees at different times. Two spatial intersections strictly needed for HCI have already been alluded to above, but others are optional. For example, on the human side, we may see what is out of reach or move what is not visible. On the computer side, the display and control spaces may be disjoint (e.g. screen and mouse), they may overlap partially (e.g. split screen) or they may overlap completely (e.g. touch screen).
Motor and Visual Advantages
The healthy human body has very specific and limited motor abilities, of interest to the student of physical ergonomics. These limitations affect the possible speed and extent of movement, the amount of force that can be exerted, the number of repetitions that can be performed, the precision of control, etc. The designer of a tool must adapt its control surfaces and methods to human abilities, not only to enable effective and intuitive use, but also to lower the risk of error and to counter fatigue.
A tool may provide its user with a motor advantage for a certain task, by changing some aspect of the movement required or the properties of a control surface. Levers, for example, are widely used in cutting tools to provide the mechanical motor advantage of lowering the required force. Any such advantage comes at a cost, in this case that of having to enlarge the extent of movement in proportion to the amount of leverage provided. In addition, what counts as a motor advantage depends on the user and the details of the task. In eye surgery, the enlarged extent of movement required by a lever may be exactly what is desired for more precision in performance, since precision is more problematic in this case than force. Motor advantage may be fixed (e.g. scissors), adjustable (e.g. bicycle gears) or variable (e.g. continuously variable transmission).
Human perceptual limitations are also important for the design and use of tools. A visual advantage may be provided for a certain task, by changing what is visible about the tool and its use. Optical lenses are employed in many tasks to provide the visual advantage of magnification or better focus. Again the advantage has a cost: in the case of magnification, a diminished field of view. In addition, a tool may provide the visual advantage of making its affordances visible.
Motor and visual advantages are often closely associated and they may affect each other via feedback. But they are not the same and they should be considered independently. A watchmaker using a magnifying glass for enlargement does not gain a direct motor advantage from it, because it stretches the display space but not the control space. Indirectly of course, better visual feedback may improve performance, but only to the extent that closing the control loop enables more accurate movement already within the motor ability of the watchmaker.
Dynamic Allocation of the Available Functional Space
We assume that the computer has no direct control over the private human motor and visual spaces, which may be shifted and directed (if not extended) at will by their owner. The private computer control and display spaces are therefore the only functional spaces whose use may be optimized for interaction by the system architect, by creating motor and visual advantages.
The allocation of portions of the available display space to computer display elements (conferring visual advantage), and of portions of the available control space to computer control functions (conferring motor advantage) is an important and recurring problem of HCI design. The usable extent of both spaces is limited, while the amount of content accessible or creatable on many devices is so large as to be practically infinite. Thus, no exhaustive, fixed allocation of space is possible—the allocation has to be dynamic: either discretely adjustable or continuously variable.
Many discrete and continuous ways have been invented to perform linear dynamic computer space allocation in two dimensions, including virtual paging, scrolling, zooming, panning, and rotation. Non-linear space allocation like distortion has also been used. Navigation of a virtual three-dimensional space amounts to a linear method as well, and non-linear allocation is possible in three dimensions too.
A different approach to solving the dynamic space allocation problem is based on hierarchical methods. In two dimensions this includes the use of potentially overlapping windows that may be displayed and navigated recursively. Other traversal methods for large hierarchical data sets include ring trees, cone trees, beam trees, botanical trees and hyperbolic browsing.
The C-D Function
The dynamic space allocation methods are ordinarily designed to change the interactive use of display space and control space in exactly the same way, modifying their allocation in tandem over time. The tight coordination between the spaces is generally sufficient for the user to maintain a coherent mental picture and grasp of the interaction that is in progress. This human-experienced connection between computer control and display is the result of potentially very complex processing by the computer in converting input to output. Interactive aspects of the input/output relation are commonly measured by the control-display or C-D function. It is similar to mechanical leverage and is useful in the description of as simple a device as an oscilloscope, where the C-D gain or ratio is a discretely adjustable amplification factor.
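The simplest case, a constant and discretely adjustable C-D gain, may be sketched as follows. This is an illustration under stated assumptions rather than a description of any particular system: the Pointer class, its field names and the one-dimensional display extent are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Pointer:
    """One-dimensional pointer driven through a constant C-D gain."""
    position: float = 0.0   # pointer position in display space
    gain: float = 1.0       # display displacement per unit of control displacement
    extent: float = 1000.0  # assumed size of the display space

    def move(self, control_delta: float) -> float:
        """Apply a control-space displacement and return the new display position."""
        target = self.position + self.gain * control_delta
        # Keep the pointer inside the available display space.
        self.position = min(max(target, 0.0), self.extent)
        return self.position


if __name__ == "__main__":
    pointer = Pointer(position=100.0, gain=2.0)
    print(pointer.move(10.0))   # same hand movement, larger display movement
    pointer.gain = 0.5          # discrete adjustment, as on an oscilloscope range switch
    print(pointer.move(-40.0))
```

A gain above one trades precision for reach and a gain below one does the opposite, which is the analogy with mechanical leverage drawn above.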
It is not known exactly what constitutes a good or optimal C-D function. It clearly depends on the user, and probably on the task as well. Proportionality is sufficient in many cases, but is it necessary? Should the function be constrained to be an isomorphism between control space and display space? How much delay can the user tolerate (>50 ms)? Can the C-D function change with time? Are discontinuities always jarring? Early views were that the C-D function had to be simple in order not to confuse the user, but it turns out that users may tolerate and benefit from a substantial amount of non-linearity and hysteresis.
Dynamic space allocation and designing a good C-D function are related, and they may simply be two perspectives on the same problem.
Linking and Separation of Control and Display
The pointer and icons of the GUI are all objects in the computer's two-dimensional display space. The pointer object is a visual aid for the user to identify a single point in display space, namely the pointer position or user point. It is meant to be the visual focal point of the user, but it also represents the motor locus of action in the simplified world of the GUI, where action is controlled by a single point. The GUI act of icon selection, for example, is made dependent on a relation which is made apparent in display space, namely, which icon's area contains the pointer position at the time of a click. Both the making dependent and the making apparent are achieved by computer processing, and they may be regarded as different aspects of one complex C-D function.
The pointer position is the point in display space where the GUI's control and display operations are most tightly coupled. By changing the C-D function, this coupling may be relaxed at other points, preferably without incurring a disastrous disconnect between the user and the computer. We therefore consider the possibilities of the separation of computer control and display spaces, parallel to the separation of user motor and visual advantages. The former separation will be referred to as control-display decoupling.
Dynamic control-display decoupling may be achieved e.g. by applying a distortion to the display, and making the distortion move with the user point. From the user's point of view, dynamic control-display decoupling may be detected in the following way. Even though an object on the display can be seen to include some particular point at one time, if the user point is moved to that point (without any clicking or dragging), the object no longer contains the point in question at the time the user point reaches it. An example of the application of dynamic control-display decoupling may be found in the Mac OS Dock in its animated version.
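The detection test described above may be sketched in one dimension as follows. The sketch assumes a bar of equally sized icons and a fisheye distortion of the form G(d) = (m + 1)d / (md + 1), applied to the normalised distance d from the focus. That form is chosen here purely for illustration and is not asserted to be the function used by any actual product. All names and parameter values are hypothetical.

```python
def fisheye(x: float, focus: float, length: float, distortion: float = 3.0) -> float:
    """Map coordinate x on the bar [0, length] to its displayed coordinate.

    The focus is a fixed point, the region around it is magnified and the
    ends of the bar stay in place. distortion = 0 gives the identity map.
    """
    if x == focus:
        return focus
    # Distance from the focus to the bar end on the side where x lies.
    span = focus if x < focus else length - focus
    if span <= 0:
        return focus
    d = abs(x - focus) / span
    g = (distortion + 1.0) * d / (distortion * d + 1.0)
    return focus + g * span if x > focus else focus - g * span


def displayed_interval(k, width, focus, length, distortion=3.0):
    """Displayed extent of icon k while the distortion is centred on focus."""
    return (fisheye(k * width, focus, length, distortion),
            fisheye((k + 1) * width, focus, length, distortion))


def decoupling_detected(k, point, focus_before, width, length, distortion=3.0):
    """True if icon k is seen to contain `point` while the user point is at
    focus_before, but no longer contains it once the user point is moved there."""
    lo, hi = displayed_interval(k, width, focus_before, length, distortion)
    seen_before = lo <= point <= hi
    lo, hi = displayed_interval(k, width, point, length, distortion)
    seen_after = lo <= point <= hi
    return seen_before and not seen_after


if __name__ == "__main__":
    # Ten icons of width 10 on a bar of length 100, user point at 55.
    print(decoupling_detected(5, 45.0, 55.0, 10.0, 100.0))
    # With no distortion the test does not fire.
    print(decoupling_detected(5, 55.0, 52.0, 10.0, 100.0, distortion=0.0))
```

In the first call, icon 5 appears stretched over the point 45 while the user point rests at 55, yet it has moved away from that point by the time the user point arrives there.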
Intuitive thinking about objects in display space can be misleading about interaction. When the size of an object increases in display space, it is easy to assume that there is more control space available to manipulate it as well. We saw that this is not even true when looking at real objects through a magnifying glass. The relevant human movement takes place in the unchanged control space, not in the enlarged display space. But in the case of HCI it may be worse, because the C-D function can take almost any causal form. The two spaces are connected only by the relative physical arrangement of the input and output devices, and some computer processing. Coordination in allocation of the spaces is synthetic in the virtual world, not automatic as it would be in the (undistorted) real world.
The potentially puzzling failure of the animated Mac OS Dock to derive motor advantage from non-linear distortion may be understood in these terms. The task is selection of one among many icons on a bar of fixed length, and the method includes enlarging the display size of the icons closest to the pointer, while decreasing the size of those further away to make room. It is an application of a one-dimensional fish-eye distortion to display space, where the distortion function is centred on the pointer position and is moved around with the pointer. This creates a continuously variable visual advantage for the user, but no direct motor advantage. The method is comparable to using a magnifying glass on display space while the action proceeds in the unchanged control space.
The pointer position itself is not affected by the distortion, because the pointer position is always made the fixed point of the distortion function, even as it moves. The display space thus remains undistorted at the focal point, and consequently the icon overlapping the pointer position is also undistorted at that point. The distortion therefore never changes the decisive spatial relation between pointer and icons at the locus of action, which is the only point that matters for motor advantage. It is thus possible to have dynamic control-display decoupling without any motor advantage.
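The argument may be checked numerically with a minimal sketch. It assumes a bar of equally sized icons and an illustrative fisheye function of the form G(d) = (m + 1)d / (md + 1) with the pointer position as its fixed point. The function is an assumption made for the example and is not taken from any actual implementation. All names are hypothetical. Because the mapping is monotonic and leaves the pointer position in place, a hit test performed on the distorted display selects exactly the icon that the undistorted layout would select.

```python
def fisheye(x: float, focus: float, length: float, distortion: float = 3.0) -> float:
    """Displayed coordinate of x on the bar [0, length]; focus is a fixed point."""
    if x == focus:
        return focus
    span = focus if x < focus else length - focus
    if span <= 0:
        return focus
    d = abs(x - focus) / span
    g = (distortion + 1.0) * d / (distortion * d + 1.0)
    return focus + g * span if x > focus else focus - g * span


def icon_hit_in_display(pointer: float, count: int, width: float) -> int:
    """Index of the icon whose displayed area contains the pointer, or -1."""
    length = count * width
    for k in range(count):
        lo = fisheye(k * width, pointer, length)
        hi = fisheye((k + 1) * width, pointer, length)
        if lo <= pointer < hi:
            return k
    return -1


def icon_hit_in_control(pointer: float, width: float) -> int:
    """Index of the icon under the pointer in the undistorted layout."""
    return int(pointer // width)


def displayed_width(k: int, pointer: float, count: int, width: float) -> float:
    """Displayed width of icon k while the distortion is centred on the pointer."""
    length = count * width
    return (fisheye((k + 1) * width, pointer, length)
            - fisheye(k * width, pointer, length))


if __name__ == "__main__":
    for p in (5.0, 12.5, 55.0, 87.3, 99.0):
        print(p, icon_hit_in_display(p, 10, 10.0), icon_hit_in_control(p, 10.0))
    # The icon under the pointer looks wider, but the pointer travel needed
    # to leave it is still governed by its undistorted width of 10.
    print(displayed_width(5, 55.0, 10, 10.0))
```

The displayed width of the focused icon grows with the distortion, yet the range of pointer positions that select it remains its undistorted width, so the visual advantage is not accompanied by a motor advantage.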
This remains true even though icon areas and pointer position may all continually change due to changing user input, whether in response to visual feedback or not. The changes in locus of action and distortion are correlated, because both are determined by the same pointer position, but that does not imply causation either way.
It would be easy to define a distortion function that displaced the pointer. However, positive feedback and instability would result if the pointer were displaced while the link between focal point and locus of action were kept intact. User sense of control may be lost, and it is not clear whether any motor advantage can be achieved in this way. On the other hand, the mentioned link seems essential for maintaining coherent interaction, at least in the GUI style.
Prior Art in Creating HCI Motor Advantage
Allocation of more control space to a computer control function gives the user a certain motor advantage in using that function. For example, making a button larger in control space can make its selection quicker and easier. The added space would normally need to be contiguous with the function's current control space to make sense. With a finite control space, the added space has to be either previously unallocated or taken away from another function. Linear methods like zooming and panning of the coupled control and display spaces may be the simplest and most intuitive way to gain an increase in control space for the items remaining or newly appearing on the display. However, treating the control and display spaces as separate or separable opens up new ways of gaining motor advantage via control-display decoupling.
Dealing with the decoupling of display and control may be difficult at first, but it offers the opportunity of doing things that would otherwise be impossible. We will survey the state of the art in creating motor advantage, and then propose new improved methods.
Various types of motor advantage have been obtained where the control and display spaces are decoupled, and they all seem to involve memory or hysteresis of some kind. The current state of the art appears limited to the following types: (i) Time-dependent non-linear C-D functions, such as pointer acceleration based on speed, on distance traveled in control space or on time elapsed since the movement started. (ii) Semantic Pointing, which consists of adapting the C-D function in a discrete way, based on computer-internal feedback from the display space, indicating every time that the pointer has crossed a boundary between two regions whose meanings are distinct in some way. (iii) Object Pointing, which treats regions of display space not allocated to any object as something to be skipped by the pointer, without the need for a corresponding control movement. (iv) An improvement on the Mac OS Dock, where the direction of motion of the cursor on entry into the bar is used to determine the focus point of the regular fish-eye distortion, which distortion is then held fixed for a second or so, after which the motor advantage lapses again.
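Type (i) may be illustrated by a minimal sketch of pointer acceleration based on speed. The piecewise-linear gain curve, the function names and every threshold and gain value below are assumptions made for the example. Actual systems use their own curves, which are not reproduced here.

```python
def speed_dependent_gain(speed: float,
                         low_speed: float = 100.0,
                         high_speed: float = 500.0,
                         min_gain: float = 1.0,
                         max_gain: float = 4.0) -> float:
    """C-D gain as a function of control-space speed (units per second).

    Slow movements get the minimum gain for precision, fast movements get
    the maximum gain for reach, with linear interpolation in between.
    """
    if speed <= low_speed:
        return min_gain
    if speed >= high_speed:
        return max_gain
    fraction = (speed - low_speed) / (high_speed - low_speed)
    return min_gain + fraction * (max_gain - min_gain)


def display_delta(control_delta: float, dt: float) -> float:
    """Display displacement for a control displacement made during dt seconds."""
    if dt <= 0:
        raise ValueError("dt must be > 0")
    speed = abs(control_delta) / dt
    return speed_dependent_gain(speed) * control_delta


if __name__ == "__main__":
    print(display_delta(10.0, 0.5))    # slow movement, minimum gain
    print(display_delta(150.0, 0.5))   # faster movement, interpolated gain
```

The same control displacement thus produces different display displacements depending on how quickly it is made, which is why such a C-D function is time-dependent.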
Many methods have been developed to make the size-distance trade-off that redirects the motor advantage, or to use time to have the same space serve different objects. The most straightforward of these are variations on scrolling, where the size remains fixed, and objects lost on one side are replaced by others gained on the opposite side. Paging may be viewed as a discrete way of scrolling. During zooming, a single magnification factor is applied around a focus point as anchor, which changes the size of the objects but loses those on the periphery as they move out of view. These are the linear options.
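The linear options may be sketched in one dimension as follows, assuming that scrolling is a pure translation and zooming a uniform scaling about a focus point, with the view limited to a fixed extent. The function names and the sample layout are hypothetical.

```python
def scroll(x: float, offset: float) -> float:
    """Scrolling: translate every coordinate by the same offset."""
    return x - offset


def zoom(x: float, focus: float, magnification: float) -> float:
    """Zooming: scale distances from the focus, which stays in place."""
    return focus + magnification * (x - focus)


def visible(positions, extent: float):
    """Positions that remain inside the view [0, extent]."""
    return [p for p in positions if 0.0 <= p <= extent]


if __name__ == "__main__":
    centres = [5.0 + 10.0 * k for k in range(10)]   # ten objects on a view of extent 100
    zoomed = [zoom(c, 50.0, 2.0) for c in centres]
    scrolled = [scroll(c, 30.0) for c in centres]
    print(len(visible(zoomed, 100.0)), "objects remain after zooming")
    print(len(visible(scrolled, 100.0)), "objects remain after scrolling")
```

In both cases objects leave the view at the periphery. With scrolling, a larger data set would supply replacements on the opposite side, which this small example does not model.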
Non-linear strategies are known as distortion or fisheye views, where variable magnification (and compression) is used to make the trade-off. Simply providing visual magnification does not equate to greater ease of selection, however, as can be seen in the Mac OS Dock. Navigation through a distorted view is called focus targeting, and it presents the difficulty that apparent movement of objects due to moving the focus reaches a maximum at the focus point, where the magnification is the largest. This is another guise in which the size-distance trade-off appears.
Recently, methods have been developed to interactively control the trade-off, as in the technique called OrthoZoom scrolling and in speed-dependent auto-zoom (SDAZ). These methods improve the speed and ease of interaction.
We thus find in the art the following approaches to the size-distance trade-off:
(a) Distortion methods that implement control-display decoupling, but provide no pointing or motor advantage, because only the display is distorted (Mac OS Dock, animated).
(b) Methods based on the linear transformations of translation and scaling using various types of control, which provide the shifting of the pointing advantage (scrolling, paging, zooming, space traversal, such as dasher) and a variable amount of pointing advantage (OrthoZoom Scroller, Zoomslider). These methods do not include control-display decoupling.
(c) Distortion methods that are speed-controlled (speed-dependent auto-zoom) or time-controlled (thweel).
But there appears to be no method that provides all of the following:
(i) Position dependent control.
(ii) A variable pointing advantage under direct user control.
(iii) The context of a distortion based GUI with control-display decoupling.
The potential advantages of such methods include increased interaction speed and efficiency, interaction with a large number of objects at the same time, and improved user satisfaction.
There thus remains a need in the art for a method of position dependent control of the trade-off between rate of navigation and ease of acquisition in a distortion based graphical user interface. The current invention addresses that need by giving the user dynamic control over the layout of the GUI.