When software programs are designed and developed, some effort is typically made to make the resulting program easy and convenient to use. For example, a program should be logically arranged and configured, from the user's perspective, so that once the program is released the user's time can be spent primarily on working with the program. This quality is generally referred to as the usability of the program, and high usability is generally regarded as a desirable feature.
It is, however, not practically possible to evaluate a program's usability completely during the development stage. While experience with other software programs or earlier versions of the same program may give valuable indications of the best design, the final product typically has some characteristics (good or bad) that were not foreseeable at earlier stages. For this reason, it is common to perform test sessions in which users try the proposed software product by performing one or more of its predefined tasks. Typically, such test sessions are monitored by test personnel, or automatically by recording equipment, so that the test session can later be evaluated. The software product may also be tested to determine whether it performs the functions that it was created to perform and whether it operates reliably. Usability testing is typically performed after such more basic testing has been completed.
During the test sessions, users interact with the software program using any or all of the equipment that is commonly associated with computers. For example, the software may require the user to enter information on a keyboard, manipulate screen items using a mouse or other pointer device, and read information on a graphical user interface (GUI) presented on a display device. Each of these operations takes a finite amount of time. Thus, the time taken by any user to complete the test session may be seen as an accumulation of time spent reading the GUI, entering text with the keyboard, using the mouse, moving the hand between the keyboard and the mouse, talking with the test instructor or receiving instructions, etc. Of these categories, the time spent talking with the test instructor or receiving instructions is to be subtracted from the total measured time so that it does not skew the test results. The remaining categories represent time that may be a relevant indicator of usability.
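As a minimal sketch of this accounting, the following Python fragment accumulates measured time per category and excludes the time spent on instructions from the total. The category names, the `EXCLUDED` set, and the `summarize` function are hypothetical and chosen only for illustration; the durations are invented sample values, not test results.

```python
from collections import defaultdict

# Hypothetical category names; time in these categories is not counted
# toward the usability-relevant total.
EXCLUDED = {"instruction"}


def summarize(events):
    """Accumulate seconds per category and return (totals, relevant_seconds).

    `events` is an iterable of (category, seconds) pairs recorded
    during one test session.
    """
    totals = defaultdict(float)
    for category, seconds in events:
        totals[category] += seconds
    # Sum every category except those marked as excluded.
    relevant = sum(s for c, s in totals.items() if c not in EXCLUDED)
    return dict(totals), relevant


# Invented sample session, for illustration only.
session = [
    ("reading", 4.0),
    ("keyboard", 2.5),
    ("instruction", 3.0),
    ("mouse", 1.5),
    ("keyboard", 0.5),
]
totals, relevant = summarize(session)
print(totals)
print(relevant)
```

Keeping the per-category totals alongside the reduced total allows the excluded time to be inspected later rather than discarded.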
Predictive models for evaluating software exist. One such model is the so-called Goals, Operators, Methods and Selection rules (GOMS) model, which allows calculation of a “theoretical” time for performing a given sequence of inputs. That is, the GOMS model assigns a fixed time value to each of many operations that the user can perform, from a single keystroke or mouse click to moving the hand between the keyboard and the mouse. If information defining a predetermined sequence of keystrokes, hand movements and mouse operations is applied to such a GOMS model, the model can generate a theoretical prediction of the time required for a user to perform the inputs. The outputs generated by existing solutions for testing software products do not integrate these predictions with the test results. Also, entering into the predictive model the information that defines the predetermined sequence of inputs executed by the user is typically a complicated procedure.
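The calculation performed by such a model can be sketched as a sum of fixed per-operation times over the input sequence. The Python fragment below is an illustration only: the operator codes, the `OPERATOR_SECONDS` table, and the `predicted_time` function are hypothetical, and the time values are assumed placeholders rather than values taken from any particular GOMS implementation.

```python
# Assumed, illustrative time per operation in seconds (placeholders only).
OPERATOR_SECONDS = {
    "K": 0.28,  # single keystroke
    "C": 0.20,  # mouse click
    "P": 1.10,  # pointing at a screen item with the mouse
    "H": 0.40,  # moving the hand between the keyboard and the mouse
}


def predicted_time(sequence):
    """Return the theoretical time, in seconds, for a sequence of operator codes.

    Raises KeyError if the sequence contains a code that has no assigned time.
    """
    return sum(OPERATOR_SECONDS[op] for op in sequence)


# Hypothetical task: reach for the mouse, point at a field, click it,
# return the hand to the keyboard, then type three characters.
task = ["H", "P", "C", "H", "K", "K", "K"]
print(f"{predicted_time(task):.2f} s")
```

A prediction obtained in this way could then be set against the time actually measured for the same sequence of inputs in a test session.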