The present invention deals with generating content, such as help content. More specifically, the present invention deals with automatic capturing of images indicative of actions of a user on a user interface.
The Graphical User Interface (GUI) is a widely used interface mechanism. GUI's are very good for positioning tasks (e.g. resizing a rectangle), visual modifier tasks (e.g. making something an indescribable shade of blue) or selection tasks (e.g. this is the one of a hundred pictures I want rotated). The GUI is also good for speedy access to quick single step features. An application's GUI is a useful toolbox that is organized from a functional perspective (e.g. organized into menus, toolbars, etc) rather than a task oriented perspective (e.g. organized by higher level tasks that users want to do, such as “make my computer secure against hackers”).
However, GUIs present many problems to the user as well. Using the toolbox analogy, a user has difficulty finding the tools in the box or figuring out how to use the tools to complete a task composed of multiple steps. An interface described by single words, tiny buttons and tabs forced into an opaque hierarchy does not lend itself to the way people think about their tasks. The GUI requires the user to decompose the tasks in order to determine what elements are necessary to accomplish the task. This requirement leads to complexity. Aside from complexity, it takes time to assemble GUI elements (i.e. menu clicks, dialog clicks, etc). This can be inefficient and time consuming even for expert users.
One existing mechanism for addressing GUI problems is a written help procedure. Help procedures often take the form of Help documents, PSS (Product support services) KB (Knowledge base) articles, and newsgroup posts, which fill the gap between customer needs and GUI problems. They are analogous to the manual that comes with the toolbox, and have many benefits. These benefits include, by way of example:                1) Technically speaking, they are relatively easy to author even for non-technical authors;        2) They are easy to update on a server so connected users have easy access to new content; and        3) They teach the GUI thereby putting users in control of solving problems.        
However, Help documents, PSS KB articles and newsgroups have their own set of problems. These problems include, by way of example:                1) Complex tasks require a great deal of processing on the user's part. The user needs to do the mapping from what is said in each step to the GUI. This can lead to errors in that steps are skipped, described incorrectly or inadequately or are described out of order.        2) Troubleshooters, and even procedural help documents, often include state information that creates complex branches within the help topic, making topics long and hard to read and process by the end user. Toolbars may be missing, and may need to be turned on before the next step can be taken. Troubleshooters often ask questions about a state that is at best frustrating (because the troubleshooter should be able to find the answer itself) and at worst unanswerable by non-experts.        3) There are millions of documents, and searching for answers involves both a problem of where to start the search, and then how to pick the best search result from the thousands returned.        4) There is no shared authoring structure. Newsgroup posts, KB articles, troubleshooters and procedural Help documents all have different structures and authoring strategies, yet they are all solving similar problems.        5) For a user, it is simply difficult to read step-by-step text, and then visually search the UI for the element being described and take the action described with respect to that element.        
Another existing mechanism for addressing GUI problems is a Wizard. Wizards were created to address the weaknesses of GUI and written help procedures. There are now thousands of wizards, and these wizards can be found in almost every software product that is manufactured. This is because wizards solve a real need currently not addressed by existing text based help and assistance. They allow users to access functionality in a task-oriented way and can assemble the GUI or tools automatically. Wizards allow a program manager and developer a means for addressing customer tasks. They are like the expert in the box stepping the user through the necessary steps for task success. Some wizards help customers setup a system (e.g. Setup Wizards), some wizards include content with features and help customers create content (e.g. Newsletter Wizards or PowerPoint's AutoContent Wizard), and some wizards help customers diagnose and solve problems (e.g. Troubleshooters).
Wizards provide many benefits to the user. Some of the benefits of wizards are that:                1) Wizards can embody the notion of a “task.” It is usually clear to the user what the wizard is helping them accomplish. With step-by-step pages, it can be easy for a user to make choices, and in the case of well designed wizards the incidence of the user becoming visually overwhelmed is often reduced.        2) Wizards can automatically assemble and interact with the underlying features of the software and include the information or expertise needed for customers to make choices. This saves the user time in executing the task.        3) Wizards can automatically generate content and can save users time by creating text and planning layout.        4) Wizards are also a good means for asking questions, getting responses and branching to the most relevant next question or feature.        
However, wizards too, have their own set problems. Some of these problems include, there are many more tasks that people try to accomplish than there are wizards for accomplishing them. Wizards and IUI (Inductive User Interfaces) do not teach customers how to use underlying GUI and often when the Wizard is completed, users are unsure of where to go next. The cost of authoring of wizards is still high and requires personnel with technical expertise (e.g. software developers) to author the Wizard.
Thus, authoring all of these types of content that describe procedures to be taken by a user, is often error prone. It is quite easy to miss steps, to describe steps incorrectly, or to lose track of what step is currently being described in a long sequence of UI manipulations. However, this written procedural help content is extremely common. Such help content often ships with products, on-line help content is provided for product support teams, and procedures inside companies are often documented in this way for specific business processes. Thus, this type of information is difficult to author and often contains errors.
In addition, end users must typically follow the steps that have been authored. It can be difficult to read step-by-step text, and then search the UI for the particular control element being described and then to take the proper action with respect to that control element. It has been found that many users find this such a burden that they simply scan the first one or two steps of the text, and then try their best to determine which UI elements need to be actuated next, barely referring back to the written text steps. It has also been found that the eye can find and recognize pictures much more easily than it can read a word, mentally convert the word into a picture, and then find the corresponding UI control element. Yet, in the past, this is exactly what was done, as an author must painstakingly take screenshots of each step, crop the images, and paste them into a document in the right place, in order to have any type of visual depiction of an action to be taken.