Modern computer systems are often equipped with speech recognition technology. In a typical system, a computer user opens a graphical application window and starts typing or dictating input such as text, numerical information or drawing. In order to ensure accuracy, the user often starts organizing information and checking the facts. The user may open files on his or her computer or perform searches on the Internet, read and page through the information obtained, and follow relevant links by clicking on them. As the computer user writes the document, he or she will toggle between the various graphical windows and page through them, trying to find and check information. While working, the user may also refer to the help provided by his or her application to assist in formatting the work.
However, as the user opens other application windows, the first or primary graphical application window becomes partially or totally obscured by them. The user may use the mouse or other pointing device to position and size some or all of these application windows, but every time he or she opens another application, the windows need to be resized again. Viewing information in various files and copying and pasting it into a working application window often becomes awkward and frustrating as each window obscures other windows.
Another issue with current windowing systems is that the user must explicitly switch the “input focus.” A window is said to have “focus” when it is active and currently designated to receive the user input from the keyboard, mouse, or from a voice recognition system. A computer user must switch back and forth between the first application window and the other application windows, in order to control or manipulate the activity or screen displayed in any graphical application window. Whether switching the focus is achieved manually (e.g., by keyboard or mouse) or by oral command, switching focus back and forth is a nuisance and easily creates frustration for the user.
For example, suppose a user has his or her primary graphical application window and a second application window displayed simultaneously on a computer screen, each in a separate window, with input focus in the first application window. When the user needs to page down in the second application window's displayed text, the user typically must move the mouse cursor to the second application window's display, click the mouse to move input focus, press the page down key, move the mouse cursor back to the original location in the first application window, and finally click the mouse to restore the input focus. Apart from the page down keystroke itself, this requires four conscious steps that demand some dexterity to perform smoothly, thereby interrupting the user's attention. It is also very common for the user to forget to move input focus back to the first application window until reminded by the failure of the first application to respond to new keyboard input, causing frustration and lost time.
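The focus round trip described above can be illustrated with a toy model. The following is only a sketch under stated assumptions: the `Window` and `Desktop` classes and the `page_down_in_other_window` helper are hypothetical names invented for illustration and do not correspond to any real windowing system's API. The model assumes that keystrokes go only to the focused window and that a click moves focus to the window under the pointer.

```python
from dataclasses import dataclass, field


@dataclass
class Window:
    name: str
    scroll_page: int = 0  # how many pages down the window has been scrolled


@dataclass
class Desktop:
    """Hypothetical desktop: only the focused window receives keystrokes."""
    windows: dict[str, Window]
    focused: str   # name of the window that currently has input focus
    pointer: str   # name of the window under the mouse cursor
    actions: list[str] = field(default_factory=list)

    def move_mouse_to(self, name: str) -> None:
        self.pointer = name
        self.actions.append(f"move mouse to {name}")

    def click(self) -> None:
        # Assumption: clicking gives focus to the window under the pointer.
        self.focused = self.pointer
        self.actions.append(f"click to focus {self.pointer}")

    def press_page_down(self) -> None:
        # The keystroke is delivered to whichever window has focus.
        self.windows[self.focused].scroll_page += 1
        self.actions.append(f"page down in {self.focused}")


def page_down_in_other_window(desktop: Desktop, target: str) -> list[str]:
    """Page down in `target`, then restore focus to the original window."""
    original = desktop.focused
    desktop.move_mouse_to(target)    # overhead
    desktop.click()                  # overhead: move focus
    desktop.press_page_down()        # the only action the user wanted
    desktop.move_mouse_to(original)  # overhead
    desktop.click()                  # overhead: restore focus
    return desktop.actions
```

In this model, omitting the last two calls leaves focus in the second window, so the next keystroke intended for the first application is delivered to the wrong window, which is the forgotten-focus failure described above.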
Another source of frustration and difficulty in modern graphical user interfaces (GUIs) is overlapping windows. The obstruction of a user's view of his or her primary graphical application window by overlapping windows, and the loss of focus and attention from the first application window, are issues that continue to plague the prior art. Many methods have been developed to address these difficulties. For instance, U.S. Pat. No. 5,715,415 (the '415 patent) provides users with “automatic tiling” and retention of focus in the application as users click into the application's built-in help system. However, the '415 patent is limited to working only with window panes and with built-in help systems. The '415 patent additionally shares another weakness with other help systems. Virtually all modern help systems provide hyperlinks to web pages, so that their users can find more complete and recent information. Typically, as soon as a computer user clicks on a hyperlink to access the web, an Internet application window such as Microsoft's Internet Explorer opens, partially or totally obscuring the user's first application and causing the user to lose focus from his or her graphical application window.
U.S. Patent Application No. 2002/0130,895 (the '895 patent application) partially solves this problem by simultaneously displaying portions of a web page next to help information the user requested. However, the '895 patent application only works with help information and still causes the user to lose focus from his or her first application. The user is again confronted with the inconsistencies and frustrations that accompany the shifting of computer focus between the first application and the content. These problems are exacerbated by the frequency with which the user needs to click into the Internet application to follow links.
U.S. Pat. No. 5,699,486 (the '486 patent) takes an alternative approach that provides help content via visual help or aural speech. If the user requests help via aural speech, the '486 patent makes it possible for the computer user to receive help while simultaneously viewing his or her first application window. A limitation of the '486 patent approach is that aural speech simply disappears after being spoken and must be remembered by the user. The '486 patent has the additional limitation that help for complex situations is much harder to understand, and often fails to result in a useful action, when received aurally rather than arrayed on a display screen. The '486 patent does not provide a method to prevent the user from losing focus every time the user needs to click a hypertext link to play back the requested help. Also, although the '486 patent does allow a user to view help as well as hear it, when the help is displayed visually, the '486 patent offers no method to prevent the user's primary application from being obstructed by the help displayed. The '486 patent is further limited in that it only works with help systems.
Typical prior art voice recognition technology merely substitutes voice commands for keyboard and mouse commands and has not solved these issues of focus and obstruction of the computer user's view of his or her first application, often making it more difficult for a user to control his or her application windows by voice than by using a mouse or keyboard. Specifically, if a computer user of voice recognition software wants to display a help system window, Internet application window or other information content next to his or her working application window so that the windows don't obscure each other's content, the user would have to individually position and size each window by issuing many voice commands to grab the borders of the windows and move and drag them by voice.
A user of prior art voice recognition software working in an application window needs to issue three or four commands to follow links in a secondary graphical application window displaying Internet or other content. For example:
First, the user would say, “Switch to next window” and the voice recognition engine will cause the windowing system to switch to the next window.
Second, the user would say, “Products” and the voice recognition engine will cause the HTML application window to follow any link that is named “Products” or has “products” in its name, e.g., “New Products.” But if there are two or more links on a displayed page that contain the spoken name, the speech recognition product will number or otherwise uniquely identify the links, forcing the user to issue another command to identify the link he or she wants, for instance:
Third, the user would say, “Choose two.” As an alternative to these last two commands, at least one current voice product allows a user to issue a command to number all the links on a displayed page and then to say the number of the link desired.
And fourth, the user would say, “Switch to next window” and the computer user will be returned to his or her first application window.
Similarly, if a user of prior art voice recognition software wants to switch to another open application on his or her desktop to copy text into the user's first application, it takes at least five commands. For instance:
First, the user would say, “Switch to Internet Explorer” and the voice recognition engine might cause the windowing system to switch to the next window. But it may not. Switching between windows by voice is problematic with prior art voice recognition systems when there are many application windows open on the desktop. If there are several copies of Internet Explorer open, the computer user needs to dictate “Switch to” followed by the exact name of the document, e.g., “United States Patent and Trademark Office Home Page.” But how is the user expected to say the name of the document when he or she cannot see it, because the document names are partially or completely hidden on the taskbar when several application windows are open?
Second, the user would dictate, “Select united through page,” and this would cause the voice recognition system to select text on the page.
Third, the user would dictate, “Copy that.”
Fourth, the user would say, “Switch to Previous Window” and the computer user will be returned to his or her first application window.
And fifth, the user would dictate, “Paste that” and the previously copied text would be pasted at the cursor position.
Prior art voice recognition help systems also use traditional help with hide/show panes and search indexes. Thus, they share the weaknesses of other prior art: they cover up part or all of the computer user's application windows, they force the user to switch focus back and forth between the help system and his or her application, and they force the computer user to go through a labyrinth of menus and submenus. For example, to access help about formatting, the user needs to follow these steps:
Dictate “What can I say?” or press “F1”
Dictate or click “Show”
Dictate or type in the word(s) to search for: “format text”
Dictate or click “Select topic”
Dictate “Move down 15” (or, more realistically, dictate “Move down 20,” then “Move up 5”) to move the cursor down 15 lines.
Dictate or click “Hide”
Dictate “Switch to Next Window” (to return to his or her original application) or click on the original application
What is needed is a system that overcomes these shortcomings of the prior art.