The 1990s have been marked by a technological revolution driven by the convergence of the data processing industry with the consumer electronics industry. This advance has been further accelerated by the extensive consumer and business involvement in the Internet over the past few years. As a result of these changes, it seems as if virtually all aspects of human endeavor in the industrialized world require human/computer interfaces. There is a need to make computer-directed activities accessible to people who, up to a few years ago, were computer illiterate or, at best, computer indifferent.
Thus, there is continuing demand for interfaces to computers and networks which make it easier for the interactive user to access functions and data from the computer. With desktop-like interfaces including windows and icons, as well as three-dimensional virtual reality simulating interfaces, the computer industry has been working hard to fulfill such interface needs by making human/computer interfaces more user friendly, bringing them closer and closer to real-world interfaces, e.g. human/human interfaces. In such an environment, it would be expected that speaking to the computer in natural language would be a very natural way of interfacing with the computer, even for novice users. Despite these potential advantages of speech recognition computer interfaces, this technology has been relatively slow in gaining extensive user acceptance.
Speech recognition technology has been available for over twenty years, but only recently has it begun to find commercial acceptance, particularly with speech dictation or "speech to text" systems such as those marketed by International Business Machines Corporation (IBM) and Dragon Systems. Development of that aspect of the technology is now expected to accelerate until it has a substantial niche in the word processing market. On the other hand, a more universal application of speech recognition input to computers, which is still behind expectations in user acceptance, is in command and control technology wherein, for example, a user may navigate through a computer system's graphical user interface (GUI) by speaking the commands which are customarily found in the system's menu text, icons, labels, buttons, etc.
Many of the deficiencies in speech recognition, both in word processing and in command technologies, stem from inherent voice recognition errors, which are due in part to the status of the technology and in part to the variability of user speech patterns and the user's ability to remember the specific commands necessary to initiate actions. As a result, most current voice recognition systems provide some form of visual feedback which permits the user to confirm that the computer understands his speech utterances. In word processing, such visual feedback is inherent in the process, since the purpose of the process is to translate from the spoken to the visual. That may be one of the reasons that the word processing applications of speech recognition have progressed at a faster pace.
The above-referenced copending patent applications are directed toward making voice or speech command technology more user friendly and easier to use. Two of the applications are directed toward voice recognition systems and methods which interpret spoken inputs which are not commands, e.g. queries such as help queries, and which, through visual feedback, present or prompt the user with lists of displayed proposed commands from which the user will, hopefully, find the appropriate command and then speak that command to the apparatus to initiate a desired action.
"SPEECH COMMAND INPUT RECOGNITION SYSTEM FOR INTERACTIVE COMPUTER DISPLAY WITH INTERPRETATION OF ANCILLARY RELEVANT SPEECH QUERY TERMS INTO COMMANDS", Scott A. Morgan et al. (Attorney Docket No. AT9-98-343) is directed toward the provision of a relevance table including a basic active vocabulary. This vocabulary is provided by collecting from a computer operation (including the operating system and all significant application programs) all words and terms from menus, buttons and other user interface controls, including the invisible but active words from currently active application windows; all names of macros supplied by the speech system, the application and the user; names of other applications that the user may switch to; generic commands that apply to any application; and any other words and terms which may be currently active. This basic active vocabulary is constructed into a relevance table, wherein each word or term will be related to one or more of the actual commands and, conversely, each of the actual commands will have associated with it a set of words and terms which are relevant to the command.
"SPEECH COMMAND INPUT RECOGNITION SYSTEM FOR INTERACTIVE COMPUTER DISPLAY WITH MEANS FOR CONCURRENT AND MODELESS DISTINGUISHING BETWEEN SPEECH COMMANDS AND SPEECH QUERIES FOR LOCATING COMMANDS", Scott A. Morgan et al. (Attorney Docket No. AT9-98-344) is directed to the modeless or transparent transitions between the spoken query state, when the user presents queries to search for desired commands, and the command mode, when the user controls the system by speaking the located actual commands. In the dynamic operation of the systems and methods covered by these copending applications, the user may still be barraged by a substantial number of displayed proposed commands.
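
The two-way relationship of the relevance table described above (each word or term related to one or more commands, and each command associated with a set of relevant terms) may be illustrated by the following minimal sketch. It is not taken from the copending applications: the class name `RelevanceTable`, its methods, the sample commands and terms, and in particular the ranking of proposed commands by the number of matching query words are all assumptions made only for illustration.

```python
from collections import defaultdict


class RelevanceTable:
    """Hypothetical two-way mapping between vocabulary terms and commands."""

    def __init__(self):
        self._commands_by_term = defaultdict(set)
        self._terms_by_command = defaultdict(set)

    def relate(self, command, terms):
        """Associate a command with each of the given relevant terms."""
        for term in terms:
            key = term.lower()
            self._commands_by_term[key].add(command)
            self._terms_by_command[command].add(key)

    def commands_for(self, term):
        """Return the commands to which a word or term is related."""
        return sorted(self._commands_by_term.get(term.lower(), set()))

    def terms_for(self, command):
        """Return the words and terms associated with a command."""
        return sorted(self._terms_by_command.get(command, set()))

    def propose(self, query):
        """List candidate commands for a spoken query.

        Assumption: commands are ordered by how many distinct query words
        are relevant to them, with ties broken alphabetically.
        """
        counts = defaultdict(int)
        for word in set(query.lower().split()):
            for command in self._commands_by_term.get(word, ()):
                counts[command] += 1
        return sorted(counts, key=lambda c: (-counts[c], c))


# Illustrative vocabulary only; a real table would be collected from menus,
# buttons, macros and other active user interface controls.
table = RelevanceTable()
table.relate("Print", ["print", "paper", "document"])
table.relate("Save", ["save", "store", "document"])
table.relate("Open", ["open", "file", "document"])

proposals = table.propose("how do I put this document on paper")
print(proposals)  # ['Print', 'Open', 'Save']
```

Even in this small example a single common word such as "document" brings every command into the proposed list, which suggests how a user may end up facing a substantial number of displayed proposed commands.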