The present invention relates to interactive computer controlled display systems with speech word recognition and, more particularly, to such systems which receive audible input via non-verbal sound recognition to provide system commands.
The 1990's decade has been marked by a technological revolution driven by the convergence of the data processing industry with the consumer electronics industry. This advance has been even further accelerated by the extensive consumer and business involvement in the Internet over the past few years. As a result of these changes, it seems as if virtually all aspects of human endeavor in the industrialized world require human/computer interfaces. There is a need to make computer directed activities accessible to people who, up to a few years ago, were computer illiterate or, at best, computer indifferent.
Thus, there is continuing demand for interfaces to computers and networks which improve the ease of use for the interactive user to access functions and data from the computer. With desktop-like interfaces including windows and icons, as well as three-dimensional virtual reality simulating interfaces, the computer industry has been working hard to meet this demand by making human/computer interfaces more user friendly and bringing them closer and closer to real world interfaces, e.g. human/human interfaces. In such an environment, it would be expected that speaking to the computer in natural language would be a very natural way of interfacing with the computer, even for novice users. Despite the potential advantages of speech recognition computer interfaces, this technology has been relatively slow in gaining extensive user acceptance.
Speech recognition technology has been available for over twenty years, but it has only recently begun to find commercial acceptance, particularly with speech dictation or "speech to text" systems, such as those marketed by International Business Machines Corporation (IBM) and Dragon Systems. That aspect of the technology is now expected to have accelerated development until it will have a substantial niche in the word processing market. On the other hand, a more universal application of speech recognition input to computers, which is still behind expectations in user acceptance, is in command and control technology; wherein, for example, a user may navigate through a computer system's Graphical User Interface (GUI) by the user speaking the commands which are customarily found in the system's menu text, icons, labels, buttons, etc.
Many of the deficiencies in speech recognition, both in word processing and in command technologies, are due to inherent speech recognition errors, which arise in part from the recognition system having to distinguish between speech words which are to be converted into strings of displayed text and the above-described verbal commands. The above-mentioned copending patent applications are all directed to implementations for distinguishing speech words from verbal commands. Since the commands are verbal, the processes for distinguishing the commands from verbal speech words are complex.
The present invention is directed towards simplifying the distinguishing of commands from speech terms in speech recognition technology. The invention provides for a system for recognizing non-verbal sound commands within an interactive computer controlled display system with speech word recognition, which comprises standard means for recognizing speech words in combination with means for storing a plurality of non-verbal sounds, each sound representative of a command. There are display means responsive to said means recognizing speech words for displaying said recognized words. In response to the input of non-verbal sounds, there are means for comparing the input non-verbal sounds to said stored sounds together with means responsive to said comparing means for carrying out the command represented by a stored sound which matches an input non-verbal sound. The non-verbal sounds may be voice generated or they may be otherwise physically generated. The commands may direct movement of data, e.g. cursors displayed on said display system. In such a case, means are provided for inputting a sequential list of the sounds representative of said command directing movement to thereby produce a sequential movement of said displayed data, e.g. cursor movement.
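By way of illustration only, the comparing means and the sequential cursor movement described above might be sketched as follows in Python. The sketch is not the disclosed implementation. It assumes that each non-verbal sound has already been reduced to a short numeric feature vector, and that an input sound is compared to the stored sounds by nearest Euclidean distance under a fixed threshold. The sound names, feature values, threshold and function names (`match_command`, `move_cursor`) are all hypothetical.

```python
import math

# Hypothetical stored non-verbal sounds: name -> (feature vector, command).
# The feature values are invented placeholders, not real acoustic data.
STORED_SOUNDS = {
    "click": ((1.0, 0.0, 0.0), "left"),
    "whistle": ((0.0, 1.0, 0.0), "right"),
    "clap": ((0.0, 0.0, 1.0), "up"),
    "hum": ((0.5, 0.5, 0.0), "down"),
}

# Cursor displacement (dx, dy) for each movement command.
MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}

# Assumed threshold: an input farther than this from every stored
# sound is treated as unrecognized.
MAX_DISTANCE = 0.3


def match_command(features):
    """Return the command of the closest stored sound, or None if no
    stored sound lies within MAX_DISTANCE of the input features."""
    best_command, best_distance = None, float("inf")
    for template, command in STORED_SOUNDS.values():
        distance = math.dist(features, template)
        if distance < best_distance:
            best_command, best_distance = command, distance
    return best_command if best_distance <= MAX_DISTANCE else None


def move_cursor(start, sound_sequence):
    """Apply a sequential list of input sounds to a cursor position,
    skipping any sound that matches no stored sound."""
    x, y = start
    for features in sound_sequence:
        command = match_command(features)
        if command is None:
            continue
        dx, dy = MOVES[command]
        x, y = x + dx, y + dy
    return (x, y)
```

In this sketch, a sequence of two "click" sounds followed by one "clap" would move a cursor two positions to the left and one position up. An actual system would derive the feature vectors from audio input and could use any suitable comparison technique.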