Much work has been done over the past decades in the fields of speech recognition and natural language understanding (NLU), with one goal being that a computer understand and/or react to the content of a human utterance. The methods used to date are linguistically dependent and do not exploit the real advantages of the computer, which include the ability to store a significant amount of information, the ability to quickly retrieve individual references from a large body of stored information, and the ability to quickly perform calculations.
The application of dictation-mode NLU systems and command-based systems seems to be confined to the arena of automated attendants and the automation of visual applications. For example, the Speech Application Language Tags (SALT) standard assumes that a person is looking at a screen and simply wants a vocal way to select a choice on the screen instead of using a keyboard or mouse. However, it would be desirable and highly advantageous to have truly auditory computer applications that can be used by the average novice, non-technical person without any visual or pointing device present, unless that person specifically chooses to view a picture, technical drawing, or other object which “must be seen to be appreciated”.