A camera with speech recognition capabilities responds to a user's spoken inquiries and operating commands. This capability is known in the prior art, for example in U.S. Pat. No. 4,951,079 by Hoshino et al. It is also known to provide a speech recognition camera with several vocabularies and word sets for the purpose of annotating images, as in U.S. Pat. No. 5,546,145 by Bernardi et al.
Current speech recognition technology does not allow for large vocabularies (65,000 words) or continuous, natural-speech recognition in a consumer product as small and inexpensive as a camera. Therefore, cameras with speech recognition capabilities generally have small word sets (10 to 15 words) and recognize only discrete speech, that is, words or phrases spoken in isolation rather than as a continuous stream. Such a camera may have several small word sets, but it can listen for only one set at a time.
In a camera with several voice command words, the user must remember the words or phrases that can be spoken to the camera. Furthermore, the word sets may be structured into a hierarchy, so that the user must also remember the current position in the hierarchy in order to know which words are valid at that level. Although the presently known and utilized cameras are satisfactory, they are not without drawbacks. The above-described prior art does not address the issue of aiding the user in remembering the available command set.