The present invention relates to an apparatus for introducing control words by speech and concerns an apparatus in which the training operation is performed by the user himself as he uses the piece of equipment associated with the word introduction apparatus. The expressions "words" and "training" will be explained below.
Bearing in mind the greater ease of use which is achieved as a result, it is increasingly envisaged that some pieces of equipment may be controlled directly by speech. It will be appreciated that such an apparatus must comprise at its input a means for introducing words, which will actually serve to control the piece of equipment. Irrespective of the type of equipment to be controlled, it must have for control purposes a vocabulary which essentially comprises instructions and data. Instructions are most frequently in the form of words or groups of a limited number of words, while the data are in the form of figures or numbers. In this specification, the expression "word" will be used to cover the whole of the instructions and data to be introduced into the machine. A "word" will therefore cover, on the one hand, single words and, on the other hand, multiple instructions and data (including numbers and groups of a few words).
All speech-controlled equipment operates using substantially the same general procedure. The equipment comprises memories for recording information representing in coded form the different words of the vocabulary which are required to control the equipment and which will be referred to hereinafter as references. When the user actually wishes to control the piece of equipment, he pronounces a word from that vocabulary. The input apparatus converts that word into an electrical signal which in turn is coded using the same coding as that which was used to introduce the memorized information forming the vocabulary of the equipment. The word introduction apparatus compares the coded word to the different references contained in its memory and selects the memorized reference which is closest to the word which has been pronounced. It is the word associated with that reference which will be used to control the equipment.
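By way of illustration only, the comparison step of the general procedure described above may be sketched as follows. The sketch assumes that each coded word is a short fixed-length list of numbers and that closeness is measured by Euclidean distance; neither the coding nor the distance measure is specified in this description, and the name `nearest_reference` and the sample values are hypothetical.

```python
import math

def nearest_reference(coded_word, references):
    """Return the vocabulary word whose memorized reference is closest
    to the coded word (Euclidean distance is an assumption)."""
    best_word, best_dist = None, math.inf
    for word, ref in references.items():
        dist = math.dist(coded_word, ref)  # distance between two coded forms
        if dist < best_dist:
            best_word, best_dist = word, dist
    return best_word

# Hypothetical memorized references, one per vocabulary word.
references = {
    "alarm": [0.9, 0.1, 0.3],
    "time": [0.2, 0.8, 0.5],
    "stop": [0.4, 0.4, 0.9],
}

# Hypothetical coded form of the word pronounced by the user.
selected = nearest_reference([0.25, 0.75, 0.45], references)
```

In this sketch the word associated with the closest reference, here "time", is the one that would be used to control the equipment.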
The phase which involves introducing into the memory the references corresponding to the whole of the words forming the vocabulary of the equipment will be referred to as training. It will be clear that the quality of the training is a decisive factor in regard to the quality and reliability of controlling the equipment by means of speech.
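Purely as a sketch, training in the sense defined above amounts to filling a memory with one reference per vocabulary word. The example assumes that a coded word is a list of numbers and that the memory is a simple mapping from word to reference; the names `train` and `untrained_words` and the sample vocabulary are hypothetical and do not appear in this description.

```python
def train(memory, word, coded_utterance):
    """Store the coded utterance as the reference for the given word."""
    memory[word] = list(coded_utterance)

def untrained_words(memory, vocabulary):
    """List the vocabulary words that have no memorized reference yet."""
    return [word for word in vocabulary if word not in memory]

vocabulary = ["alarm", "time", "stop"]  # hypothetical vocabulary
memory = {}                             # empty before training

train(memory, "alarm", [0.9, 0.1, 0.3])
train(memory, "time", [0.2, 0.8, 0.5])
remaining = untrained_words(memory, vocabulary)
```

Training is complete only when no word of the vocabulary remains without a reference; in the sketch, "stop" is still to be introduced.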
Two types of training are generally envisaged. In the first type, which can be referred to as pre-programmed training, the references are initially introduced in the factory by a model or standard speaker. Such references may also comprise model or standard references which are defined by statistical analysis of the different ways of pronouncing a word. The pieces of information corresponding to the different words of the vocabulary are therefore memorized once and for all. This type of training has the advantage that the user of the equipment can immediately make use thereof, without having to carry out the training operation for the equipment. However, the major disadvantage of such a training process is that the memorized vocabulary words have been pronounced by a model or standard speaker, whereas the words introduced into the machine for controlling it are pronounced by the user. It is extremely likely that the same word will be pronounced differently by the standard speaker and by the user. In order for the equipment to operate satisfactorily, that is to say, so that the word pronounced by the user is actually recognized by the equipment, it is necessary to provide fairly complicated word coding and a very highly developed comparison algorithm in order to overcome the problems involved in the differences in pronunciation of the same word. Using a highly elaborate comparison algorithm and a very precise coding process gives rise to substantial complication in the coding and memory circuits, which increases the cost of such circuits and the area of silicon required to form them. Now, in some uses, it is not possible to allow for a high level of cost for production of the word introduction device, or to use a substantial amount of space in the overall apparatus for incorporating the introduction circuit.
The other type of training may be referred to as initial training. On leaving the factory, the word introduction apparatus does not have any stored information regarding the words of the vocabulary of the piece of equipment. The training operation is initially performed by the user of the apparatus himself. This training process permits good quality of identification as between the reference words and the word pronounced, since it is the user himself who introduced the reference words. The major disadvantage of this training process is that the user must himself introduce the reference vocabulary into the equipment and that the quality of this phase governs the subsequent quality of operation of the equipment. In addition, this training phase, which is the first contact the user has with the equipment, has a discouraging and disheartening effect on the user. That factor makes it more difficult to commercialize an apparatus using such a method of training. In addition, this training procedure suffers from two limitations. Firstly, it is truly effective only if the equipment has only one user. Secondly, it is easy to find that the same speaker will pronounce the same word in different ways at different times, for various reasons. This means that there is the danger of losing the advantage of the user himself carrying out the training.
It is possible to envisage controlling a variety of items of equipment directly by speech, by means of a given vocabulary. Control of this type may be used in particular, by way of example, for a multi-function watch. Such a watch may give the local time (hour-minute-second-date-day of the week). It may give the time in different time zones; it may have several alarm times; it may also have a stop-watch function, etc. Usually, the operation of selecting one of these functions is effected in electronic watches by means of stems or push-buttons. Those members serve both to select a function and also to introduce data relating to such functions, for example to correct the time, to select the time zone, to display an alarm time, etc. When the watch has a large number of functions and when in addition the number of control members is to be limited, the method of using the different control members becomes increasingly complicated to remember, particularly for controlling functions which are infrequently used. This complexity may also cause unfortunate errors. It is easy to see that control by means of speech makes it possible to eliminate or very substantially reduce the number of control members and to use a control procedure which is much closer to the natural procedure, which involves saying what is to be done.
As already indicated above, the apparatus which is the subject of the present invention is applied particularly to multi-function electronic watches, preferably of the digital display type. It will be appreciated however that the apparatus could also be used for controlling other equipment.