The invention relates to a method of executing a data base query by means of a data processing arrangement, the query being input by a user in the form of a plurality of speech utterances in natural speech and the data processing arrangement producing a speech output in response to each speech utterance and a recognition arrangement converting each speech utterance into at least one set of statements of the highest acoustic probability using a language model, which statements are tested for consistency, consistent statements being stored, and to an arrangement suitable for this purpose.
Such a method and a corresponding arrangement are known from DE 196 39 843.6 A1 (PHD 96.167). The statements of the highest probability, which have been derived from each speech utterance and which have been tested successfully for consistency, are stored therein. These statements are used for testing the statements derived from the next speech utterance for consistency and may eventually be used for the data base query. By means of speech outputs issued by the system, the user is prompted to give a speech response as many times as necessary until all the statements necessary for a data base query have been acquired. Thus, each speech output issued by the system depends to a limited extent on the preceding speech utterances and the statements derived therefrom.
However, with this method it is possible that the correct statement intended by the user through the speech utterance is not recognized with the highest probability but with a lower probability, for example due to an unsatisfactory pronunciation by the user. Since these statements of lower probability are not pursued any further, the dialogue with the user may be continued with an incorrect statement that was recognized with the highest probability. If this statement is not corrected, the final data base query is eventually derived from incorrect statements.
From WO 96/13030 a method of and an arrangement for a telephone inquiry service is known in which a plurality of statements are derived from each speech utterance of the user and are stored. However, the speech outputs presented to the user by the system proceed in accordance with a fixed scheme and the statements derived hitherto are used in order to reduce the amount of data from the data base with which the statements derived from the next speech utterance are compared.
It is an object of the invention to provide a method of the type defined in the opening paragraph, by means of which it is possible, in a wide variety of applications, to derive all the statements necessary for a data base query in a manner which is as reliable as possible and as convenient as possible for the user.
According to the invention this object is achieved in that after each speech utterance all the sets of statements derived therefrom are tested for consistency with all the stored sets of statements and the derived statements which have been tested successfully for consistency are stored, and at least one speech output is derived from stored statements.
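The consistency test described above can be illustrated by a minimal sketch. It assumes, purely for illustration, that a set of statements is a mapping from a category to a value and that two sets are consistent when they do not assign different values to the same category. The names `is_consistent` and `update_store`, and the merging of consistent sets into one stored set, are hypothetical and are not taken from the description.

```python
from typing import Dict, List

# Assumed representation: category -> value, e.g. {"destination": "Hamburg"}
Statements = Dict[str, str]


def is_consistent(a: Statements, b: Statements) -> bool:
    """Treat two sets as consistent if no shared category has conflicting values."""
    return all(b[key] == value for key, value in a.items() if key in b)


def update_store(stored: List[Statements], derived: List[Statements]) -> List[Statements]:
    """Test every derived set against every stored set and keep consistent combinations."""
    if not stored:
        # First utterance: nothing to test against, so keep all derived sets.
        return [dict(d) for d in derived]
    merged: List[Statements] = []
    for s in stored:
        for d in derived:
            if is_consistent(s, d):
                combo = {**s, **d}
                if combo not in merged:  # avoid storing duplicates
                    merged.append(combo)
    return merged


stored = [{"destination": "Hamburg"}, {"destination": "Homburg"}]
derived = [{"destination": "Hamburg", "time": "15.00"}, {"time": "13.00"}]
result = update_store(stored, derived)
```

In this sketch the stored alternative "Homburg" survives only in combination with the derived set that does not mention a destination, so alternatives of lower probability remain available for later dialogue steps.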
Thus, not only are all the statements which are consistent and, consequently, useful stored, but these statements as well as previously determined statements are preferably used in each dialogue step in order to derive the next speech output to be issued by the system. As a result, it is possible not only to generate general speech outputs, for example relating to the city and street of the desired subscriber in the case of a telephone inquiry service, or to a station, time of departure or destination in the case of a train schedule inquiry service, but also to ask the user specific questions, for example in order to verify given statements, i.e. to prompt the user to repeat such statements, if desired in an alternative form.
The individual statements can be derived from a speech utterance by determining all the words of adequate individual probability in the speech signal or also in a manner as described in EP 702 353 A2 (PHD 94.120). In said method a word graph is derived from the speech utterance, from whose edges only those statements or that information is extracted which is relevant to the data base query. Moreover, general language models and dedicated rules may be adopted. For example, the statements "p.m." and "3 o'clock" are equivalent to the statement "15.00 hours".
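A dedicated rule of the kind mentioned in the time example can be sketched as follows. The function name `to_24h` and the output format are assumptions made for illustration only; the description does not specify how such rules are implemented.

```python
def to_24h(hour: int, meridiem: str) -> str:
    """Map a 12-hour clock statement plus "a.m."/"p.m." to a 24-hour statement."""
    marker = meridiem.lower().replace(".", "").strip()
    if marker not in ("am", "pm"):
        raise ValueError(f"unknown meridiem: {meridiem!r}")
    if not 1 <= hour <= 12:
        raise ValueError(f"hour out of range: {hour}")
    # 12 a.m. is hour 0 and 12 p.m. is hour 12, hence the modulo.
    hour_24 = hour % 12 + (12 if marker == "pm" else 0)
    return f"{hour_24}.00 hours"


print(to_24h(3, "p.m."))  # 15.00 hours
```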
Particularly with this known method of deriving statements from a speech utterance different statements are obtained for the same category of statements such as for example names, time indications etc., but these have different probabilities as a result of different similarity to the speech utterance and by virtue of further rules such as language models. Thus, an embodiment of the invention is characterized in that each statement is stored with a probability derived from the probability assigned to this statement and the highest probability of the stored statement which has been tested successfully for consistency. When during recognition, for example, several names are derived with different probabilities and at least some names have already been stored, those names are stored with a combined probability determined by the probability of the statement derived from the last speech utterance and of the previously stored statement.
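The combination of probabilities can be sketched as follows. The description does not state the combination rule, so this sketch assumes a simple product of the probability of the new statement and the highest probability among the stored statements consistent with it. The function name `combined_probability` is hypothetical.

```python
from typing import List


def combined_probability(p_new: float, p_consistent_stored: List[float]) -> float:
    """Combine a new statement's probability with the best consistent stored one.

    Assumption: the combination is a product. If nothing has been stored yet,
    the new probability is kept unchanged.
    """
    best_stored = max(p_consistent_stored, default=1.0)
    return p_new * best_stored


# A name recognized with 0.6, consistent with stored names of probability 0.5 and 0.8
p = combined_probability(0.6, [0.5, 0.8])
```

With these example values the stored alternative of probability 0.8 is the one that determines the combined value.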
When statements are derived from a speech signal, the recognition arrangement supplies a limited number of statements, for example a given number of statements or statements having probabilities above a given threshold. In general, this results in the total number of all statements being increased upon each dialogue step, i.e. upon each new speech utterance. In order to limit this effect, in accordance with a further embodiment of the invention, it is effective that only those statements are stored whose probabilities exceed a threshold value. This threshold applies to the combined probability, which results from the probability of the statement itself and that of the most probable consistent statement stored.
When several sequences of statements are derived from a speech utterance it is also possible to form reliability values for these statements from the individual probabilities of the sets of statements including this statement. In this case, it is effective in a further embodiment of the invention that only those statements are stored whose reliability values exceed a threshold value. As a result of this, the number of statements to be stored and processed until the final generation of the data base query can be limited.
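The reliability values and the subsequent thresholding can be sketched as follows. The description does not give a formula, so this sketch assumes that the reliability of a statement is the summed probability of all sets containing it, divided by the summed probability of all sets. The names `reliability_values` and `prune` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

Statement = Tuple[str, str]                      # (category, value)
Hypothesis = Tuple[FrozenSet[Statement], float]  # (set of statements, probability)


def reliability_values(hypotheses: List[Hypothesis]) -> Dict[Statement, float]:
    """Assumed rule: share of total probability carried by sets containing the statement."""
    total = sum(p for _, p in hypotheses)
    if total <= 0:
        return {}
    mass: Dict[Statement, float] = defaultdict(float)
    for statements, p in hypotheses:
        for statement in statements:
            mass[statement] += p
    return {statement: m / total for statement, m in mass.items()}


def prune(reliabilities: Dict[Statement, float], threshold: float) -> Dict[Statement, float]:
    """Keep only statements whose reliability value exceeds the threshold."""
    return {s: r for s, r in reliabilities.items() if r > threshold}


hypotheses = [
    (frozenset({("destination", "Hamburg"), ("time", "15.00")}), 0.5),
    (frozenset({("destination", "Hamburg"), ("time", "13.00")}), 0.3),
    (frozenset({("destination", "Homburg"), ("time", "15.00")}), 0.2),
]
kept = prune(reliability_values(hypotheses), threshold=0.25)
```

Under these assumptions, a statement that occurs in several sets of statements accumulates a higher reliability value than one that occurs in only a single set of low probability, and only the latter is discarded by the threshold.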
It is another object of the invention to provide an arrangement which enables the statements for a data base query to be determined in a most reliable manner which is convenient for the user. This object is achieved by means of the characteristic features defined in the further independent Claim.