The present invention relates to a sentence recognition apparatus that uses, for example, speech recognition or text sentence recognition, a sentence recognition method, a program, and a medium.
The prior art will be described by taking a speech recognition means as an example.
In a speech recognition means, if an error occurs due to incomplete recognition, and the result is output without correcting the error, that will present a serious problem in practical implementation.
To solve this problem, the prior art proposes a method in which if the recognition score of the first candidate in the recognition result is not greater by more than a predetermined value than the recognition score of the second or later candidate, it is then determined that the confidence of the recognition result is low. The sentence produced as the recognition result is rejected or a re-entry is requested.
This example will be described in further detail with reference to an example that uses a one-pass, n-best search which is a typical search means employed, for example, in a continuous speech recognition means.
The acoustic features of each phoneme are extracted in advance by using a training speech DB, and the probability of connection between words, each represented by a string of phonemes, is also computed in advance by using a text DB. When performing recognition, the acoustic features of the input speech are analyzed per unit time, and the resulting time series of feature values is compared with the pre-learned acoustic features of each phoneme to compute an acoustic score, which represents the probability that the input voice at each instant in time is a given phoneme.
Acoustic scores are summed in time series in accordance with the string of phonemes in each word carried in a word dictionary, and the sum is the acoustic score at each instant in time. If a search space for all the phoneme strings cannot be secured, the process proceeds while leaving only N best results ranked in order of decreasing score.
If the input voice contains a plurality of words, the words are connected by referring to the pre-learned word connection probability and, when connected, the word connection probability (called the language score) is added to the acoustic score.
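The score accumulation and N-best pruning described above can be sketched as follows. This is a minimal illustration only, simplified to the word level; the function name `extend_hypotheses`, the use of log-domain scores, and the fallback value `unseen_lm_score` are assumptions made for the example and are not taken from the prior art being described.

```python
import heapq

def extend_hypotheses(hyps, acoustic_scores, lm_scores, n_best, unseen_lm_score=-10.0):
    """Extend each partial hypothesis by one word and keep the N best.

    hyps:            list of (tuple_of_words, accumulated_score)
    acoustic_scores: dict mapping a candidate word to its acoustic score
    lm_scores:       dict mapping (previous_word, word) to a language score
    Scores are assumed to be log-domain values, so they are added.
    """
    extended = []
    for words, score in hyps:
        for word, acoustic in acoustic_scores.items():
            # Word connection probability (language score) for this transition.
            language = lm_scores.get((words[-1], word), unseen_lm_score)
            extended.append((words + (word,), score + acoustic + language))
    # Keep only the N best results, ranked in order of decreasing score.
    return heapq.nlargest(n_best, extended, key=lambda h: h[1])
```

With a single start hypothesis `(("<s>",), 0.0)`, each call adds one word to every surviving hypothesis, so the search space stays bounded by `n_best` at every step.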
When the recognition scores of the N best candidates are thus computed, if the difference between the first candidate and the second candidate is not larger than a predetermined value, it is determined that the confidence of the result of the first candidate is low, and the result is rejected (for example, Jitsuhiro et al., “Rejection by Confidence Measure Based on Likelihood Difference Between Competing Phonemes”, Technical Report of IEICE, SP 97-76, pp. 1-7 (1997)).
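The prior-art rejection rule just described can be sketched as below. The function name `reject_by_score_margin` and the parameter `min_margin` (standing in for the predetermined value) are hypothetical, and higher scores are assumed to be better.

```python
def reject_by_score_margin(nbest, min_margin):
    """Return True when the first candidate should be rejected.

    nbest:      list of (sentence, recognition_score), higher score is better
    min_margin: the predetermined value the score difference must exceed
    """
    ranked = sorted(nbest, key=lambda c: c[1], reverse=True)
    if len(ranked) < 2:
        # No competing candidate, so this rule cannot flag low confidence.
        return False
    difference = ranked[0][1] - ranked[1][1]
    # Reject when the difference is not larger than the predetermined value.
    return difference <= min_margin
```

The rule depends entirely on the choice of `min_margin`, which is the quantity the following paragraph identifies as difficult to set.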
However, the above recognition score indicates the similarity between the input voice and the pre-learned acoustic model or language model, and in practice its value varies greatly depending on the speaker or on how the voice is uttered, even when recognition is correct. It is therefore extremely difficult to determine the score difference threshold for rejection, and this has often resulted in the rejection of a correct recognition result or in the output of an incorrect recognition result that was erroneously judged to be correct.
As a result, it has been difficult to perform proper sentence recognition by using speech recognition or text sentence recognition.
In view of the above-described problem of the prior art, it is an object of the present invention to provide a sentence recognition apparatus, a sentence recognition method, a program, and a medium, that can perform proper sentence recognition by using speech recognition or text sentence recognition.
One aspect of the present invention is a sentence recognition apparatus comprising:
a data base for storing a plurality of predetermined standard specific word pairs each formed from a plurality of predetermined specific words;
sentence recognition means of recognizing an input sentence made up of a plurality of words;
specific word selection means of selecting said specific words from among the plurality of words forming said recognized sentence;
judging means of judging whether a specific word pair arbitrarily formed from said selected specific words matches any one of the standard specific word pairs stored in said data base; and
erroneously recognized specific word determining means of determining, based on the result of said judgement, an erroneously recognized specific word for which said recognition failed from among said selected specific words.
Another aspect of the present invention is a sentence recognition apparatus, wherein said erroneously recognized specific word determining means determines a specific word as being said erroneously recognized specific word if said specific word is found in more than a predetermined number of arbitrarily formed specific word pairs that have been judged as not matching any of the standard specific word pairs stored in said data base.
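As an illustration of the judging and determining means described in the two preceding aspects, the following is a minimal sketch. It assumes that word pairs are unordered and that the data base can be represented as a set of sorted tuples; the names `find_misrecognized_words` and `max_mismatches` (the predetermined number) are hypothetical.

```python
from itertools import combinations

def find_misrecognized_words(specific_words, standard_pairs, max_mismatches):
    """Return the specific words judged to be erroneously recognized.

    specific_words: specific words selected from the recognized sentence
    standard_pairs: set of sorted (word, word) tuples, the stored standard pairs
    max_mismatches: a word is flagged when it appears in more than this
                    many non-matching pairs
    """
    words = list(dict.fromkeys(specific_words))  # drop repeats, keep order
    mismatches = {w: 0 for w in words}
    for a, b in combinations(words, 2):
        # Assumption: pairs are treated as unordered.
        if tuple(sorted((a, b))) not in standard_pairs:
            mismatches[a] += 1
            mismatches[b] += 1
    return [w for w in words if mismatches[w] > max_mismatches]
```

For example, with the single standard pair `("hotel", "reserve")`, the selected words `["hotel", "reserve", "banana"]`, and `max_mismatches=1`, only `"banana"` appears in two non-matching pairs and is flagged.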
Still another aspect of the present invention is a sentence recognition apparatus, further comprising re-entry requesting means of requesting, in the event of occurrence of said erroneously recognized specific word, (1) a re-entry of the specific word corresponding to said erroneously recognized specific word or (2) a re-entry of said input sentence.
Yet still another aspect of the present invention is a sentence recognition apparatus, further comprising notifying means of notifying a user of the occurrence of said erroneously recognized specific word when said erroneously recognized specific word does occur.
Still yet another aspect of the present invention is a sentence recognition apparatus comprising:
a data base for storing a plurality of predetermined standard specific word pairs each formed from a plurality of predetermined specific words;
sentence recognition means of recognizing an input sentence made up of a plurality of words;
specific word selection means of selecting said specific words from among the plurality of words forming said recognized sentence;
judging means of judging whether a specific word pair arbitrarily formed from said selected specific words matches any one of the standard specific word pairs stored in said data base; and
sentence erroneous recognition determining means of determining, based on the result of said judgement, whether said input sentence has been erroneously recognized or not.
A further aspect of the present invention is a sentence recognition apparatus, further comprising sentence re-entry requesting means of requesting a re-entry of said input sentence in the event of occurrence of said erroneous recognition.
A still further aspect of the present invention is a sentence recognition apparatus, further comprising notifying means of notifying a user of the occurrence of said erroneous recognition when said erroneous recognition does occur.
A yet further aspect of the present invention is a sentence recognition apparatus comprising:
a first data base for storing correspondences between a plurality of predetermined specific words and a plurality of specific word classes to which said specific words belong;
a second data base for storing a plurality of predetermined standard specific word class pairs each formed from two of said predetermined specific word classes;
sentence recognition means of recognizing an input sentence made up of a plurality of words;
specific word selection means of selecting said specific words from among the plurality of words forming said recognized sentence;
specific word class determining means of determining, by utilizing the correspondences stored in said first data base, the specific word classes to which said selected specific words respectively belong;
judging means of judging whether a specific word class pair arbitrarily formed from said determined specific word classes matches any one of the standard specific word class pairs stored in said second data base; and
erroneously recognized specific word determining means of determining, based on the result of said judgement, an erroneously recognized specific word for which said recognition failed from among said selected specific words.
A still yet further aspect of the present invention is a sentence recognition apparatus, wherein said erroneously recognized specific word determining means determines a specific word as being said erroneously recognized specific word if the specific word class to which said specific word belongs is found in more than a predetermined number of arbitrarily formed specific word class pairs that have been judged as not matching any of the standard specific word class pairs stored in said second data base.
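The class-based variant can be sketched in the same way. Here the first data base is assumed to be a dictionary from word to class and the second data base a set of sorted class tuples; the class labels and all function and parameter names are hypothetical, and class pairs are assumed to be unordered.

```python
from itertools import combinations

def find_misrecognized_by_class(specific_words, word_to_class,
                                standard_class_pairs, max_mismatches):
    """Flag words whose class appears in too many non-matching class pairs.

    word_to_class:        first data base, mapping specific word to word class
    standard_class_pairs: second data base, set of sorted (class, class) tuples
    """
    words = list(dict.fromkeys(specific_words))  # drop repeats, keep order
    mismatches = {w: 0 for w in words}
    for a, b in combinations(words, 2):
        class_pair = tuple(sorted((word_to_class[a], word_to_class[b])))
        if class_pair not in standard_class_pairs:
            mismatches[a] += 1
            mismatches[b] += 1
    return [w for w in words if mismatches[w] > max_mismatches]
```

Because the comparison is made on classes rather than on the words themselves, a word such as `"inn"` can be accepted alongside `"reserve"` as long as its class forms a stored pair, even if that exact word pair was never stored.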
An additional aspect of the present invention is a sentence recognition apparatus, further comprising re-entry requesting means of requesting, in the event of occurrence of said erroneously recognized specific word, (1) a re-entry of the specific word corresponding to said erroneously recognized specific word or (2) a re-entry of said input sentence.
A still additional aspect of the present invention is a sentence recognition apparatus, further comprising notifying means of notifying a user of the occurrence of said erroneously recognized specific word when said erroneously recognized specific word does occur.
A yet additional aspect of the present invention is a sentence recognition apparatus comprising:
a first data base for storing correspondences between a plurality of predetermined specific words and a plurality of specific word classes to which said specific words belong;
a second data base for storing a plurality of predetermined standard specific word class pairs each formed from two of said predetermined specific word classes;
sentence recognition means of recognizing an input sentence made up of a plurality of words;
specific word selection means of selecting said specific words from among the plurality of words forming said recognized sentence;
specific word class determining means of determining, by utilizing the correspondences stored in said first data base, the specific word classes to which said selected specific words respectively belong;
judging means of judging whether a specific word class pair arbitrarily formed from said determined specific word classes matches any one of the standard specific word class pairs stored in said second data base; and
sentence erroneous recognition determining means of determining, based on the result of said judgement, whether said input sentence has been erroneously recognized or not.
A still yet additional aspect of the present invention is a sentence recognition apparatus, further comprising sentence re-entry requesting means of requesting a re-entry of said input sentence in the event of occurrence of said erroneous recognition.
A supplementary aspect of the present invention is a sentence recognition apparatus, further comprising notifying means of notifying a user of the occurrence of said erroneous recognition when said erroneous recognition does occur.
A still supplementary aspect of the present invention is a sentence recognition method comprising:
a storing step of storing in a data base a plurality of predetermined standard specific word pairs each formed from a plurality of predetermined specific words;
a sentence recognition step of recognizing an input sentence made up of a plurality of words;
a specific word selection step of selecting said specific words from among the plurality of words forming said recognized sentence;
a judging step of judging whether a specific word pair arbitrarily formed from said selected specific words matches any one of the standard specific word pairs stored in said data base; and
an erroneously recognized specific word determining step of determining, based on the result of said judgement, an erroneously recognized specific word for which said recognition failed from among said selected specific words.
A yet supplementary aspect of the present invention is a sentence recognition method comprising:
a storing step of storing in a data base a plurality of predetermined standard specific word pairs each formed from a plurality of predetermined specific words;
a sentence recognition step of recognizing an input sentence made up of a plurality of words;
a specific word selection step of selecting said specific words from among the plurality of words forming said recognized sentence;
a judging step of judging whether a specific word pair arbitrarily formed from said selected specific words matches any one of the standard specific word pairs stored in said data base; and
a sentence erroneous recognition determining step of determining, based on the result of said judgement, whether said input sentence has been erroneously recognized or not.
A still yet supplementary aspect of the present invention is a sentence recognition method comprising:
a first storing step of storing, in a first data base, correspondences between a plurality of predetermined specific words and a plurality of specific word classes to which said specific words belong;
a second storing step of storing in a second data base a plurality of predetermined standard specific word class pairs each formed from two of said predetermined specific word classes;
a sentence recognition step of recognizing an input sentence made up of a plurality of words;
a specific word selection step of selecting said specific words from among the plurality of words forming said recognized sentence;
a specific word class determining step of determining, by utilizing the correspondences stored in said first data base, the specific word classes to which said selected specific words respectively belong;
a judging step of judging whether a specific word class pair arbitrarily formed from said determined specific word classes matches any one of the standard specific word class pairs stored in said second data base; and
an erroneously recognized specific word determining step of determining, based on the result of said judgement, an erroneously recognized specific word for which said recognition failed from among said selected specific words.
Another aspect of the present invention is a sentence recognition method comprising:
a first storing step of storing, in a first data base, correspondences between a plurality of predetermined specific words and a plurality of specific word classes to which said specific words belong;
a second storing step of storing in a second data base a plurality of predetermined standard specific word class pairs each formed from two of said predetermined specific word classes;
a sentence recognition step of recognizing an input sentence made up of a plurality of words;
a specific word selection step of selecting said specific words from among the plurality of words forming said recognized sentence;
a specific word class determining step of determining, by utilizing the correspondences stored in said first data base, the specific word classes to which said selected specific words respectively belong;
a judging step of judging whether a specific word class pair arbitrarily formed from said determined specific word classes matches any one of the standard specific word class pairs stored in said second data base; and
a sentence erroneous recognition determining step of determining, based on the result of said judgement, whether said input sentence has been erroneously recognized or not.
Still another aspect of the present invention is a program for causing a computer to carry out all or part of the steps in the sentence recognition method, said steps comprising: the storing step of storing in a data base a plurality of predetermined standard specific word pairs each formed from a plurality of predetermined specific words; the sentence recognition step of recognizing an input sentence made up of a plurality of words; the specific word selection step of selecting said specific words from among the plurality of words forming said recognized sentence; the judging step of judging whether a specific word pair arbitrarily formed from said selected specific words matches any one of the standard specific word pairs stored in said data base; and the erroneously recognized specific word determining step of determining, based on the result of said judgement, an erroneously recognized specific word for which said recognition failed from among said selected specific words.
Yet still another aspect of the present invention is a program for causing a computer to carry out all or part of the steps in the sentence recognition method, said steps comprising: the storing step of storing in a data base a plurality of predetermined standard specific word pairs each formed from a plurality of predetermined specific words; the sentence recognition step of recognizing an input sentence made up of a plurality of words; the specific word selection step of selecting said specific words from among the plurality of words forming said recognized sentence; the judging step of judging whether a specific word pair arbitrarily formed from said selected specific words matches any one of the standard specific word pairs stored in said data base; and the sentence erroneous recognition determining step of determining, based on the result of said judgement, whether said input sentence has been erroneously recognized or not.
Still yet another aspect of the present invention is a program for causing a computer to carry out all or part of the steps in the sentence recognition method, said steps comprising: the first storing step of storing, in a first data base, correspondences between a plurality of predetermined specific words and a plurality of specific word classes to which said specific words belong; the second storing step of storing in a second data base a plurality of predetermined standard specific word class pairs each formed from two of said predetermined specific word classes; the sentence recognition step of recognizing an input sentence made up of a plurality of words; the specific word selection step of selecting said specific words from among the plurality of words forming said recognized sentence; the specific word class determining step of determining, by utilizing the correspondences stored in said first data base, the specific word classes to which said selected specific words respectively belong; the judging step of judging whether a specific word class pair arbitrarily formed from said determined specific word classes matches any one of the standard specific word class pairs stored in said second data base; and the erroneously recognized specific word determining step of determining, based on the result of said judgement, an erroneously recognized specific word for which said recognition failed from among said selected specific words.
A further aspect of the present invention is a program for causing a computer to carry out all or part of the steps in the sentence recognition method, said steps comprising: the first storing step of storing, in a first data base, correspondences between a plurality of predetermined specific words and a plurality of specific word classes to which said specific words belong; the second storing step of storing in a second data base a plurality of predetermined standard specific word class pairs each formed from two of said predetermined specific word classes; the sentence recognition step of recognizing an input sentence made up of a plurality of words; the specific word selection step of selecting said specific words from among the plurality of words forming said recognized sentence; the specific word class determining step of determining, by utilizing the correspondences stored in said first data base, the specific word classes to which said selected specific words respectively belong; the judging step of judging whether a specific word class pair arbitrarily formed from said determined specific word classes matches any one of the standard specific word class pairs stored in said second data base; and the sentence erroneous recognition determining step of determining, based on the result of said judgement, whether said input sentence has been erroneously recognized or not.
A still further aspect of the present invention is a medium holding thereon the program, wherein said medium is computer processable.
A yet further aspect of the present invention is a medium holding thereon the program, wherein said medium is computer processable.
A still yet further aspect of the present invention is a medium holding thereon the program, wherein said medium is computer processable.
An additional aspect of the present invention is a medium holding thereon the program, wherein said medium is computer processable.
It will be noted that (1) in a speech recognition means that deduces an erroneously recognized word from the relations between the specific words contained in the recognized sentence and produces an output by reflecting the result of the deduction in the recognized sentence, a result rejecting means or a re-entry requesting means that requests the user for a re-entry when all or many of the words used for the deduction of erroneously recognized words are deduced as being erroneously recognized words, and (2) a result rejecting means or a re-entry requesting means that requests the user for a re-entry when none or few of the words contained in the recognized sentence match pre-learned specific word or word class pairs having dependency or co-occurrence relations between them, are also included in the present invention.
Such a rejecting means comprises, for example, a continuous speech recognition means of recognizing speech comprising a plurality of words, an important word extracting means of extracting specific words from the result of the recognition, a confidence computing means of assessing the confidence of the recognition result by examining the dependency or co-occurrence relations between the extracted words, a rejection determining means of rejecting the result when the result lacks confidence, and an output sentence generating means of generating a re-entry requesting sentence when the result is rejected.
In this rejecting means, specific words are extracted from the recognized sentence, the extracted words are searched through for word pairs having dependency or co-occurrence relations between them, and when none or few of such word pairs are found, the recognition result is rejected, thereby enabling an erroneous result to be rejected consistently even if the speaker or the way the voice is uttered changes.
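A minimal sketch of this rejection decision follows. The threshold `min_related_pairs`, which stands in for "none or few", the wording of the re-entry request, and the function names are all assumptions made for illustration; pairs are again assumed to be unordered.

```python
from itertools import combinations

def count_related_pairs(specific_words, related_pairs):
    """Count extracted word pairs that have a stored dependency or
    co-occurrence relation (related_pairs holds sorted tuples)."""
    words = list(dict.fromkeys(specific_words))
    return sum(1 for a, b in combinations(words, 2)
               if tuple(sorted((a, b))) in related_pairs)

def respond(recognized_sentence, specific_words, related_pairs, min_related_pairs=1):
    """Return the recognized sentence, or a re-entry request when rejected."""
    if count_related_pairs(specific_words, related_pairs) < min_related_pairs:
        # Hypothetical re-entry requesting sentence.
        return "Could you please say that again?"
    return recognized_sentence
```

The decision uses only relations between the recognized words, not the recognition scores, which is why it does not need to be retuned for each speaker or speaking style.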
The result rejecting means or re-entry requesting means that uses word classes determined by using relations between words contained in a commonly used thesaurus dictionary and a training sentence set is also included in the present invention.
Such a rejecting means comprises, for example, a word class determining means of classifying important words, a word class relationship table in which the relationships between word classes are defined, a continuous speech recognition means of recognizing speech comprising a plurality of words, an important word extracting means of extracting specific words from the result of the recognition, a confidence computing means of assessing the confidence of the recognition result by examining the dependency or co-occurrence relations between the extracted words, a rejection determining means of rejecting the result when the result lacks confidence, and an output sentence generating means of generating a re-entry requesting sentence when the result is rejected.
In this rejecting means, words are optimally classified in advance, and the dependency or co-occurrence relations between the word classes are examined and stored in a table. When performing recognition, specific words are extracted from the recognized sentence, the extracted words are searched through for word pairs having dependency or co-occurrence relations between them by using the relationship table in which the dependency or co-occurrence relations are defined, and when none or few of such word pairs are found, the recognition result is rejected, thereby enabling an erroneous result to be rejected consistently even if the speaker or the way the voice is uttered changes. Furthermore, the rejection or the re-entry requesting operation can be performed even when a word not contained in a sentence set used to learn the relationships between words is entered for recognition.
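The construction of the relationship table and its use for words absent from the training sentences might be sketched as follows. This sketch assumes, purely for illustration, that two classes are related whenever their words co-occur in one training sentence, and that the thesaurus is available as a word-to-class dictionary; all names and class labels are hypothetical.

```python
from itertools import combinations

def build_class_pair_table(training_sentences, word_to_class):
    """Build the class relationship table from a training sentence set.

    Assumption: co-occurrence within a sentence defines a relation.
    """
    table = set()
    for sentence in training_sentences:
        classes = [word_to_class[w] for w in sentence if w in word_to_class]
        for a, b in combinations(classes, 2):
            table.add(tuple(sorted((a, b))))
    return table

def count_related_class_pairs(words, word_to_class, table):
    """Count pairs of extracted words whose classes are related in the table."""
    classes = [word_to_class[w] for w in words if w in word_to_class]
    return sum(1 for a, b in combinations(classes, 2)
               if tuple(sorted((a, b))) in table)
```

If the training set contains only "reserve" with "hotel", a recognized sentence containing "reserve" and "inn" still yields one related class pair, provided the thesaurus places "inn" in the same class as "hotel".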