1. Field of the Invention
This invention relates to continuous speech recognition technology, and more particularly to an apparatus and method for automatically generating punctuation marks in continuous speech recognition.
2. Related Art
A general speech recognition system is shown in FIG. 1. The system generally contains an acoustic model 7 and a language model 8. The acoustic model 7 includes the pronunciations of commonly used words in the recognized language. Each word pronunciation is derived by a statistical method from the way most people pronounce the word, and represents the general pronunciation characteristics of the word. The language model 8 describes how the commonly used words in the recognized language are used.
The operation procedure of the continuous speech recognition system shown in FIG. 1 is as follows. Voice detection means 1 collects the user's speech, for example by representing it as speech samples, and sends the speech samples to the pronunciation probability calculation means 2. For every pronunciation in the acoustic model 7, the pronunciation probability calculation means 2 gives an estimated probability that the pronunciation matches the speech sample. The word probability calculation means 5, according to language rules summarized from a large amount of language material, gives for each word in the language model 8 an estimated probability that it is the word that should occur in the current context. The word matching means 3 calculates a joint probability (representing the likelihood that the speech sample is recognized as this word) by combining the probability value calculated by the pronunciation probability calculation means 2 with the probability value calculated by the word probability calculation means 5, and takes the word with the greatest joint probability value as the result of the speech recognition. The context generating means 4 updates the current context with this recognition result, for use in recognizing the next speech sample. The word output means 6 outputs the recognized word.
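The word matching step described above can be illustrated with a minimal Python sketch. It assumes that the two probability values are combined by multiplication (summed here in the log domain), which the description does not specify; the function name `match_word` and all probability values are hypothetical and chosen only for illustration.

```python
import math

def match_word(acoustic_probs, language_probs):
    """Return the candidate word with the greatest joint probability.

    acoustic_probs: word -> estimated probability that its pronunciation
                    matches the speech sample (role of means 2).
    language_probs: word -> estimated probability that the word occurs
                    in the current context (role of means 5).
    Assumption: the joint probability is the product of the two values.
    """
    best_word, best_score = None, -math.inf
    for word, p_acoustic in acoustic_probs.items():
        p_language = language_probs.get(word, 0.0)
        if p_acoustic <= 0.0 or p_language <= 0.0:
            continue  # a zero probability rules the candidate out
        score = math.log(p_acoustic) + math.log(p_language)
        if score > best_score:
            best_word, best_score = word, score
    return best_word

# Hypothetical values for one speech sample following the context "hello".
acoustic = {"world": 0.40, "whirled": 0.45, "word": 0.15}
language = {"world": 0.30, "whirled": 0.01, "word": 0.20}

context = ["hello"]
word = match_word(acoustic, language)  # "world": 0.40 * 0.30 is the largest product
context.append(word)                   # context update, as done by means 4
```

In this example "whirled" has the best acoustic score, but the language model makes it unlikely in context, so the joint probability selects "world".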
The above continuous recognition procedure can be performed in units of a character, a word, or a phrase. Therefore, hereinafter the term "word" refers to a character, a word, or a phrase.
To mark the recognized result with punctuation, current continuous speech recognition systems require punctuation marks to be spoken during dictation, and then recognize them. For example, to recognize "Hello! World." completely, the speaker must say "Hello exclamation point world period". That is, in current speech recognition systems the speaker must convert punctuation marks into speech (i.e. speak them aloud), and the system then recognizes them as the corresponding punctuation marks. The language model must therefore include punctuation marks, i.e. language model 8 must be able to give, for every punctuation mark, an estimated probability that it is the punctuation mark that should occur in the current context.
However, people cannot be expected to say punctuation marks when a natural speech activity (e.g. a conference, radio broadcast, or TV program) is transcribed by the above-mentioned speech recognition system. Furthermore, it is highly unnatural to speak punctuation marks aloud during dictation. Even when asked to do so, people often forget to speak punctuation marks while speaking or reading articles. Moreover, in spontaneous dictation, where every sentence comes directly from the mind, it is very difficult for most people to decide correctly which punctuation marks should be used and to speak every punctuation mark correctly without loss of fluency. This may be because punctuation marks are seldom, if ever, used in daily spoken language.
Therefore, in continuous speech recognition there is an urgent need for an apparatus and method for automatically generating punctuation marks, which is easy to use, does not require punctuation marks to be spoken, and hence does not affect the user's normal speech.
The first object of this invention is to provide an apparatus for automatically generating punctuation marks in continuous speech recognition.
The second object of this invention is to provide a method for automatically generating punctuation marks in continuous speech recognition.
To achieve the first object, the invention provides an apparatus for automatically generating punctuation marks in continuous speech recognition, comprising a speech recognition means for recognizing the user's speech as words, said speech recognition means also recognizing pseudo noises in the user's speech; and further comprising: pseudo noise marking means for marking pseudo noises in the output results of the speech recognition means; and punctuation mark generating means for finding, based on a language model containing pseudo punctuation marks, the most likely pseudo punctuation mark at every location of pseudo noise marked by the pseudo noise marking means, and generating the punctuation mark corresponding to that most likely pseudo punctuation mark.
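The punctuation mark generating means described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the marker `<noise>`, the pseudo punctuation inventory (including a "no punctuation" entry), the function `toy_model`, and all probability values are hypothetical, and the language model is reduced to a lookup conditioned on the preceding word only.

```python
NOISE = "<noise>"  # hypothetical marker left by the pseudo noise marking means

# Hypothetical pseudo punctuation marks and the real marks they stand for.
# Assumption: the inventory includes a "no punctuation" entry.
PSEUDO_TO_MARK = {"<period>": ".", "<comma>": ",", "<none>": ""}

def toy_model(prev_word, pseudo_mark):
    """Hypothetical stand-in for a language model containing pseudo
    punctuation marks: P(pseudo_mark | previous word)."""
    table = {
        "hello": {"<comma>": 0.6, "<period>": 0.3, "<none>": 0.1},
        "world": {"<period>": 0.7, "<comma>": 0.1, "<none>": 0.2},
    }
    # Unknown contexts default to "no punctuation".
    return table.get(prev_word, {"<none>": 1.0}).get(pseudo_mark, 0.0)

def punctuate(tokens, model):
    """At every marked pseudo noise, pick the most likely pseudo
    punctuation mark and emit the corresponding punctuation mark."""
    out = []
    for tok in tokens:
        if tok != NOISE:
            out.append(tok)
            continue
        prev = out[-1] if out else ""
        best = max(PSEUDO_TO_MARK, key=lambda m: model(prev, m))
        mark = PSEUDO_TO_MARK[best]
        if mark and out:
            out[-1] += mark  # attach the mark to the preceding word
    return " ".join(out)

print(punctuate(["hello", NOISE, "world", NOISE], toy_model))  # hello, world.
```

Only the marked pseudo noise locations are examined, so words between two pseudo noises are passed through unchanged.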
The invention further provides an apparatus for automatically generating punctuation marks in continuous speech recognition, comprising: a speech recognition means for recognizing the user's speech as words; punctuation mark location indicating means for generating a location indicating signal in response to the user's operation during dictation, said location indicating signal indicating a location in the output result of the speech recognition means; pseudo punctuation mark probability calculating means for giving, for every pseudo punctuation mark contained in a language model containing pseudo punctuation marks, an estimated probability that it will occur in the output result of the speech recognition means; and punctuation mark matching means for finding, based on the probability estimation values calculated by the pseudo punctuation mark probability calculating means, the pseudo punctuation mark at the location indicated by the location indicating signal, and generating the punctuation mark corresponding to that pseudo punctuation mark.
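The apparatus with a location indicating signal can be sketched in a similar way. In the minimal Python sketch below, the location indicating signal is assumed to be a list of word indices (for example, produced by a key press after a word is dictated), and `toy_pseudo_prob` is a hypothetical stand-in for the pseudo punctuation mark probability calculating means; neither the representation nor the probability values come from the description.

```python
# Hypothetical pseudo punctuation marks and the real marks they stand for.
MARKS = {"<period>": ".", "<comma>": ",", "<question>": "?"}

def toy_pseudo_prob(words, loc):
    """Hypothetical probability calculation: estimated probability of each
    pseudo punctuation mark occurring right after words[loc]."""
    if loc == len(words) - 1:  # end of the utterance
        return {"<period>": 0.8, "<comma>": 0.1, "<question>": 0.1}
    return {"<period>": 0.2, "<comma>": 0.7, "<question>": 0.1}

def punctuate_at(words, locations, pseudo_prob):
    """Generate a punctuation mark at each location named by the
    location indicating signal (assumed here to be word indices)."""
    out = list(words)
    for loc in sorted(set(locations)):
        if not 0 <= loc < len(out):
            continue  # ignore signals that point outside the output
        scores = pseudo_prob(words, loc)
        best = max(scores, key=scores.get)  # most likely pseudo mark
        out[loc] += MARKS[best]
    return " ".join(out)

# The user signals a punctuation location after each of the two words.
print(punctuate_at(["hello", "world"], [0, 1], toy_pseudo_prob))  # hello, world.
```

The difference from the pseudo noise variant is only in where the candidate locations come from: here the user's operation supplies them, so no "no punctuation" entry is included in this hypothetical inventory.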
To achieve the second object, the invention provides a method for automatically generating punctuation marks in continuous speech recognition, comprising a speech recognition step for recognizing the user's speech as words, said speech recognition step also recognizing pseudo noises in the user's speech; and further comprising the following steps: a pseudo noise marking step for marking pseudo noises in the output results of the speech recognition step; and a punctuation mark generating step for finding, based on a language model containing pseudo punctuation marks, the most likely pseudo punctuation mark at every location of pseudo noise marked in the pseudo noise marking step, and generating the punctuation mark corresponding to that most likely pseudo punctuation mark.
The invention further provides a method for automatically generating punctuation marks in continuous speech recognition, comprising: a speech recognition step for recognizing the user's speech as words; a punctuation mark location indicating step for generating a location indicating signal in response to the user's operation during dictation, said location indicating signal indicating a location in the output result of the speech recognition step; a pseudo punctuation mark probability calculating step for giving, for every pseudo punctuation mark contained in a language model containing pseudo punctuation marks, an estimated probability that it will occur in the output result of the speech recognition step; and a punctuation mark matching step for finding, based on the probability estimation values calculated in the pseudo punctuation mark probability calculating step, the pseudo punctuation mark at the location indicated by the location indicating signal, and generating the punctuation mark corresponding to that pseudo punctuation mark.
According to the apparatus and method of the invention, the user need not speak punctuation marks aloud, because the system generates them automatically. Therefore, with the apparatus and method of the invention, the fluency of the user's speech is not affected, and the accuracy and speed of the speech recognition system can be improved.