This invention relates generally to a method and system for assessing speech and, in particular, to a method and system for assessing the prosody of speech data.
Speech assessment is an important area of speech application technology; its main purpose is to assess the quality of input speech data. However, speech assessment technologies in the prior art focus mainly on assessing the pronunciation of input speech data, namely, detecting and scoring pronunciation variance in the speech data. Take the word “today” as an example: the correct American pronunciation is [tə'de], whereas a reader may mispronounce it as [tu'de].
Existing speech assessment technologies can detect and correct incorrect pronunciations. If the input speech data is a sentence or a long paragraph rather than a single word, the sentence or paragraph must first be segmented so that forced alignment can be performed between the input speech data and the corresponding text data; an assessment is then performed according to the pronunciation variance of each word. In addition, most existing speech assessment products require the reader to read given speech information, which includes reading the text of a given paragraph aloud or reading after a piece of standard speech, so that the input speech data is restricted to the given content.