With reference to FIG. 1, a method of automatic processing according to the prior art will be described in the context of voice synthesis.
This method includes an automatic step 2 of determination of a sequence of probability models which represent a given text.
Conventionally, the probability models used are a finite number of hidden Markov models (HMMs), which describe the probability of acoustic production of symbolic units of a phonological nature.
In parallel with step 2, the method includes a step 4 of determining a sequence of digital data strings, or acoustic strings, corresponding to the diction of the same given text.
The method then includes a step 6 of alignment between the sequence of acoustic strings and the sequence of models.
Thus each symbolic unit of a phonological nature, represented by one or several models, has associated with it a sub-sequence of acoustic strings known as an “acoustic segment”.
For example, these associations between a symbolic unit and an acoustic segment are stored individually in order to permit subsequent speech synthesis by generating a sequence of acoustic strings corresponding to a text other than the aforementioned given text.
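By way of illustration only, the alignment of step 6 can be sketched as a dynamic-programming search of the Viterbi type. The sketch below is not taken from the method described here: it assumes one state per model, a strictly left-to-right model sequence, and a hypothetical table `frame_loglik` giving the log-likelihood of each acoustic string under each model. It returns one acoustic segment per model as a `(start, end)` pair of indices, `end` being exclusive.

```python
def align(frame_loglik):
    """Align T acoustic strings with M models taken in order.

    frame_loglik[t][m] is an assumed log-likelihood of acoustic string t
    under model m. Each model receives at least one acoustic string.
    """
    n_frames = len(frame_loglik)
    n_models = len(frame_loglik[0])
    if n_frames < n_models:
        raise ValueError("fewer acoustic strings than models")

    neg_inf = float("-inf")
    best = [[neg_inf] * n_models for _ in range(n_frames)]
    entered = [[False] * n_models for _ in range(n_frames)]
    best[0][0] = frame_loglik[0][0]

    for t in range(1, n_frames):
        for m in range(n_models):
            stay = best[t - 1][m]                           # remain in model m
            move = best[t - 1][m - 1] if m > 0 else neg_inf  # enter model m
            if move > stay:
                best[t][m] = move + frame_loglik[t][m]
                entered[t][m] = True
            else:
                best[t][m] = stay + frame_loglik[t][m]

    # Backtrack from the last acoustic string in the last model.
    starts = [0] * n_models
    m = n_models - 1
    for t in range(n_frames - 1, 0, -1):
        if entered[t][m]:
            starts[m] = t
            m -= 1

    ends = starts[1:] + [n_frames]
    return list(zip(starts, ends))


# Hypothetical scores: 6 acoustic strings, 3 models.
scores = [
    [-1, -5, -5],
    [-1, -5, -5],
    [-5, -1, -5],
    [-5, -1, -5],
    [-5, -1, -5],
    [-5, -5, -1],
]
segments = align(scores)
```

A practical aligner would use several states per model together with transition probabilities; the single-state simplification is kept here only to make the segment boundaries easy to follow.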
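The storage and reuse of these associations can be sketched as follows. This is a simplified illustration under stated assumptions, not the synthesis procedure of any particular system: the names `build_inventory` and `synthesize` are hypothetical, acoustic strings are represented by arbitrary list items, and the first stored segment is always selected for a given symbolic unit.

```python
from collections import defaultdict


def build_inventory(units, segments, acoustic_strings):
    """Store, for each symbolic unit, the acoustic segments aligned with it."""
    inventory = defaultdict(list)
    for unit, (start, end) in zip(units, segments):
        inventory[unit].append(acoustic_strings[start:end])
    return inventory


def synthesize(inventory, target_units):
    """Concatenate stored segments to produce acoustic strings for a new text."""
    output = []
    for unit in target_units:
        if unit not in inventory:
            raise KeyError(f"no acoustic segment stored for unit {unit!r}")
        output.extend(inventory[unit][0])  # naive choice: first stored segment
    return output


# Hypothetical data: three units aligned with six acoustic strings.
inventory = build_inventory(
    ["a", "b", "c"], [(0, 2), (2, 5), (5, 6)], [0, 1, 2, 3, 4, 5]
)
new_sequence = synthesize(inventory, ["c", "a"])
```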
However, variations may appear at the time of the alignment step 6, resulting in particular from differences between the speech signal as actually pronounced and the sequence of models corresponding to a theoretical pronunciation.
Indeed, step 2 of determination of a sequence of models associates a single model sequence with a given text.
However, the diction of this text may give rise to different speech signals due to the influence of the speaker. In particular, phonetic units or phonemes may be linked together, as in the case of liaisons, and other phonemes may be omitted or lengthened.
Such variations may cause a model to be associated with an erroneous and/or displaced acoustic segment, thus introducing an alignment error into the following acoustic segments.
These variations make it necessary to introduce, during a step 8, a confidence index for each association between an acoustic segment and one or several models, which enables a probability score to be attributed to each association.
However, in the methods according to the prior art, these confidence indices calculated for each model are not very precise.
In particular, these confidence indices are calculated essentially from the probabilities of transition from one model to the other. Thus these confidence indices are calculated directly for a segment of acoustic strings, which results in a low degree of precision.
Conventionally, these confidence indices only permit the rejection of certain associations, which are then corrected manually by specialists during a long and costly correction step 10.
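The rejection of associations on the basis of their confidence index can be sketched as a simple thresholding operation. The sketch below is illustrative only: the `Association` structure, the confidence values and the threshold are hypothetical, and the way in which the confidence index itself is calculated is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Association:
    unit: str           # symbolic unit of a phonological nature
    segment: tuple      # (start, end) indices of the acoustic segment
    confidence: float   # confidence index attributed during step 8


def split_by_confidence(associations, threshold):
    """Separate accepted associations from those rejected for manual correction."""
    accepted = [a for a in associations if a.confidence >= threshold]
    rejected = [a for a in associations if a.confidence < threshold]
    return accepted, rejected


# Hypothetical confidence indices and threshold.
associations = [
    Association("a", (0, 2), 0.92),
    Association("b", (2, 5), 0.41),
    Association("c", (5, 6), 0.87),
]
accepted, rejected = split_by_confidence(associations, threshold=0.5)
```

With an imprecise confidence index, the threshold must be set conservatively, which increases the number of associations sent to the manual correction step 10.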
It is therefore apparent that, in the methods according to the prior art, the precision of the confidence indices is insufficient, thus making the processing methods long and costly due to the need for human intervention to make corrections.
The object of the present invention is to remedy this problem by defining an automatic method of processing which includes a confidence index with increased precision.