This invention relates to a speech synthesis method and device for reproducing desired sound information by using a number of phonemes.
It is generally known that several phonemes can be combined to present numerical information as an audible sound or synthesized voice. For instance, "2,534" (ni sen go hyaku san jyu yon in Japanese; two thousand, five hundred thirty-four in English) can be voiced as the seven phonemes "ni", "sen", "go", "hyaku", "san", "jyu" and "yon." Accordingly, an audible indication of numerical information can be provided by loading the necessary basic phonemes into a memory and fetching them from the memory in a given order for subsequent speech synthesis.
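The fetch-in-order scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the device's actual implementation: the function name `phoneme_sequence`, the tables `DIGITS` and `UNITS`, the romanization "kyu" for nine, and the restriction to 1 through 9,999 are all hypothetical, and euphonic variants such as "byaku" are deliberately ignored.

```python
# Hypothetical phoneme tables; in the described device these would be
# stored sound data held in a memory, not strings.
DIGITS = {1: "ichi", 2: "ni", 3: "san", 4: "yon", 5: "go",
          6: "roku", 7: "nana", 8: "hachi", 9: "kyu"}
UNITS = [(1000, "sen"), (100, "hyaku"), (10, "jyu")]


def phoneme_sequence(n):
    """Return the basic phonemes for 1 <= n <= 9999, in speaking order.

    Simplification (assumption): sound changes such as "san byaku"
    are not modeled; the plain unit phoneme is always used.
    """
    if not 1 <= n <= 9999:
        raise ValueError("this sketch only covers 1 through 9999")
    sequence = []
    for value, unit in UNITS:
        digit, n = divmod(n, value)
        if digit == 0:
            continue
        if digit > 1:
            # A "1" in this place is voiced as the unit alone.
            sequence.append(DIGITS[digit])
        sequence.append(unit)
    if n:
        sequence.append(DIGITS[n])
    return sequence


print(" ".join(phoneme_sequence(2534)))  # ni sen go hyaku san jyu yon
```

Note that the sketch emits the phonemes back to back, with no pauses between them.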
However, our extensive research reveals that merely combining those basic phonemes can, in some cases, make the audible indication difficult for the listener to follow. It has been found, for example, that in providing an audible indication of 12,300,450 (ni oku ni sen san byaku man yon sen go hyaku in Japanese; twelve million, three hundred thousand, four hundred and fifty in English), a given period of silence, or pause, is needed immediately after "oku." Without such a pause, the listener hears the synthesized voices "oku" and "ni" run closely together and may have difficulty, or make an error, in taking down the audible indication. The same is true of the spacing between "man" and "yon." It has also become clear that a pause is necessary immediately before "hyaku" (hundred in English) or "jyu" (ten in English) where the numerical information has "1" in the hundreds place and is pronounced simply "hyaku," or has "1" in the tens place and is pronounced simply "jyu." For instance, such a pause is required between "sen" and "hyaku" of "roku sen hyaku ni jyu" (six thousand, one hundred twenty in English) and between "byaku" and "jyu" of "yon sen san byaku jyu ni" (four thousand, three hundred and twelve in English).
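The pause rules just described might be expressed as a pass over an already assembled phoneme list. The following is a minimal sketch under assumptions: the marker `PAUSE`, the set `DIGIT_WORDS`, and the function `insert_pauses` are hypothetical names, and treating "oku" and "man" alike, and any non-digit predecessor as the trigger for a bare "hyaku" or "jyu", is one reading of the text rather than a confirmed design.

```python
PAUSE = "<pause>"  # hypothetical marker for a period of silence
DIGIT_WORDS = {"ichi", "ni", "san", "yon", "go",
               "roku", "nana", "hachi", "kyu"}


def insert_pauses(tokens):
    """Return a copy of tokens with pause markers added.

    Rules (as read from the text above):
      * pause immediately after "oku" or "man" when more speech follows;
      * pause immediately before a bare "hyaku" or "jyu", i.e. one that
        is not preceded by a digit word, when something precedes it.
    """
    out = []
    for i, token in enumerate(tokens):
        previous = tokens[i - 1] if i > 0 else None
        bare_unit = (token in ("hyaku", "jyu")
                     and previous is not None
                     and previous not in DIGIT_WORDS)
        # Avoid doubling a pause already placed after "oku" or "man".
        if bare_unit and out[-1] != PAUSE:
            out.append(PAUSE)
        out.append(token)
        if token in ("oku", "man") and i + 1 < len(tokens):
            out.append(PAUSE)
    return out


print(" ".join(insert_pauses("roku sen hyaku ni jyu".split())))
print(" ".join(insert_pauses("yon sen san byaku jyu ni".split())))
```

In an actual synthesizer the marker would presumably be rendered as a fixed stretch of silence; its duration is not specified here.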
Furthermore, a silence period is needed just before the audible indication "ten" (point in English), for example between "san" and "ten" of "hyaku ni jyu san ten yon go" (123.45).
While the foregoing concerns especially the situation where audible indications of numerical information are accompanied by words indicating the respective units, such a silence or pause period is similarly required when audible indications are provided without unit information, for instance before each three-digit separator and before a decimal point: between "ni" and "konma" of "ni konma san yon go konma roku nana hachi" (2,345,678) and between "san" and "ten" of "ichi ni san ten yon go" (123.45).
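For the digit-by-digit case, the same idea can be sketched as follows. This is an illustrative assumption-laden example: the names `read_digit_by_digit`, `SEPARATORS`, and `PAUSE`, the word "zero" for the digit 0, and the romanization "kyu" for nine do not come from the text above.

```python
PAUSE = "<pause>"  # hypothetical marker for a period of silence
DIGIT_WORDS = {"0": "zero", "1": "ichi", "2": "ni", "3": "san",
               "4": "yon", "5": "go", "6": "roku", "7": "nana",
               "8": "hachi", "9": "kyu"}
SEPARATORS = {",": "konma", ".": "ten"}


def read_digit_by_digit(text):
    """Voice a numeral one character at a time, without unit words.

    A pause marker is placed just before every "konma" and "ten",
    provided some speech precedes it.
    """
    out = []
    for ch in text:
        if ch in SEPARATORS:
            if out:
                out.append(PAUSE)
            out.append(SEPARATORS[ch])
        elif ch in DIGIT_WORDS:
            out.append(DIGIT_WORDS[ch])
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return out


print(" ".join(read_digit_by_digit("123.45")))
# ichi ni san <pause> ten yon go
```

Calling `read_digit_by_digit("2,345,678")` yields the reading "ni konma san yon go konma roku nana hachi" with a pause marker before each "konma".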