Artificial speech synthesis is increasingly used to output information to a user by means of a computer. It is becoming particularly significant as a means of communication in systems in which other output media, for example graphics, are not possible, for instance because a monitor for presenting information is not available or cannot be used for reasons of space. Particularly in such cases, there is a need for a speech synthesis device and a speech synthesis method which make very low demands on available resources, in terms of both computing power and storage space, and nevertheless provide fully functioning synthesis, for example for “reading out” a text, preferably an electronic message.
Known approaches, which are not yet available on embedded systems owing to their very large storage space requirements, are usually divided into speech synthesis systems based on what is referred to as diphone synthesis and speech synthesis systems based on what is referred to as corpus-based speech synthesis.
Even diphone synthesis systems, which manage with a relatively small amount of storage space, require approximately 20 Mbytes, and corpus-based speech synthesis systems require up to 1 Gbyte of storage space or more.
This storage space requirement is far too large for implementation in an embedded system.
WO 00/45373 A1 describes a text-to-speech converter device in which the text-to-speech conversion is carried out for a special exception lexicon described therein.
DE 691 31 549 T2 describes a parser device for determining predefined expressions from a speech signal sequence spoken into the device.
The invention is based on the problem of providing speech synthesis which requires less storage space than known speech synthesis methods or speech synthesis devices.