1. Field of the Invention
The invention relates to a method and a device for phonetizing text-containing data records, particularly data records with different contents, such as music tracks, music artists, music albums, telephone book entries, contact names or the like, that are used in voice-controlled user interfaces for controlling particular processes in which the user issues voice commands containing these contents to the user interface. Without limiting the invention to this instance of application, a preferred field of application for the invention is in the area of motor vehicle controllers, particularly multimedia control units in motor vehicles, which are used for information, entertainment and/or communication in motor vehicles. Control units of this kind can contain music reproduction and telephone functions, in particular.
In the method proposed according to the invention, the data records, which are present as graphemes, i.e. as a sequence of individual grapheme symbols, particularly as a letter sequence or standardized letter sequence, are converted into phonemes, i.e. a sequence of individual phoneme symbols, and stored as phonetized data records, for example in a phonetized data list. According to the standard definition, a phoneme is a representation of sound that forms the smallest meaning-distinguishing unit in a language, and has a distinctive function. In the present text, the term “phoneme” is understood particularly as a sequence of a plurality of individual phoneme symbols. The same applies to the term “grapheme”, which is understood particularly as a sequence of individual grapheme symbols in the present text. In a similar manner to a phoneme, a grapheme (grapheme symbol) is the smallest meaning-distinguishing unit in the graphical representation of a text, and is frequently defined by the letters of an alphabet.
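Purely by way of illustration, the conversion of a grapheme symbol sequence into a phoneme symbol sequence and its storage in a phonetized data list can be sketched as follows. The rule table, the phoneme symbols and the function names are hypothetical and are not taken from the invention; a real grapheme-to-phoneme converter relies on a language-dependent model rather than on a small lookup table.

```python
# Toy grapheme-to-phoneme rules (assumed for illustration only).
TOY_RULES = {"sch": "S", "ch": "x", "a": "a", "l": "l", "e": "@"}


def to_phonemes(graphemes: str) -> list[str]:
    """Convert a grapheme symbol sequence into a phoneme symbol sequence
    by greedy longest-match lookup in TOY_RULES."""
    text = graphemes.lower()
    max_len = max(len(key) for key in TOY_RULES)
    phonemes = []
    i = 0
    while i < len(text):
        # Try the longest possible grapheme chunk first.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in TOY_RULES:
                phonemes.append(TOY_RULES[chunk])
                i += size
                break
        else:
            # No rule for this symbol: skip it.
            i += 1
    return phonemes


def build_phonetized_list(records: list[str]) -> dict[str, list[str]]:
    """Store each data record together with its phonetized form."""
    return {record: to_phonemes(record) for record in records}
```

In this sketch, a grapheme symbol without a matching rule is simply skipped, which corresponds to the restricted behaviour that preprocessing is intended to avoid.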
In the proposed method, the graphemes are conditioned for the actual phonetization in preprocessing, particularly by virtue of the graphemes being modified on a language-defined and/or user-defined basis before the conversion into phonemes is performed. The phonetized data list, for example in the form of the phonetized data records, can then be used in a manner known per se for the voice recognition in a voice-controlled user interface, for example.
The preprocessing is motivated by the fact that the graphemes (and also the phonemes) are language-related, and depend on the respective language used. Frequently, however, the actual data records contain entries in different languages that need to be identified and adjusted for phonetization. Accordingly, the preprocessing can be implemented by recognition of foreign-language texts, but also by replacement of abbreviations, omission of prefixes (such as “Herr”, “Frau”, “Dr.”, the English article “the” or the like), expansion of acronyms and/or provision of pronunciation variants, which can be selected by the user.
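A minimal sketch of such preprocessing, assuming simple lookup tables, might look as follows. Only the prefixes “Herr”, “Frau”, “Dr.” and “the” come from the present text; the abbreviation and acronym tables and the function name are hypothetical, and recognition of foreign-language texts and pronunciation variants are not covered.

```python
# Prefixes named in the text; matched case-insensitively.
PREFIXES = ("herr", "frau", "dr.", "the")
# Hypothetical replacement tables (assumed for illustration only).
ABBREVIATIONS = {"feat.": "featuring", "st.": "saint"}
ACRONYMS = {"AC/DC": "A C D C"}


def preprocess(entry: str) -> str:
    """Condition a grapheme sequence before grapheme-to-phoneme conversion."""
    tokens = entry.split()
    # Omit leading prefixes, but never remove the last remaining token.
    while len(tokens) > 1 and tokens[0].lower() in PREFIXES:
        tokens.pop(0)
    result = []
    for token in tokens:
        if token in ACRONYMS:
            # Expand acronyms into a spellable form.
            result.append(ACRONYMS[token])
        elif token.lower() in ABBREVIATIONS:
            # Replace abbreviations by their full form.
            result.append(ABBREVIATIONS[token.lower()])
        else:
            result.append(token)
    return " ".join(result)
```

With these assumed tables, an entry such as “Herr Dr. Meier” would be reduced to “Meier” before phonetization.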
Such preprocessing allows the usually voice-related restrictions of grapheme-to-phoneme conversion, in which only a particular prescribed number of digits and character strings that are to be spelt is supported, to be at least partially lifted. This is achieved by replacing those characters of the graphemes that are not supported by the language-dependent acoustic model used for the phonetization.
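The replacement of unsupported characters can be sketched, by way of example, as follows. The supported character set and the replacement table are assumptions made for this illustration; in practice they would be determined by the language-dependent acoustic model used.

```python
# Characters assumed to be supported by the acoustic model (illustrative).
SUPPORTED = set("abcdefghijklmnopqrstuvwxyz0123456789 ")
# Hypothetical spoken replacements for unsupported characters.
REPLACEMENTS = {"&": " and ", "+": " plus ", "@": " at "}


def replace_unsupported(text: str) -> str:
    """Replace characters the model cannot phonetize instead of ignoring them."""
    pieces = []
    for ch in text:
        if ch.lower() in SUPPORTED:
            pieces.append(ch)
        else:
            # Use a spoken replacement if one is defined, otherwise a space.
            pieces.append(REPLACEMENTS.get(ch, " "))
    # Collapse the runs of whitespace introduced by the replacements.
    return " ".join("".join(pieces).split())
```

In this sketch, characters without a defined replacement are turned into a word break rather than dropped silently, so that adjacent words are not merged.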
2. Related Art
In existing systems, however, the preprocessing has the problem that it is performed upstream of the actual grapheme-to-phoneme conversion, so that the time needed for the preprocessing is added to the total latency of the grapheme-to-phoneme conversion.
Since the preprocessing may also be very computation-intensive, depending on its complexity, either long latencies must be expected or the scope of the preprocessing needs to be restricted, for example by ignoring unsupported characters in the grapheme representation during the phonetization. On account of the scarcity of resources for the preprocessing, the known implementations of preprocessing can also be adjusted to specific application requirements only to a limited extent and are, in particular, hard-coded, particularly in respect of the number of variants and the available replacements and modifications.