This invention relates to speech recognition techniques and, in particular, to the problem of implementing a speaker-independent automatic multilingual speech recognizer that is also suitable for languages for which speech training material is scarce.
The invention is therefore intended primarily for situations in which the use of automatic speech recognition systems appears advantageous (for instance in airport and railway station transit areas, at shows and conferences, in automatic voice-message-controlled systems and the like) and in which reliable recognition is desired even when the speaker uses a language that is not particularly widespread and/or a language for which it is neither easy nor advantageous to collect the speech training material normally required for implementing a speech recognizer in a short time.
The purpose of this invention is to provide a speech recognizer of the type specified above. According to the invention, this purpose is achieved by means of a process having the features specifically set out in the claims that follow. The invention also concerns the recognizer for a given language implemented with this process, as well as the multilingual recognizer created as an intermediate product of the process itself. Lastly, the invention also extends to the corresponding process for voice recognition.
The invention will now be described, by way of non-limiting example, with reference to the enclosed drawings, in which:
FIG. 1 illustrates, at a general level, the successive steps into which the process according to the invention is divided;
FIGS. 2 to 4 illustrate, in greater detail and as flow charts, the implementation of some of the steps shown in the blocks of FIG. 1.
In essence, the solution according to the invention first involves implementing, as an intermediate product, an automatic multilingual speech recognizer in a situation in which an ample amount of speech material is available for all the languages involved. This automatic multilingual speech recognizer is subsequently exploited to interpolate a sufficiently robust recognizer for a language for which large information databases are unavailable. In the currently preferred embodiment of the invention, the acoustic-phonetic models used are of the transitory/stationary type. These models are sometimes referred to as APUCD-CTS (Acoustic Phonetic Unit Context Dependent - Class Transitory Stationary), as opposed to context-independent models, also called APUCI (Acoustic Phonetic Unit Context Independent).