1. Field of the Invention
The present invention relates to a phoneme-symbol a posteriori probability calculating apparatus and a speech recognition apparatus, and in particular, to a phoneme-symbol a posteriori probability calculating apparatus for calculating a posteriori probabilities of phoneme symbols by using a phoneme-symbol a posteriori probability calculating model based on a speech signal of inputted utterance speech, and a speech recognition apparatus for performing speech recognition by using the phoneme-symbol a posteriori probability calculating model.
2. Description of the Prior Art
Conventionally, methods for calculating estimates of the a posteriori probabilities of phoneme symbols necessary for speech recognition, by using the two results that can be obtained from a multi-layer perceptron (hereinafter, referred to as an MLP) or a recurrent neural network (hereinafter, referred to as an RNN) and from a hidden Markov model (hereinafter, referred to as an HMM), have been disclosed in, for example, Prior Art Document 1, H. Bourlard, et al., "Continuous Speech Recognition by Connectionist Statistical Methods", IEEE Transactions on Neural Networks, Vol. 4, No. 6, pp. 893-909, November 1993 (hereinafter, referred to as a first prior art), and Prior Art Document 2, A. J. Robinson, "An Application of Recurrent Nets to Phone Probability Estimation", IEEE Transactions on Neural Networks, Vol. 5, No. 2, March 1994 (hereinafter, referred to as a second prior art). In the first and second prior arts, when a vector series x_1, x_2, ..., x_L of speech feature parameters of one frame is inputted instead of an acoustic model such as an HMM, the phoneme symbol series that maximizes the a posteriori probability Pr of observing a phoneme symbol series c_1, c_2, ..., c_L can be expressed as follows: ##EQU1##
In this case, the reference character C denotes the set of all the phoneme symbols, the function "argmax" returns the phoneme symbol series that maximizes its argument over all the phoneme symbols included in the set C, and X is a matrix of speech feature parameters of one frame which comprises the vector series x_1, x_2, ..., x_L of speech feature parameters of one frame. Approximating Equation (1) under the assumption that the frames are independent of one another yields the following equation: ##EQU2##
In the first prior art, the first term Pr_1 of the argument of the function argmax in the final expression of Equation (2) is modeled by an MLP and the second term Pr_2 is modeled by an HMM, so that the phoneme symbol series which maximizes the a posteriori probability Pr of the phoneme symbol series can be determined by using the MLP model and the HMM model. In the second prior art, on the other hand, the first term Pr_1 of the argument of the function argmax in the final expression of Equation (2) is modeled by an RNN and the second term Pr_2 is modeled by an HMM, so that the phoneme symbol series which maximizes the a posteriori probability Pr of the phoneme symbol series can be determined by using the RNN model and the HMM model.
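For illustration only, the following Python sketch shows one way in which a per-frame term and an HMM-style term can be combined to search for a maximizing phoneme symbol series. It is not taken from either prior art document, and the exact form of Equation (2) is not reproduced here. The sketch assumes that the quantity to be maximized is a product, over the frames, of a per-frame score (standing in for Pr_1, as an MLP or RNN might output) and a first-order transition probability (standing in for Pr_2). The function name viterbi_decode and the arguments frame_scores, transitions and initial are hypothetical names introduced for this example.

```python
import math


def _log(p):
    """Logarithm that maps a zero probability to minus infinity."""
    return math.log(p) if p > 0.0 else float("-inf")


def viterbi_decode(frame_scores, transitions, initial):
    """Return the symbol index series c_1..c_L that maximizes

        initial[c_1] * frame_scores[0][c_1]
        * product over t >= 2 of transitions[c_(t-1)][c_t] * frame_scores[t-1][c_t]

    frame_scores[t][c] : assumed per-frame score of symbol c at frame t
                         (stand-in for the MLP/RNN term Pr_1)
    transitions[p][c]  : assumed probability of symbol c following symbol p
                         (stand-in for the HMM term Pr_2)
    initial[c]         : assumed probability that the series starts with c
    """
    n_symbols = len(initial)
    # Best log score of any partial series ending in each symbol.
    best = [_log(initial[c]) + _log(frame_scores[0][c]) for c in range(n_symbols)]
    back_pointers = []
    for scores in frame_scores[1:]:
        new_best, pointers = [], []
        for c in range(n_symbols):
            # Best predecessor symbol for ending in symbol c at this frame.
            prev = max(range(n_symbols),
                       key=lambda p: best[p] + _log(transitions[p][c]))
            new_best.append(best[prev] + _log(transitions[prev][c]) + _log(scores[c]))
            pointers.append(prev)
        best = new_best
        back_pointers.append(pointers)
    # Trace the best series backwards from the best final symbol.
    path = [max(range(n_symbols), key=lambda c: best[c])]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

With uniform transition probabilities the search reduces to choosing the best symbol independently in each frame, whereas non-uniform transition probabilities allow the sequence-level term to override a weak per-frame decision.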
However, because the first and second prior arts use HMMs for modeling, their apparatuses are complex in constitution, and it is extremely difficult to implement an apparatus for calculating such a posteriori probabilities as an integrated circuit (hereinafter, referred to as an IC).