Automatic speech recognition is a field with a multitude of possible applications. To recognize speech, the individual speech sounds must be identified from the speech signal. Formant frequencies are very important cues for recognizing speech sounds. They are the resonance frequencies of the vocal tract and therefore depend on its shape. Formant tracks may also be used to develop formant-based speech synthesis systems, which learn to produce speech sounds by extracting formant tracks from examples and then reproducing those sounds.
Only a few attempts have been made to use Bayesian techniques to track formants. See Y. Zheng and M. Hasegawa-Johnson, “Particle Filtering Approach to Bayesian Formant Tracking,” IEEE Workshop on Statistical Signal Processing, pp. 601-604, 2003. Most such attempts, however, use a separate tracker instance for each formant and thus track the formants independently of one another.