Extracting information from speech has conventionally required building a speech recognizer for the particular language, which in turn requires models for the basic sound units of the language (phonemes) and for the words of the language. A phoneme recognition system requires speech data that has been transcribed into phonemes. A recognizer that produces words requires speech that has been transcribed into the words of the language. In addition to the word transcriptions, a word recognizer requires a dictionary that spells each word as a sequence of phonemes of the language. Where such transcriptions and dictionaries are not available, conventional recognition methods cannot be employed for information extraction.
Conventional recognition methods need not only speech transcriptions from the language in question but may also need transcriptions from the particular domain of interest (e.g., land-line, cell phone, broadcast, regional dialect). Moreover, conventional methods cannot be used at all if the language in question does not have a written form.
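
As an illustration of the dictionary requirement described above, the following minimal sketch shows a word-to-phoneme lookup of the kind a conventional word recognizer relies on. The names `LEXICON` and `words_to_phonemes`, the two entries, and the ARPAbet-style phoneme symbols are hypothetical and chosen only for illustration; they are not taken from any particular recognizer.

```python
from typing import Dict, List, Optional

# Hypothetical toy pronunciation dictionary: each word is spelled as a
# sequence of phoneme symbols. Entries and symbols are illustrative only.
LEXICON: Dict[str, List[str]] = {
    "speech": ["S", "P", "IY", "CH"],
    "data": ["D", "EY", "T", "AH"],
}


def words_to_phonemes(
    words: List[str], lexicon: Dict[str, List[str]]
) -> Optional[List[str]]:
    """Expand a word transcription into a single phoneme sequence.

    Returns None as soon as a word has no dictionary entry, mirroring the
    case in which a conventional word recognizer cannot be built.
    """
    phonemes: List[str] = []
    for word in words:
        pronunciation = lexicon.get(word.lower())
        if pronunciation is None:
            return None  # no pronunciation available for this word
        phonemes.extend(pronunciation)
    return phonemes


covered = words_to_phonemes(["speech", "data"], LEXICON)
uncovered = words_to_phonemes(["speech", "recognizer"], LEXICON)
```

Here `covered` holds the concatenated phoneme sequence for the two known words, while `uncovered` is `None` because "recognizer" has no entry. A language with no written form, or with no available dictionary, corresponds to an empty lexicon in this sketch, so every lookup fails.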