Speech feature extractors have been developed to generate a stream of feature vectors representative of an audio stream.
Graph-based speech recognition networks have been developed to relate a stream of speech-based feature vectors to a sequence of words of a written language.
Inference engines have been developed to iteratively traverse states of a graph-based speech recognition network in response to a stream of speech-based feature vectors to identify a corresponding sequence of words.
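By way of illustration only, the iterative traversal described above may be sketched as a frame-synchronous, best-cost search over a toy network. The arc layout, the labels, the cost values, and the function name `traverse` are hypothetical and are not taken from any particular system. Each frame is represented here by a table of per-label costs, a simplification that stands in for scoring a feature vector against an acoustic model.

```python
# Hypothetical toy network. Each arc is
# (source_state, destination_state, input_label, output_word, arc_cost).
# Costs behave like negative log-probabilities: lower is better.
ARCS = [
    (0, 1, "h", "", 0.5),
    (1, 1, "h", "", 0.7),    # self-loop on state 1
    (1, 2, "i", "hi", 0.3),
    (0, 3, "n", "", 0.9),
    (3, 2, "o", "no", 0.4),
]


def traverse(arcs, frames, start=0):
    """Advance the set of active states once per frame, keeping the
    lowest-cost hypothesis that reaches each state.

    frames: list of dicts mapping an input label to its cost for that
    frame (an assumed stand-in for acoustic scoring of a feature vector).
    Returns a dict mapping each surviving state to (cost, words).
    """
    active = {start: (0.0, [])}
    for frame in frames:
        next_active = {}
        for src, dst, label, word, arc_cost in arcs:
            # Skip arcs that leave inactive states or whose label
            # has no score in this frame.
            if src not in active or label not in frame:
                continue
            cost, words = active[src]
            new_cost = cost + arc_cost + frame[label]
            new_words = words + [word] if word else words
            # Keep only the best hypothesis arriving at each state.
            if dst not in next_active or new_cost < next_active[dst][0]:
                next_active[dst] = (new_cost, new_words)
        active = next_active
    return active


frames = [{"h": 0.1, "n": 0.2}, {"h": 0.1, "i": 0.1, "o": 0.3}]
result = traverse(ARCS, frames)
best_state = min(result, key=lambda s: result[s][0])
best_cost, best_words = result[best_state]
```

In this sketch, the hypothesis reaching state 2 through the "h" and "i" arcs has the lowest cost, so `best_words` holds the single word "hi". A practical engine would also prune unlikely states and handle arcs that consume no input, both of which are omitted here.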
Speech recognition systems have been developed using weighted finite state transducers (WFSTs), including large vocabulary continuous speech recognition (LVCSR) systems.
State-based network traversal techniques have been implemented in a multi-threaded fashion and in a single instruction, multiple data (SIMD) fashion. States of a speech recognition network may include self-loops, which are conventionally treated as an additional incoming arc to the corresponding state. In multi-threaded and SIMD processing environments, synchronization may thus be necessary even where a state has only one incoming arc plus a self-loop. In addition, state-based SIMD traversal techniques may not fully utilize SIMD processing lanes, resulting in vector inefficiencies that may offset the benefits of SIMD processing.
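The two effects noted above may be illustrated with a short sketch. The function names, the example arcs, and the lane width are hypothetical. The sketch assumes that a state requires synchronization whenever more than one arc, counting a self-loop as an incoming arc, can update it in the same step, and that state-based SIMD traversal processes the arcs of each state in fixed-width groups so that a partially filled group leaves lanes idle. Both assumptions are simplifications made for illustration.

```python
import math


def states_needing_sync(arcs):
    """Return the states reached by more than one arc when a self-loop
    is counted as an additional incoming arc.

    arcs: iterable of (source_state, destination_state) pairs.
    """
    incoming = {}
    for _src, dst in arcs:
        incoming[dst] = incoming.get(dst, 0) + 1
    return sorted(state for state, count in incoming.items() if count > 1)


def lane_utilization(arcs_per_state, lanes):
    """Fraction of SIMD lanes doing useful work when each state's arcs
    are processed in groups of `lanes` (assumed processing model)."""
    used = sum(arcs_per_state)
    issued = sum(math.ceil(n / lanes) * lanes for n in arcs_per_state)
    return used / issued if issued else 0.0


# State 1 has one ordinary incoming arc (from state 0) plus a self-loop.
example_arcs = [(0, 1), (1, 1), (1, 2)]
sync_states = states_needing_sync(example_arcs)

# Three states with 3, 1, and 5 arcs on a hypothetical 4-lane unit.
utilization = lane_utilization([3, 1, 5], lanes=4)
```

Here `sync_states` contains only state 1, which has a single ordinary incoming arc but still requires synchronization because its self-loop is counted as a second incoming arc. With 9 arcs occupying 16 issued lane slots, `utilization` is 0.5625, so nearly half of the lanes are idle under the assumed processing model.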
Generic dynamic task scheduling techniques have been developed for multi-processor systems. Such generic techniques may not be optimal for some applications, such as traversal of speech recognition networks.
In the drawings, the leftmost digit(s) of a reference number identifies the drawing in which the reference number first appears.