Key phrase or hot word detection systems may be used to detect a word, phrase, or the like, the detection of which may initiate an activity by a device. For example, the device may wake (e.g., transition from a low power or sleep mode to an active mode) based on the detection of a particular word or phrase.
Current key phrase detection systems may model context-dependent phones of key phrases and may use Gaussian mixture models (GMMs) to model the acoustic variations of those phones. Such systems may include a model for the key phrase and a model for non-key phrases. However, such models are too complex for implementation in low resource (e.g., compute resource, memory resource, and power resource) environments. Simpler techniques that use fewer resources, such as less power, may be used in such low resource environments. However, current low resource techniques have problems with robustness (e.g., sensitivity to noise, false accepts, and the like).
As such, existing techniques do not provide high quality, low resource key phrase detection. Such problems may become critical as the desire to implement key phrase detection systems, such as wake on voice systems, becomes more widespread.