It is known to provide automatic speech recognition (ASR) for mobile devices using remotely-located speech recognition algorithms accessed via the internet. This speech recognition can be used to recognise spoken commands, for example for browsing the internet and for controlling specific functions on or via the mobile device. In order to preserve battery life, these mobile devices spend most of their time in a power-saving standby mode. A trigger phrase may be used to wake the main processor of the device so that speaker verification (i.e. verification of the identity of the person speaking), or any other speech analysis service, can be carried out, either within the main processor or by a remote analysis service.
Requiring a physical button press before using spoken commands is, in certain circumstances, undesirable, because spoken commands are of most value in cases where tactile interaction is not practical or possible. In response to this, a mobile device may implement always-on voice-activated wake-up. This feature is a limited and very low-power implementation of speech recognition that only detects that a user has spoken a pre-defined phrase. The feature runs all the time and uses sufficiently little power that the device's battery life is not significantly impaired. The user can therefore wake the device from standby by speaking the pre-defined phrase, after which the device may indicate that it is ready to receive a spoken command for interpretation by ASR.
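By way of illustration only, the wake-up flow described above can be sketched as a small state machine. The state names (`STANDBY`, `WAKING`, `READY`) and event names (`trigger_detected`, `ap_awake`, `command_done`) are hypothetical labels chosen for this sketch and are not taken from any particular device.

```python
from enum import Enum, auto


class State(Enum):
    STANDBY = auto()  # only the low-power trigger phrase detector is running
    WAKING = auto()   # trigger phrase detected, main processor is starting up
    READY = auto()    # device is ready to receive a spoken command for ASR


# Hypothetical event names; unknown (state, event) pairs leave the state unchanged.
TRANSITIONS = {
    (State.STANDBY, "trigger_detected"): State.WAKING,
    (State.WAKING, "ap_awake"): State.READY,
    (State.READY, "command_done"): State.STANDBY,
}


def step(state: State, event: str) -> State:
    """Return the next state for the given event, or the same state if none applies."""
    return TRANSITIONS.get((state, event), state)
```

In this sketch, any speech arriving while the device is in the `WAKING` state is not yet being captured for ASR.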
After the device has successfully detected the wake-up phrase, it typically takes a significant time, for example up to one second, for the system to wake up. For example, data may be transferred by an applications processor (AP) in the mobile device to the remote ASR service. In order to save power, the AP is kept in a low-power state, and must be woken up before it is ready to capture audio for onward transmission. Because of this, either the user must learn to leave a pause between the wake-up phrase and the ASR command, to avoid truncation of the start of the ASR command, or a buffer must be implemented to store the captured audio whilst the AP is waking. The latter would require a relatively large amount of data memory, and the former would result in a highly unnatural speech pattern which would be undesirable to users.
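The memory cost of the buffering option can be estimated with a short calculation. The following is a minimal sketch only: the audio format (16 kHz, 16-bit, mono) is an assumption made for illustration and is not stated above, and `buffer_bytes` and `WakeBuffer` are hypothetical names.

```python
from collections import deque


def buffer_bytes(sample_rate_hz: int, bytes_per_sample: int,
                 channels: int, wake_latency_s: float) -> int:
    """Bytes of audio that accumulate while the AP is waking up."""
    return int(round(sample_rate_hz * bytes_per_sample * channels * wake_latency_s))


class WakeBuffer:
    """Fixed-capacity sample buffer filled while the AP wakes, drained once it is ready."""

    def __init__(self, capacity_samples: int) -> None:
        # A bounded deque discards the oldest sample when full, so the capacity
        # must cover the whole wake-up latency to avoid truncating the command.
        self._samples = deque(maxlen=capacity_samples)

    def push(self, sample: int) -> None:
        self._samples.append(sample)

    def drain(self) -> list:
        """Return the buffered samples in arrival order and empty the buffer."""
        out = list(self._samples)
        self._samples.clear()
        return out


# Assumed format: 16 kHz, 2 bytes per sample, 1 channel, 1 s wake-up latency.
required = buffer_bytes(16_000, 2, 1, 1.0)
```

Under these assumed figures, a one-second wake-up latency corresponds to 32,000 bytes of always-available memory, which must be powered alongside the low-power trigger phrase detector.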