Voice-controlled devices, such as smart speakers, have gained popularity in recent years. These devices typically receive audio through one or more microphones and then process the received audio input to detect human speech, which may include one or more keywords and voice commands. To save power, many voice-controlled devices enter a sleep mode when inactive and wake up after a keyword is detected in the audio input, enabling further audio input and voice command processing. After the wake-up sequence is complete, the device may process the received audio input stream in real time. In some devices, voice commands received prior to completion of the wake-up sequence may be lost, requiring the user to repeat the voice command. In other devices, a processing delay may be introduced, which may lead the user to slow down or otherwise alter speech patterns so that the initial voice command is received. There is therefore a continued need for improved systems and methods for processing voice commands in low-power devices.