Voice activity detection (VAD), also known as speech activity detection or speech detection, is a technique used in speech processing to detect the presence or absence of human speech. VAD may be used in a variety of applications, including noise suppressors, background noise estimators, adaptive beamformers, dynamic beam steering, always-on voice detection, and conversation-based playback management. In many such applications, the high-energy, transient background noises often present in an environment are impulsive in nature. Many traditional VADs rely on changes in signal level on a full-band or sub-band basis and thus often detect such impulsive noise as speech, because the signal envelope of an impulsive noise is often similar to that of speech. In addition, in many cases an impulsive noise spectrum averaged over various impulsive noise occurrences may not differ significantly from an averaged speech spectrum. Accordingly, in such systems, impulsive noise may be detected as speech, which may degrade system performance. For example, in a beam-steering application, false detection of an impulsive noise as speech may result in steering a “look” direction of the beam-steering system in an incorrect direction even though the individual speaking is not moving relative to the audio device.
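The failure mode described above can be illustrated with a minimal sketch of a full-band, level-based detector. This is not any particular product's VAD: the function names, the 16 kHz sample rate, the 10 ms frame length, the 10 dB margin, and the synthetic test signal are all assumptions chosen for illustration. The input contains no speech at all, only a quiet background tone and one decaying high-energy burst, yet the detector flags the burst frame as speech.

```python
import math

SAMPLE_RATE = 16000   # assumed sample rate in Hz
FRAME_LEN = 160       # assumed frame length: 10 ms at 16 kHz


def frame_energies_db(signal, frame_len=FRAME_LEN):
    """Return the mean power of each non-overlapping frame, in dB."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def energy_vad(signal, frame_len=FRAME_LEN, margin_db=10.0, noise_frames=5):
    """Hypothetical level-based VAD: a frame is 'speech' when its energy
    exceeds the noise floor by margin_db. The floor is estimated from the
    first noise_frames frames, which are assumed to be speech-free."""
    energies = frame_energies_db(signal, frame_len)
    floor = sum(energies[:noise_frames]) / noise_frames
    return [e > floor + margin_db for e in energies]


# Synthetic input with NO speech: a quiet 100 Hz background tone ...
signal = [0.01 * math.sin(2 * math.pi * 100 * n / SAMPLE_RATE)
          for n in range(20 * FRAME_LEN)]

# ... plus one impulsive, high-energy burst (e.g. a door slam) in frame 10.
burst_start = 10 * FRAME_LEN
for k in range(FRAME_LEN):
    signal[burst_start + k] += 0.8 * math.exp(-k / 40.0)

decisions = energy_vad(signal)
flagged = [i for i, is_speech in enumerate(decisions) if is_speech]
print(flagged)  # [10]
```

Because the decision depends only on frame energy relative to a noise floor, the detector cannot distinguish the burst from a speech onset. In a beam-steering system, such a false positive is the kind of event that could trigger an unwanted change of look direction.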